A Machine Learning-Based Vocational Training Dropout Prediction Model Considering Structured and Unstructured Data

정형 데이터와 비정형 데이터를 동시에 고려하는 기계학습 기반의 직업훈련 중도탈락 예측 모형

Ha, Manseok; Ahn, Hyunchul

  • Received : 2018.10.24
  • Accepted : 2018.11.21
  • Published : 2019.01.28


One of the biggest difficulties in the vocational training field is the dropout problem. A large number of students drop out during the training process, which wastes the state budget and hampers improvement of the youth employment rate. Previous studies have mainly analyzed the causes of dropout. The purpose of this study is to propose a machine learning-based model that predicts dropout in advance using various learner information. In particular, this study aims to improve the accuracy of the prediction model by considering not only structured data but also unstructured data. The unstructured data were analyzed using Word2vec and a Convolutional Neural Network (CNN), which are among the most popular text analysis technologies. Applying the proposed model to actual data from a domestic vocational training institute improved prediction accuracy by up to 20%. In addition, the support vector machine-based prediction model using both structured and unstructured data showed high prediction accuracy, in the upper 90% range.
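
The idea of combining structured and unstructured data in one classifier can be illustrated with a minimal sketch. It is not the authors' implementation: a bag-of-words count stands in for the Word2vec and CNN text features, a small pure-Python linear SVM (hinge loss, subgradient descent, no bias term) stands in for the support vector machine, and the vocabulary, feature names, and learner records are all invented for illustration. The structured features are assumed to be already standardized.

```python
from collections import Counter

# Hypothetical vocabulary for counseling notes (illustrative only).
VOCAB = ["tired", "quitting", "motivated", "enjoys"]


def text_features(note, vocab=VOCAB):
    """Bag-of-words counts; a simple stand-in for Word2vec + CNN features."""
    counts = Counter(note.lower().split())
    return [float(counts[word]) for word in vocab]


def fuse(structured, note):
    """Concatenate structured features with text features into one vector."""
    return list(structured) + text_features(note)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=50):
    """Linear SVM without bias, trained by subgradient descent on hinge loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * dot(w, xi)
            # L2 regularization shrinks the weights at every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge loss is active only when the margin is below 1.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w


def predict(w, x):
    """Return +1 (predicted dropout) or -1 (predicted completion)."""
    return 1 if dot(w, x) >= 0.0 else -1


# Invented records: ([attendance_z, assignment_z], counseling note, label),
# where label +1 means dropped out and -1 means completed.
learners = [
    ([0.9, 0.8], "motivated and enjoys practice", -1),
    ([0.7, 0.2], "enjoys class", -1),
    ([-1.0, -0.6], "tired after work", 1),
    ([-0.6, -0.4], "tired and considering quitting", 1),
]
X = [fuse(s, note) for s, note, _ in learners]
y = [label for _, _, label in learners]

w = train_linear_svm(X, y)
new_learner = fuse([-0.8, -0.5], "tired and quitting soon")
print(predict(w, new_learner))
```

In this sketch the fusion step is plain concatenation, so the classifier sees both kinds of signal in a single feature vector. A fuller version would replace `text_features` with learned text embeddings and use a tested SVM library rather than this hand-written trainer.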


Vocational Training; Dropout; Machine Learning; Convolutional Neural Network; Word2vec


Supported by: National Research Foundation of Korea