Feasibility of Deep Learning Algorithms for Binary Classification Problems

An Evaluation of the Applicability of Deep Learning Algorithms to Binary Classification Problems

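  • 김기태 (Department of Business Administration, Graduate School, Hanyang University) ;
  • 이보미 (Department of Business Informatics, Graduate School, Hanyang University) ;
  • 김종우 (School of Business Administration, College of Business, Hanyang University)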
  • Received : 2016.11.18
  • Accepted : 2017.01.09
  • Published : 2017.03.31

Abstract

Recently, AlphaGo, the Baduk (Go) artificial intelligence program developed by Google DeepMind, won a decisive victory over Lee Sedol. Many people had thought that machines would not be able to beat a human at Go because, unlike in chess, the number of possible paths for a game exceeds the number of atoms in the universe, but the result was the opposite of what they predicted. After the match, artificial intelligence came into focus as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning attracted attention as the core artificial intelligence technique used in the AlphaGo algorithm. Deep learning is already being applied to many problems and performs especially well in image recognition. It also performs well on high-dimensional data such as voice, images, and natural language, where existing machine learning techniques struggled to achieve good performance. In contrast, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we investigated whether the deep learning techniques studied so far can be used not only for recognition of high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction. We also compared the performance of the deep learning techniques with that of traditional artificial neural network models. The experimental data are the telemarketing response data of a bank in Portugal. The data set has input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable that records whether the customer intends to open an account.
In this study, to evaluate whether deep learning algorithms and techniques can be used for binary classification problems, we compared the performance of various models using CNN, LSTM, and dropout, which are widely used in deep learning, with that of MLP models, a traditional artificial neural network architecture. Because not every network design alternative can be tested, given the nature of artificial neural networks, the experiment was conducted with restricted settings for the number of hidden layers, the number of neurons in each hidden layer, the number of output channels (filters), and the conditions under which dropout was applied. The F1 score, rather than overall accuracy, was used to evaluate the models, to show how well they classify the class of interest. The detailed methods for applying each deep learning technique in the experiment are as follows. The CNN algorithm reads the values adjacent to a given value and recognizes features from them, but for business data the distance between fields carries no meaning because each field is usually independent. In this experiment, we therefore set the filter size of the CNN to the number of fields, so that it learns the characteristics of the whole record at once, and added a hidden layer to make decisions based on the extracted features. For the model with two LSTM layers, the input direction of the second layer is reversed relative to the first layer in order to reduce the influence of the position of each field. For dropout, neurons in each hidden layer were dropped with a probability of 0.5. The experimental results show that the model with the highest F1 score was the CNN model using dropout, and the next best was the MLP model with two hidden layers using dropout.
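The effect of setting the CNN filter size to the number of fields can be illustrated with a minimal sketch. It is not the authors' implementation: the helper `conv1d_valid`, the four-field record, and all weights are hypothetical values chosen only for illustration. It shows that a narrow filter produces one output per pair of neighbouring fields, while a filter as wide as the record produces a single feature computed from every field at once.

```python
from typing import List


def conv1d_valid(x: List[float], kernel: List[float], bias: float = 0.0) -> List[float]:
    """Slide `kernel` over `x` without padding; return one value per position."""
    k = len(kernel)
    if k > len(x):
        raise ValueError("kernel is wider than the input")
    return [
        sum(w * v for w, v in zip(kernel, x[i:i + k])) + bias
        for i in range(len(x) - k + 1)
    ]


# Hypothetical, already-scaled record with four fields
# (for example: age, occupation code, loan flag, previous contacts).
record = [0.5, 1.0, 0.0, 0.25]

# A narrow filter only combines neighbouring fields, whose order is arbitrary.
narrow = conv1d_valid(record, [1.0, -1.0])

# A filter as wide as the record combines every field and emits one feature.
full = conv1d_valid(record, [0.2, 0.4, 0.6, 0.8])

print(len(narrow), narrow)  # three outputs, one per adjacent pair
print(len(full), full)      # a single output for the whole record
```

With a full-width filter, each filter behaves like one weighted sum over the whole record, so the number of filters plays a role similar to the number of neurons in a dense layer. This is an interpretation of the setup described above, not a claim made in the paper.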
The experiments yielded several findings. First, models using dropout make slightly more conservative predictions than those without it and generally show better classification performance. Second, CNN models show better classification performance than MLP models. This is interesting because CNN performed well on a binary classification problem, to which it has rarely been applied, as well as in the fields where its effectiveness has already been proven. Third, the LSTM algorithm seems unsuitable for binary classification problems because its training time is too long relative to the performance improvement. From these results, we can confirm that some deep learning algorithms can be applied to business binary classification problems.
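The F1 score used to rank the models can be sketched as follows. This is a minimal illustration, not the authors' evaluation code, and the confusion-matrix counts are invented for the example, assuming the class of interest (responders) is a minority. Under that assumption, it shows why overall accuracy can look good for a model that never identifies a responder, while F1 does not.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for the positive class: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all records that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Invented counts for 1,000 customers, 100 of whom respond (not from the paper).
always_no = dict(tp=0, fp=0, fn=100, tn=900)    # never predicts a response
selective = dict(tp=60, fp=40, fn=40, tn=860)   # finds 60 of the 100 responders

for name, c in (("always_no", always_no), ("selective", selective)):
    acc = accuracy(**c)
    f1 = f1_score(c["tp"], c["fp"], c["fn"])
    print(f"{name}: accuracy={acc:.2f}, F1={f1:.2f}")
```

In this invented example the two models differ by only two points of accuracy (0.90 versus 0.92), but their F1 scores are 0.00 and 0.60. A more conservative model, one that predicts the positive class less often, typically trades recall for precision, and F1 reflects that balance.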

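The recent emergence of AlphaGo has heightened interest in deep learning. Deep learning is expected to become a core technology of the future and to improve many aspects of daily life, but its major achievements are confined to areas such as image recognition and natural language processing, and it has seen little use in traditional business analytics problems. In practice, deep learning involves a variety of network design issues, such as the choice among algorithms including Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Deep Boltzmann Machine (DBM), whether to use the Dropout technique, and the selection of activation functions. Accordingly, the use of deep learning algorithms in business problems remains an area that still requires exploration, and in particular the problems that may arise when deep learning is applied in practice are unknown. This study therefore examined experimentally whether deep learning can be applied to binary classification, the key classification problem underlying direct marketing response models, customer churn analysis, and loan risk analysis. The experiments used a data set on telemarketing responses from a Portuguese bank and compared the performance, on a binary classification problem, of the Multi-Layer Perceptron (MLP), a traditional artificial neural network; the deep learning algorithms CNN and Long Short-Term Memory (LSTM), a variant of RNN; and the Dropout technique widely used in deep learning models. The results show that the CNN algorithm outperformed the MLP model even on the binary classification of business data. In addition, for both MLP and CNN, models with Dropout showed better classification performance than models without it, which confirms the possibility that a CNN with Dropout can also be used for binary classification problems.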
