Korean and English Sentiment Analysis Using the Deep Learning

  • Received : 2018.05.18
  • Accepted : 2018.06.27
  • Published : 2018.06.30


Social media is among the most popular online services today. Data from social network services (SNSs) can be used for various objectives, such as text prediction or sentiment analysis. A great deal of Korean and English data on social media can be used for sentiment analysis, but handling such large amounts of unstructured data is difficult and calls for machine learning. This research focuses on predicting Korean and English sentiment using a deep feedforward neural network (DNN) and compares it with other methods, such as LDA MLP and Gensim, using logistic regression. The findings indicate an accuracy of approximately 75% when predicting sentiment with the DNN and approximately 81% with latent Dirichlet allocation (LDA), with the corpus being approximately 64% accurate between English and Korean.
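As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of a feedforward neural network for binary sentiment classification, written in plain Python. It is not the authors' implementation. The toy sentences, the whitespace tokenizer, the binary bag-of-words features, the layer sizes, the learning rate, and all function names are assumptions made for illustration only; real Korean text would normally need morphological analysis rather than whitespace splitting.

```python
import math
import random

# Hypothetical toy data: (sentence, label), 1 = positive, 0 = negative.
DATA = [
    ("i love this movie", 1),
    ("what a great day", 1),
    ("정말 좋아요", 1),
    ("this is terrible", 0),
    ("i hate waiting", 0),
    ("정말 싫어요", 0),
]

def build_vocab(texts):
    """Map each whitespace-separated token to a column index."""
    vocab = {}
    for text in texts:
        for token in text.split():
            vocab.setdefault(token, len(vocab))
    return vocab

def vectorize(text, vocab):
    """Binary bag-of-words vector; unknown tokens are ignored."""
    vec = [0.0] * len(vocab)
    for token in text.split():
        if token in vocab:
            vec[vocab[token]] = 1.0
    return vec

def init_layers(sizes, rng):
    """One (weights, biases) pair per layer, with small random weights."""
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

def forward(layers, x):
    """Return the activations of every layer: tanh hidden units, sigmoid output."""
    acts = [x]
    for i, (W, b) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + bias
             for row, bias in zip(W, b)]
        if i == len(layers) - 1:
            acts.append([1.0 / (1.0 + math.exp(-v)) for v in z])
        else:
            acts.append([math.tanh(v) for v in z])
    return acts

def train_step(layers, x, y, lr):
    """One stochastic gradient descent step on the cross-entropy loss."""
    acts = forward(layers, x)
    delta = [acts[-1][0] - y]  # gradient at the output unit's pre-activation
    for i in reversed(range(len(layers))):
        W, b = layers[i]
        a_in = acts[i]
        if i > 0:
            # Backpropagate through tanh before this layer's weights change.
            prev = [sum(W[j][k] * delta[j] for j in range(len(W)))
                    * (1.0 - a_in[k] ** 2)
                    for k in range(len(a_in))]
        for j in range(len(W)):
            for k in range(len(a_in)):
                W[j][k] -= lr * delta[j] * a_in[k]
            b[j] -= lr * delta[j]
        if i > 0:
            delta = prev

def predict(layers, x):
    """Estimated probability that the input vector is positive."""
    return forward(layers, x)[-1][0]

def mean_loss(layers, samples):
    """Average cross-entropy loss over the samples."""
    eps = 1e-12
    total = 0.0
    for x, y in samples:
        p = predict(layers, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(samples)

rng = random.Random(0)
vocab = build_vocab(text for text, _ in DATA)
samples = [(vectorize(text, vocab), label) for text, label in DATA]
layers = init_layers([len(vocab), 8, 8, 1], rng)  # two hidden layers of 8 units

loss_before = mean_loss(layers, samples)
for _ in range(300):
    for x, y in samples:
        train_step(layers, x, y, lr=0.1)
loss_after = mean_loss(layers, samples)

print(f"mean loss before training: {loss_before:.3f}, after: {loss_after:.3f}")
print("P(positive):", round(predict(layers, vectorize("i love this day", vocab)), 3))
```

A practical system would replace the hand-written network with a tested library and would evaluate on held-out data; fitting six training sentences says nothing about the accuracy figures reported above.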


Supported by : National Research Foundation of Korea

