Korean and English Sentiment Analysis Using the Deep Learning

  • Received : 2018.05.18
  • Accepted : 2018.06.27
  • Published : 2018.06.30

Abstract

Social media is among the most popular online services today. Data from social network services (SNSs) can be used for various objectives, such as text prediction or sentiment analysis. A great deal of Korean and English data on social media can be used for sentiment analysis, but handling such large volumes of unstructured data is difficult, and machine learning is needed to process it. This research focuses on predicting Korean and English sentiment using a deep feedforward neural network (DNN) with a deep learning architecture and compares it with other methods, such as latent Dirichlet allocation (LDA) MLP and GENSIM, using logistic regression. The findings indicate an accuracy of approximately 75% when predicting sentiment with the DNN and approximately 81% with LDA, with the corpus being approximately 64% accurate between English and Korean.
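As a rough illustration of the kind of logistic regression baseline mentioned in the abstract, the following is a minimal sketch of a bag-of-words sentiment classifier trained with stochastic gradient descent. It is not the authors' implementation: the toy sentences, the whitespace tokenizer, the function names, and the hyperparameters are all assumptions made for this example, and Korean text would normally need a morphological analyzer rather than whitespace splitting.

```python
import math

# Hypothetical toy data (not from the paper): 1 = positive, 0 = negative.
TRAIN = [
    ("i love this movie", 1),
    ("what a great day", 1),
    ("this is wonderful", 1),
    ("i hate this movie", 0),
    ("what a terrible day", 0),
    ("this is awful", 0),
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumption; too crude for Korean)."""
    return text.lower().split()


def build_vocab(samples):
    """Map each distinct token to a column index, in order of first appearance."""
    vocab = {}
    for text, _ in samples:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Turn a text into a bag-of-words count vector; unknown tokens are ignored."""
    vec = [0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1
    return vec


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, vocab, epochs=200, lr=0.5):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    w = [0.0] * len(vocab)
    b = 0.0
    data = [(vectorize(text, vocab), label) for text, label in samples]
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(text, vocab, w, b):
    """Return 1 (positive) if the predicted probability is at least 0.5, else 0."""
    x = vectorize(text, vocab)
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if sigmoid(z) >= 0.5 else 0


if __name__ == "__main__":
    vocab = build_vocab(TRAIN)
    w, b = train(TRAIN, vocab)
    for text, label in TRAIN:
        print(text, "->", predict(text, vocab, w, b), "(expected", str(label) + ")")
```

The sketch only shows the mechanics of vectorizing text and fitting a linear classifier; it says nothing about the accuracy figures reported above, which depend on the paper's own data and models.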

Acknowledgement

Supported by: National Research Foundation of Korea
