A Malicious Comments Detection Technique on the Internet using Sentiment Analysis and SVM

감성분석과 SVM을 이용한 인터넷 악성댓글 탐지 기법

  • Received : 2015.12.23
  • Accepted : 2016.01.27
  • Published : 2016.02.29


The Internet has greatly changed how people share information with one another. However, like every social phenomenon, it has two sides, and it has also given rise to serious social problems. Malicious users exploit the anonymity of the Internet to post aggressive comments involving defamation, personal attacks, privacy violations, and more. Among the unlawful acts and insults that occur on the Internet, malicious comments are the biggest problem. To address this, several studies have sought to manage comments efficiently. However, previous research is limited in its ability to recognize malicious vocabulary that has been deliberately modified. In this paper, we therefore propose a malicious comment detection technique that addresses this limitation of previous studies. In our experiments, the proposed technique achieved an accuracy of 87.8%, higher than that reported in previous studies.


data mining;malicious comments;SVM;sentiment analysis;Korean normalization

