A Malicious Comments Detection Technique on the Internet using Sentiment Analysis and SVM

감성분석과 SVM을 이용한 인터넷 악성댓글 탐지 기법

  • Hong, Jinju (Graduate School of Software, Soongsil University)
  • Kim, Sehan (Graduate School of Software, Soongsil University)
  • Park, Jeawon (Graduate School of Software, Soongsil University)
  • Choi, Jaehyun (Graduate School of Software, Soongsil University)
  • Received : 2015.12.23
  • Accepted : 2016.01.27
  • Published : 2016.02.29

Abstract

The Internet has changed how people live by letting them obtain and share large amounts of information. Like any social phenomenon, however, it has two sides: some users exploit online anonymity to post aggressive comments that amount to defamation, personal attacks, and privacy violations, creating serious social problems. Malicious comments on Internet bulletin boards are the most prominent issue among the unlawful speech and acts that occur online. Many studies have tried to address this problem, but because the words used in malicious comments are frequently altered, earlier approaches are limited in their ability to recognize such modified malicious vocabulary. In this paper, we propose a malicious comment detection technique that addresses this limitation of previous studies. In our experiments the technique achieved an accuracy of 87.8%, which is higher than that reported in previous studies.
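
The abstract names two building blocks, sentiment analysis and an SVM, without giving details, so the following is only a minimal sketch of how such a pipeline could fit together, not the authors' implementation. It assumes a toy English lexicon (`NEGATIVE`, `POSITIVE`), a hypothetical `normalize` step that strips symbols inserted into words as a stand-in for handling modified vocabulary, and a bias-free linear SVM trained with Pegasos-style subgradient descent. All names, data, and parameters are illustrative assumptions.

```python
import random
import re

# Hypothetical toy lexicons; a real system would use a curated
# (and, for this paper's setting, Korean) sentiment dictionary.
NEGATIVE = {"idiot", "stupid", "trash", "hate"}
POSITIVE = {"great", "thanks", "love", "nice"}


def normalize(token):
    """Assumed normalization: lowercase and drop non-letters, so that
    an obfuscated form such as 'i.d.i.o.t' maps back to 'idiot'."""
    return re.sub(r"[^a-z]", "", token.lower())


def features(comment):
    """Map a comment to [negative-word count, positive-word count]."""
    tokens = [normalize(t) for t in comment.split()]
    neg = sum(t in NEGATIVE for t in tokens)
    pos = sum(t in POSITIVE for t in tokens)
    return [float(neg), float(pos)]


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style stochastic subgradient descent on the
    L2-regularized hinge loss (labels must be +1 or -1, no bias term)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink the weights (regularizer) ...
            w = [(1.0 - eta * lam) * wj for wj in w]
            # ... and step toward examples that violate the margin.
            if margin < 1:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, comment):
    """Return +1 (malicious) or -1 (not malicious)."""
    score = sum(wj * xj for wj, xj in zip(w, features(comment)))
    return 1 if score > 0 else -1


# Made-up training comments: +1 = malicious, -1 = not malicious.
TRAIN = [
    ("you are an idiot", 1),
    ("this is trash and i hate it", 1),
    ("stupid stupid article", 1),
    ("great article, thanks", -1),
    ("i love this, nice work", -1),
    ("thanks for sharing", -1),
]

if __name__ == "__main__":
    weights = train_linear_svm(
        [features(c) for c, _ in TRAIN], [label for _, label in TRAIN]
    )
    for text in ["you i.d.i.o.t", "nice, thanks"]:
        print(text, "->", predict(weights, text))
```

Leaving out the bias term keeps the toy model easy to reason about: a comment containing no lexicon words scores exactly zero and is treated as not malicious. A practical system would replace the word counts with features from a morphological analyzer and a domain-specific sentiment dictionary.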
