Research on Designing Korean Emotional Dictionary using Intelligent Natural Language Crawling System in SNS

A Study on Building a Korean Emotion Dictionary through the Implementation of an Intelligent Natural Language Collection and Processing System for SNS

  • Jong-Hwa Lee (Department of e-Business, Dong-Eui University)
  • Received : 2020.08.19
  • Accepted : 2020.09.29
  • Published : 2020.09.30

Abstract

Purpose: This study develops a hierarchical Hangul (Korean) emotion index by organizing the emotions that SNS users express. In the author's preliminary studies, Plutchik's (1980) English-based emotion standard was reinterpreted in Korean, and SNS hashtags with implicit meaning were examined. To build a multidimensional emotion dictionary and classify emotions in three dimensions, an emotion seed was selected for each of seven emotion sets, and an emotion word dictionary was constructed by collecting the SNS hashtags derived from each seed. The study also explores the priority of each Hangul emotion index.

Design/methodology/approach: The words that make up each sentence were vectorized into a matrix, with weights extracted using TF-IDF (Term Frequency-Inverse Document Frequency). The matrix for each emotion set was then reduced in dimension with the NMF (Nonnegative Matrix Factorization) algorithm. The emotional dimension was determined from the characteristic values of the emotion words. Cosine distance between vectors was used to measure the similarity of emotion words within an emotion set.

Findings: Customer needs analysis depends on the ability to read changes in emotion, and research on Korean emotion words serves that need. The ranking of emotion words within an emotion set is expected to serve as a criterion for reading the depth of an emotion. By providing companies with effective information for emotional marketing, the emotion index is expected to expand new business opportunities and increase their value. If the emotion dictionary is eventually linked to products, it will be possible to define a product's "emotional DNA", the set of emotions that the product should have.
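The pipeline named in the Design/methodology/approach paragraph (TF-IDF weighting, NMF dimension reduction, cosine similarity between emotion words) can be illustrated with a minimal sketch. This is not the author's implementation. The toy hashtag corpus, the function names, the TF-IDF variant (length-normalized term frequency times log(N/df)), the choice of two components, and the multiplicative-update NMF solver are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
import random

# Hypothetical toy corpus: each "document" is the hashtag list of one SNS post.
# These tokens are illustrative only and are not data from the study.
posts = [
    ["happy", "joy", "smile"],
    ["joy", "smile", "love"],
    ["sad", "tears", "lonely"],
    ["tears", "lonely", "gloomy"],
]


def tfidf_matrix(docs):
    """Return (vocab, rows) with rows[d][t] = tf(t, d) * log(N / df(t))."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    df = {t: sum(1 for doc in docs if t in doc) for t in vocab}
    rows = [
        [doc.count(t) / len(doc) * math.log(n_docs / df[t]) for t in vocab]
        for doc in docs
    ]
    return vocab, rows


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor v (docs x terms) into w (docs x k) and h (k x terms).

    Uses multiplicative updates for the squared-error objective
    (an assumed solver; the abstract only names NMF).
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() for _ in range(k)] for _ in range(n)]
    h = [[rng.random() for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


vocab, v = tfidf_matrix(posts)
w, h = nmf(v, k=2)

# Each word is represented by its column of h, a k-dimensional vector.
word_vec = {t: [h[r][j] for r in range(len(h))] for j, t in enumerate(vocab)}

sim_same = cosine_similarity(word_vec["joy"], word_vec["smile"])
sim_diff = cosine_similarity(word_vec["joy"], word_vec["tears"])
print(f"joy vs smile: {sim_same:.3f}")
print(f"joy vs tears: {sim_diff:.3f}")
```

In this sketch, words that co-occur in the same posts end up with similar low-dimensional vectors, so sorting the words of one emotion set by cosine similarity to a seed word gives a ranking of the kind the abstract describes. Cosine distance is one minus the similarity computed here.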

References

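  1. Kang, J. Y., Lee, I. D., and Kim, J. S., "Semantic Network Analysis of News Data Related to 'Generation Z' Using Text Mining," Journal of Future Oriented Youth Society, Vol. 17, 2020, pp. 25-48.
  2. Ko, H. S., and Shin, J. H., "An Exploratory Study on the Media Usage Behavior of the Digital Native Generation," The Journal of the Korea Contents Association, Vol. 18, No. 3, 2018, pp. 1-10. https://doi.org/10.5392/JKCA.2018.18.03.001
  3. Kwon, J. W., and Song, T. S., "Trends in Platform-Based Digital Transformation for Manufacturing Innovation," The Magazine of the IEIE, Vol. 46, No. 12, 2019, pp. 34-46.
  4. Kim, C. W., and Park, S., "Query-Based Document Summarization Using Pseudo Relevance Feedback Based on Semantic Features and WordNet," Journal of the Korea Institute of Information and Communication Engineering, Vol. 15, No. 7, 2011, pp. 1517-1524. https://doi.org/10.6109/jkiice.2011.15.7.1517
  5. Park, S., Kim, K. J., Kim, K. H., and Lee, S. R., "Enhancing Document Clustering Performance Using Term Weight Re-estimation Based on Semantic Features," Journal of the Korea Institute of Information and Communication Engineering, Vol. 17, No. 2, 2013, pp. 347-354. https://doi.org/10.6109/jkiice.2013.17.2.347
  6. Lee, J. H., "Building an SNS Crawling System Using Python," Journal of the Korea Industrial Information Systems Research, Vol. 23, No. 5, 2018, pp. 61-76. https://doi.org/10.9723/JKSIIS.2018.23.5.061
  7. Lee, J. H., Lee, M. B., and Kim, J. W., "A Study on Korean Natural Language Processing Using TF-IDF," The Journal of Information Systems, Vol. 28, No. 3, 2019, pp. 105-121.
  8. Lee, J. H., Lee, Y. J., and Lee, H. K., "Development of an Emotion Word Collection System Using SNS Hashtags," The Journal of Information Systems, Vol. 27, No. 2, 2018, pp. 77-94.
  9. Lee, J. H., "A Study on the Generalization of Emotion Words Using SNS Hashtags," The Journal of Internet Electronic Commerce Research, Vol. 18, No. 4, 2018, pp. 53-63.
  10. Lee, J. S., "Data Science-Based Digital Transformation," Broadcasting and Media Magazine, Vol. 22, No. 4, 2017, pp. 18-25.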
  11. Balahur, A., Hermida, J. M., and Montoyo, A., "Detecting implicit expressions of emotion in text: A comparative analysis," Decision Support Systems, Vol. 53, No. 4, 2012, pp. 742-753. https://doi.org/10.1016/j.dss.2012.05.024
  12. Cheng, F., Shen, J., Yu, Y., Li, W., Liu, G., Lee, P. W., and Tang, Y., "In silico prediction of Tetrahymena pyriformis toxicity for diverse industrial chemicals with substructure pattern recognition and machine learning methods," Chemosphere, Vol. 82, No. 11, 2011, pp. 1636-1643. https://doi.org/10.1016/j.chemosphere.2010.11.043
  13. Christian, H., Agus, M. P., and Suhartono, D., "Single Document Automatic Text Summarization using Term Frequency-Inverse Document Frequency (TFIDF)," ComTech: Computer, Mathematics and Engineering Applications, Vol. 7, No. 4, 2016, pp. 285-294. https://doi.org/10.21512/comtech.v7i4.3746
  14. Danielsson, P. E., "Euclidean distance mapping," Computer Graphics and image processing, Vol. 14, No. 3, 1980, pp. 227-248. https://doi.org/10.1016/0146-664X(80)90054-4
  15. Gunasekaran, S., "Computer vision technology for food quality assurance," Trends in Food Science & Technology, Vol. 7, No. 8, 1996, pp. 245-256. https://doi.org/10.1016/0924-2244(96)10028-5
  16. Kaelbling, L. P., Littman, M. L., and Moore, A. W., "Reinforcement learning: A survey," Journal of artificial intelligence research, Vol. 4, 1996, pp. 237-285. https://doi.org/10.1613/jair.301
  17. Plutchik, R., A general psychoevolutionary theory of emotion, In Theories of emotion, 1980.
  18. Qaiser, S., and Ali, R., "Text mining: use of TF-IDF to examine the relevance of words to documents," International Journal of Computer Applications, Vol. 181, No. 1, 2018, pp. 25-29. https://doi.org/10.5120/ijca2018917395
  19. Salton, G., and Buckley, C., "Term-weighting approaches in automatic text retrieval," Information processing & management, Vol. 24, No. 5, 1988, pp. 513-523. https://doi.org/10.1016/0306-4573(88)90021-0
  20. Lee, D. D., and Seung, H. S., "Algorithms for non-negative matrix factorization," Advances in neural information processing systems, Vol. 13, 2001, pp. 556-562.
  21. Tata, S., and Patel, J. M., "Estimating the selectivity of tf-idf based cosine similarity predicates," ACM Sigmod Record, Vol. 36, No. 2, 2007, pp. 7-12. https://doi.org/10.1145/1328854.1328855
  22. Walker, M. A., Anand, P., Abbott, R., Tree, J. E. F., Martell, C., and King, J., "That is your evidence?: Classifying stance in online political debate," Decision Support Systems, Vol. 53, No. 4, 2012, pp. 719-729. https://doi.org/10.1016/j.dss.2012.05.032
  23. Witten, I. H., Frank, E., Hall, M. A., and Pal, C. J., Data Mining: Practical machine learning tools and techniques, Morgan Kaufmann, 2016.
  24. Ye, C., Yung, N. H., and Wang, D., "A fuzzy controller with supervised learning assisted reinforcement learning algorithm for obstacle avoidance," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), Vol. 33, No. 1, 2003, pp. 17-27.
  25. Ye, J., "Improved cosine similarity measures of simplified neutrosophic sets for medical diagnoses," Artificial intelligence in medicine, Vol. 63, No. 3, 2015, pp. 171-179. https://doi.org/10.1016/j.artmed.2014.12.007