Dynamic Text Categorizing Method using Text Mining and Association Rule

  • Kim, Young-Wook (School of Industrial and Management Engineering, Korea University) ;
  • Kim, Ki-Hyun (School of Industrial and Management Engineering, Korea University) ;
  • Lee, Hong-Chul (School of Industrial and Management Engineering, Korea University)
  • Received : 2017.11.20
  • Accepted : 2017.12.02
  • Published : 2018.10.31

Abstract

In this paper, we propose a dynamic document classification method. Unlike existing methods, which rely on artificial, supplier-oriented categorization rules, its categorization rules change according to users' needs and social trends. The core of the method is that it creates classification criteria in real time using topic modeling techniques, without standardized category rules, so users are not forced into unnecessary frames. It also supports detailed search through relevance analysis, which calculates relationships between words that are difficult to grasp from word frequency alone. The proposed method is more effective for situation analysis and information retrieval on unstructured data that do not fit existing classification categories, such as VOC (Voice of Customer), SNS posts, and customer reviews from Internet shopping malls, than for logical and systematic documents, and it can respond flexibly to users' needs. In addition, suppliers do not need to select classification rules, and a misclassification requires no manual work, which reduces unnecessary workload.
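As a rough illustration of how relationships between words can be scored beyond raw frequency, the following minimal sketch computes the standard association-rule measures (support, confidence, lift) for word pairs that co-occur in the same document. It is not the authors' implementation: the function name `word_association_rules`, the `min_support` threshold, the whitespace tokenization, and the sample reviews are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def word_association_rules(documents, min_support=0.3):
    """Score word pairs by support, confidence, and lift (hypothetical helper)."""
    # Treat each document as a transaction: the set of distinct words it contains.
    # Whitespace tokenization is an assumption; real text needs proper preprocessing.
    transactions = [set(doc.lower().split()) for doc in documents]
    n = len(transactions)

    # Number of documents containing each word, and each unordered word pair.
    word_counts = Counter(word for t in transactions for word in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of documents containing both words
        if support < min_support:
            continue
        # Emit the rule in both directions: x -> y and y -> x.
        for x, y in ((a, b), (b, a)):
            confidence = count / word_counts[x]      # estimate of P(y | x)
            lift = confidence / (word_counts[y] / n)  # P(y | x) / P(y)
            rules.append(
                {"rule": (x, y), "support": support,
                 "confidence": confidence, "lift": lift}
            )
    # Highest lift first: pairs that co-occur more often than chance would suggest.
    return sorted(rules, key=lambda r: r["lift"], reverse=True)


# Made-up customer reviews, used only to exercise the function.
reviews = [
    "delivery late refund",
    "delivery late box damaged",
    "delivery fast",
    "refund slow",
]
rules = word_association_rules(reviews, min_support=0.5)
for r in rules:
    print(r["rule"], round(r["support"], 2), round(r["confidence"], 2), round(r["lift"], 2))
```

In this toy data, "delivery" is the most frequent word, but only the pair "delivery"/"late" clears the support threshold, and a lift above 1 indicates that the two words appear together more often than their individual frequencies would predict. That is the kind of relationship a frequency count alone does not reveal.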
