Prescriptive Analytics System Design Fusing Automatic Classification Method and Intellectual Structure Analysis Method

  • Do-Heon Jeong (Department of Library and Information Science, Duksung Women's University)
  • Received : 2017.11.14
  • Accepted : 2017.11.29
  • Published : 2017.12.30

Abstract

This study introduces prescriptive analytics, an emerging analysis method, and proposes a way to apply it efficiently to a classification-based service system. Prescriptive analytics presents not only the results of an analysis but also the process that led to the optimized result and the available alternatives. By adopting this approach, the study suggests how application systems built on document classification can be optimized more easily and operated more efficiently. To simulate the optimization process, a large collection of journal articles was gathered and automatically categorized according to a reference classification scheme. In applying the concept of prescriptive analytics to a real system, a dynamic automatic-categorization method for large-scale documents was fused with an intellectual structure analysis method for scholarly subject fields. The experimental results show that several optimized scenarios can be generated efficiently and used effectively to revise and reapply the service classification scheme.
