Combining Ego-centric Network Analysis and Dynamic Citation Network Analysis to Topic Modeling for Characterizing Research Trends

Korean title (translated): A Study on Topic Modeling-Based Research Trend Analysis Using Ego-centric Network Analysis and Dynamic Citation Networks

  • Yu, So-Young (Department of Library and Information Science, Hannam University)
  • Received : 2015.02.24
  • Accepted : 2015.03.09
  • Published : 2015.03.30

Abstract

This study proposes and examines a combined approach that uses ego-centric network analysis and dynamic citation network analysis to refine the results of LDA-based topic modeling. Two datasets were constructed from Web of Science bibliographic records on White LED, and topic modeling was performed on each dataset with a different number of topics. Top keywords assigned to more than one topic were re-assigned to a single topic by applying an ego-centric network analysis algorithm. Topical cohesion was strongest, with the best values on the internal clustering evaluation indices, when topic modeling was run on the dataset extracted by SPLC (search path link count) network analysis with the number of topics set to the value that gave the lowest perplexity. The results also suggest that the approach could be developed into a method for multi-faceted research trend detection.
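The re-assignment of multi-assigned keywords can be illustrated with a minimal sketch. The abstract does not spell out the ego-centric algorithm, so the scoring rule below is an assumption: each shared keyword is treated as an ego, its ties are co-occurrence counts with the other top keywords, and the keyword stays only in the topic with the largest summed tie weight. The function name `reassign_keywords`, the input layout, and the sample counts are all hypothetical.

```python
def reassign_keywords(topic_keywords, cooccurrence):
    """Keep each multi-assigned keyword only in its most strongly tied topic.

    topic_keywords: {topic_id: set of top keywords}
    cooccurrence:   {frozenset({kw1, kw2}): co-occurrence count}
    Assumption: tie strength = summed co-occurrence with a topic's other keywords.
    """
    # Record every topic in which each keyword appears.
    membership = {}
    for topic, keywords in topic_keywords.items():
        for kw in keywords:
            membership.setdefault(kw, []).append(topic)

    result = {topic: set(keywords) for topic, keywords in topic_keywords.items()}
    for kw, topics in membership.items():
        if len(topics) < 2:
            continue  # keyword already belongs to a single topic

        def tie_strength(topic):
            # Weight of the ego's ties to the other keywords of this topic.
            return sum(
                cooccurrence.get(frozenset((kw, other)), 0)
                for other in topic_keywords[topic]
                if other != kw
            )

        # Ties in strength are broken in favor of the smallest topic id.
        best = max(sorted(topics), key=tie_strength)
        for topic in topics:
            if topic != best:
                result[topic].discard(kw)
    return result


# Hypothetical toy input: "led" is a top keyword of both topics.
topics = {
    0: {"phosphor", "led", "emission"},
    1: {"led", "thermal", "package"},
}
counts = {
    frozenset(("led", "phosphor")): 5,
    frozenset(("led", "emission")): 3,
    frozenset(("led", "thermal")): 2,
    frozenset(("led", "package")): 1,
}
reassigned = reassign_keywords(topics, counts)
```

In this toy input, "led" has a summed tie weight of 8 with topic 0 and 3 with topic 1, so it is kept in topic 0 and removed from topic 1. Other scoring rules, such as counting shared neighbors instead of summing weights, would fit the same outline.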

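To make the results of topic modeling easier to interpret, this study proposed a method that uses a dynamic citation network to set the number of topics for LDA-based topic modeling and re-assigns top keywords placed in multiple topics through ego-centric network analysis. An analysis of two sets of article data on 'White LED' showed that the overlap between topics was lowest when the number of topics was chosen by perplexity for the document set formed through dynamic citation network analysis and the top keywords assigned to multiple topics were re-assigned with the ego-centric network analysis technique. Applying dynamic citation network analysis and ego-centric network analysis therefore appears to enable a multi-faceted analysis of research trends that complements the results of topic modeling.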
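The dynamic citation network analysis named in the abstract relies on SPLC (search path link count) weights. As a rough sketch, not the author's implementation, the code below computes SPLC under one common formulation: every paper may start a search path, every path runs until it reaches a sink, and the weight of a link is the number of such paths that pass through it. The edge direction (cited paper to citing paper), the function name `splc_weights`, and the toy graph are assumptions made for illustration.

```python
from collections import defaultdict


def splc_weights(edges):
    """Return {(u, v): SPLC weight} for a directed acyclic citation graph.

    Assumption: an edge (u, v) points from the cited paper u to the citing
    paper v, and the graph has no cycles and no duplicate edges.
    """
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    into, out = {}, {}

    def paths_into(n):
        # Paths ending at n that start at n itself or at any ancestor of n.
        if n not in into:
            into[n] = 1 + sum(paths_into(p) for p in pred[n])
        return into[n]

    def paths_out(n):
        # Paths that start at n and continue until a sink is reached.
        if n not in out:
            out[n] = sum(paths_out(s) for s in succ[n]) if succ[n] else 1
        return out[n]

    # Every path through (u, v) pairs one way of arriving at u
    # with one way of leaving v.
    return {(u, v): paths_into(u) * paths_out(v) for u, v in edges}


# Hypothetical five-paper network: A is the oldest paper, E the newest.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
weights = splc_weights(edges)
```

In this toy network the link D to E receives weight 5, because it lies on the paths starting at D, B, and C (one each) and on the two paths starting at A. Links with high weights can then be used to extract the main flow of the literature; how the study thresholds these weights is not stated in the abstract.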

Acknowledgement

Supported by: Hannam University
