Recommending Core and Connecting Keywords of Research Area Using Social Network and Data Mining Techniques

A Recommendation Service for Core and Convergence Keywords of Academic Fields Using Social Network and Data Mining Techniques

  • 조인동 (Graduate School of Business IT, Kookmin University)
  • 김남규 (School of Management Information Systems, Kookmin University)
  • Received : 2010.12.30
  • Accepted : 2011.01.13
  • Published : 2011.03.31

Abstract

The core service of most research portal sites is to provide researchers with papers that match their research interests. This kind of service is effective and easy to use only when a user can supply correct and concrete information about a paper, such as its title, authors, and keywords. Unfortunately, most users do not know such bibliographic details, so they inevitably go through repeated trial and error with keyword-based search. Retrieving a relevant paper is especially difficult when a user is a novice in the research domain and does not know the appropriate keywords. In this case, the user has to search iteratively: i) perform an initial search with an arbitrary keyword, ii) acquire related keywords from the retrieved papers, and iii) search again with the acquired keywords. This usage pattern implies that the service quality of a portal site, and the satisfaction of its users, depend strongly on how well keywords are managed and on the search mechanism. To overcome this inefficiency, some leading research portal sites have adopted keyword recommendation services based on association rule mining, similar to the product recommendation of online shopping malls. However, keyword recommendation based only on association analysis is limited in that it can show only a simple, direct relationship between two keywords. In other words, association analysis by itself cannot present the complex relationships among many keywords in adjacent research areas. To overcome this limitation, we propose a hybrid approach for establishing an association network among the keywords used in research papers.
The keyword association network is established in the following phases: i) the set of keywords specified in a paper is regarded as a set of co-purchased items, ii) association analysis is performed on the keywords to extract frequent keyword patterns that satisfy predefined thresholds of confidence, support, and lift, and iii) the frequent keyword patterns are schematized as a network that shows the core keywords of each research area and the connecting keywords between two or more research areas. To assess the practical applicability of our approach, we performed a simple experiment with 600 keywords extracted from 131 research papers published in five prominent Korean journals in 2009. In the experiment, we used SAS Enterprise Miner for association analysis and the R software for social network analysis. As the final outcome, we present a network diagram and a cluster dendrogram for the keyword association network; the results are summarized in Section 4 of this paper. The main contributions of the proposed approach are as follows: i) the keyword network can provide an initial roadmap of a research area to researchers who are novices in the domain, ii) a researcher can grasp the distribution of the keywords neighboring a certain keyword, and iii) researchers can get ideas for converging different research areas by observing the connecting keywords in the keyword association network. Further studies should address the following. First, the current version of our approach does not implement a standard meta-dictionary. For practical use, homonyms, synonyms, and multilingual problems should be resolved with a standard meta-dictionary. Additionally, clearer guidelines for clustering research areas and for defining core and connecting keywords should be provided. Finally, intensive experiments on international papers as well as Korean research papers should be performed.
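The three phases above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the experiment used SAS Enterprise Miner and R); the function names `keyword_pair_rules` and `rules_to_network`, the threshold values, and the toy keyword lists are all assumptions made for illustration. Only keyword pairs are mined, using the usual definitions: support is the share of papers containing both keywords, confidence of x → y is that count divided by the number of papers containing x, and lift is confidence divided by the share of papers containing y.

```python
from collections import Counter
from itertools import combinations


def keyword_pair_rules(papers, min_support=0.4, min_confidence=0.5, min_lift=1.0):
    """Return keyword rules (x -> y) that pass all three thresholds (hypothetical helper)."""
    n = len(papers)
    # Phase i: treat each paper's keyword list as one basket of co-purchased items.
    baskets = [frozenset(p) for p in papers]
    single = Counter(k for b in baskets for k in b)
    pair = Counter(frozenset(c) for b in baskets for c in combinations(sorted(b), 2))

    # Phase ii: keep only the pairs that satisfy support, confidence, and lift.
    rules = []
    for ab, count in pair.items():
        a, b = sorted(ab)
        support = count / n
        for x, y in ((a, b), (b, a)):
            confidence = count / single[x]
            lift = confidence / (single[y] / n)
            if support >= min_support and confidence >= min_confidence and lift >= min_lift:
                rules.append((x, y, round(support, 3), round(confidence, 3), round(lift, 3)))
    return sorted(rules)


def rules_to_network(rules):
    """Phase iii: collapse the rules into an undirected map, keyword -> neighbors."""
    network = {}
    for x, y, *_ in rules:
        network.setdefault(x, set()).add(y)
        network.setdefault(y, set()).add(x)
    return network


# Toy, made-up keyword lists (not data from the paper).
papers = [
    ["data mining", "association rule", "recommendation"],
    ["data mining", "association rule"],
    ["social network", "centrality", "recommendation"],
    ["social network", "centrality"],
    ["data mining", "recommendation", "social network"],
]
rules = keyword_pair_rules(papers)
network = rules_to_network(rules)
for keyword in sorted(network):
    print(keyword, "->", sorted(network[keyword]))
```

In this toy data, "recommendation" ends up linked to both "data mining" and "social network", which is the kind of keyword the abstract calls a connecting keyword.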

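Most research portal sites mainly provide services for researchers who want to obtain papers in their fields of interest. These services are easy to use for the few users who know exact bibliographic details, but most users go through repeated trial and error with keyword searches to obtain the material they want. In particular, when users search for papers in an unfamiliar field, they do not know the appropriate keywords for the papers they are looking for and have great difficulty searching. To overcome this limitation, some research portal sites have adopted keyword recommendation services based on association analysis, which has mainly been used for product recommendation in online shopping malls. However, keyword recommendation based only on association analysis reveals only fragmentary relationships between two keywords and is limited in showing the complex connections among all the keywords related to an academic field. Therefore, this paper presents a method that extracts frequently co-occurring keyword pairs through association analysis and builds a network among all keywords from them, in order to recommend the core keywords of each academic field and the connecting keywords for convergence across fields.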
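One possible way to read core and connecting keywords off such a network is sketched below. The abstract itself notes that clear guidelines for defining these keywords are still future work, so the rules used here are assumptions: a core keyword is taken to be the keyword with the most links inside its own research area, and a connecting keyword is any keyword with at least one link into another area. The function name `core_and_connecting`, the area labels, and the toy network are hypothetical, and the assignment of keywords to areas is assumed to be given (for example, from a clustering step).

```python
def core_and_connecting(network, area_of):
    """network: keyword -> set of neighboring keywords (undirected).
    area_of: keyword -> research-area label (assumed to be given)."""
    # Number of links each keyword has to keywords in its own area.
    in_area_degree = {
        kw: sum(area_of[n] == area_of[kw] for n in nbrs)
        for kw, nbrs in network.items()
    }
    # Core keyword per area: highest in-area degree, ties broken alphabetically.
    core = {}
    for area in sorted({area_of[kw] for kw in network}):
        members = [kw for kw in network if area_of[kw] == area]
        core[area] = min(members, key=lambda kw: (-in_area_degree[kw], kw))
    # Connecting keywords: at least one neighbor belongs to a different area.
    connecting = sorted(
        kw for kw, nbrs in network.items()
        if any(area_of[n] != area_of[kw] for n in nbrs)
    )
    return core, connecting


# Toy, made-up network and area labels (not data from the paper).
network = {
    "data mining": {"association rule", "clustering", "recommendation"},
    "association rule": {"data mining"},
    "clustering": {"data mining"},
    "recommendation": {"data mining", "social network"},
    "social network": {"recommendation", "centrality", "density"},
    "centrality": {"social network"},
    "density": {"social network"},
}
area_of = {
    "data mining": "DM", "association rule": "DM", "clustering": "DM",
    "recommendation": "DM", "social network": "SNA",
    "centrality": "SNA", "density": "SNA",
}
core, connecting = core_and_connecting(network, area_of)
print(core)
print(connecting)
```

Under this definition, both ends of a cross-area link count as connecting keywords; a stricter rule, such as requiring links into two or more other areas, would be an equally plausible choice.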
