Technology Clustering Using Textual Information of Reference Titles in Scientific Paper

  • Park, Inchae (Division of Smart Management Engineering, Hansung University) ;
  • Kim, Songhee (Department of Industrial & Systems Engineering, Dongguk University) ;
  • Yoon, Byungun (Department of Industrial & Systems Engineering, Dongguk University)
  • Received : 2020.02.25
  • Accepted : 2020.04.03
  • Published : 2020.06.30

Abstract

Patent and scientific paper data are considered useful sources of technological information and have been widely used. Technology big data is analyzed in various ways to identify the latest technological trends and to predict promising future technologies. Clustering is one way to discover new features by forming groups within technology big data. Patents include refined bibliographic information, such as patent classification codes, whereas scientific papers lack bibliographic information suitable for clustering. This research proposes a new approach for clustering scientific papers that uses the reference titles listed in each paper. In this approach, the reference titles are treated as textual information, because the title of each cited paper represents that paper's core content. We collected scientific paper data, extracted the reference titles, and conducted clustering by measuring text-based similarity. The results of the proposed approach are compared with those of two existing methodologies: one that uses textual information from titles and abstracts, and one that is citation-based. The proposed approach shows a statistically significant difference from the existing approaches and better clustering performance. It is expected to serve as a useful method for clustering scientific papers.
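
The pipeline described in the abstract (extract reference titles, measure text-based similarity, cluster) can be illustrated with a minimal sketch. The abstract does not state which similarity measure or clustering algorithm the authors used, so everything below is an assumption made for illustration: bag-of-words cosine similarity, a small hand-picked stop-word list, single-linkage grouping with a hypothetical `threshold` of 0.3, and made-up toy reference titles.

```python
import math
import re
from collections import Counter

# Assumed stop-word list; the paper does not specify its preprocessing.
STOP_WORDS = {"a", "an", "and", "for", "from", "in", "of", "the", "with"}


def tokenize(title):
    """Lowercase a title and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", title.lower()) if t not in STOP_WORDS]


def paper_vector(reference_titles):
    """Term counts over all reference titles of one paper."""
    counts = Counter()
    for title in reference_titles:
        counts.update(tokenize(title))
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(papers, threshold=0.3):
    """Group papers linked by similarity >= threshold (single linkage via union-find)."""
    ids = list(papers)
    vectors = {p: paper_vector(papers[p]) for p in ids}
    parent = {p: p for p in ids}

    def find(p):
        # Follow parent links to the root, compressing the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for i, p in enumerate(ids):
        for q in ids[i + 1:]:
            if cosine(vectors[p], vectors[q]) >= threshold:
                parent[find(p)] = find(q)

    groups = {}
    for p in ids:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


# Toy data: each paper is represented only by the titles of its references.
papers = {
    "A": ["Deep learning for image recognition",
          "Convolutional networks for image classification"],
    "B": ["Image recognition with convolutional networks",
          "Deep residual learning for image recognition"],
    "C": ["Patent citation analysis",
          "Technology roadmap from patent data"],
    "D": ["Patent analysis for technology forecasting",
          "Citation networks in patent data"],
}
clusters = cluster(papers)
print(clusters)
```

On this toy input the two image-recognition papers and the two patent-analysis papers fall into separate groups. A real study would likely replace the raw counts with a weighted representation such as TF-IDF and use an established clustering algorithm, then evaluate the result with a cluster validity index.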
