Quality Evaluation of Automatically Generated Metadata Using ChatGPT: Focusing on Dublin Core for Korean Monographs

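  • 김선욱 (Department of Library and Information Science, College of Social Sciences, Kyungpook National University)
  • 이혜경 (Department of Library and Information Science, College of Social Sciences, Kyungpook National University)
  • 이용구 (Department of Library and Information Science, College of Social Sciences, Kyungpook National University)
  • Received: 2023.05.15
  • Reviewed: 2023.06.08
  • Published: 2023.06.30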

Abstract

This study evaluates the quality of Dublin Core metadata that ChatGPT generated from the covers, title pages, and colophons of books, in order to assess ChatGPT's capability and potential for metadata generation. We collected cover, title page, and colophon data from 90 books, entered the data into ChatGPT, and had it generate Dublin Core records; the output was then measured for completeness and accuracy. Across the whole dataset, completeness was 0.87 and accuracy was 0.71, a satisfactory level. By element, Title, Creator, Publisher, Date, Identifier, Rights, and Language performed better than the other elements. Subject and Description scored somewhat lower on both completeness and accuracy, but these elements demonstrated the generative capability regarded as ChatGPT's strength. Books in the DDC main classes of social sciences and technology showed somewhat lower accuracy for the Contributor element. We attribute this to ChatGPT's errors in extracting statements of responsibility, the absence from the source data of bibliographic description content needed for the metadata elements, and the English-centered composition of ChatGPT's training data.
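As a rough illustration of the two measures named in the abstract, the sketch below computes completeness and accuracy for a single generated record. The definitions are assumptions made for illustration (completeness as the share of expected elements that are filled, accuracy as the share of elements whose value matches a reference record after light normalization); the abstract does not state the formulas used in the study, and the sample records and function names are hypothetical.

```python
from typing import Dict, List


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    return " ".join(value.lower().split())


def completeness(record: Dict[str, str], expected: List[str]) -> float:
    """Assumed definition: fraction of expected elements that have a non-empty value."""
    if not expected:
        return 0.0
    filled = sum(1 for element in expected if record.get(element, "").strip())
    return filled / len(expected)


def accuracy(record: Dict[str, str], reference: Dict[str, str]) -> float:
    """Assumed definition: fraction of reference elements matched exactly after normalization."""
    if not reference:
        return 0.0
    correct = sum(
        1
        for element, value in reference.items()
        if normalize(record.get(element, "")) == normalize(value)
    )
    return correct / len(reference)


# Hypothetical reference (human-made) record and ChatGPT-style output.
reference = {
    "title": "Example Book",
    "creator": "Hong Gildong",
    "publisher": "Example Press",
    "date": "2020",
}
generated = {
    "title": "example book",   # matches after normalization
    "creator": "Hong Gildong",  # matches
    "publisher": "",            # missing value
    "date": "2021",             # filled but wrong
}

print(completeness(generated, list(reference)))  # 0.75
print(accuracy(generated, reference))            # 0.5
```

Exact matching is a deliberately strict choice; free-text elements such as Subject and Description would likely need a more tolerant comparison or human judgment.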
