Analyzing the Sentence Structure for Automatic Identification of Metadata Elements based on the Logical Semantic Structure of Research Articles

  • Song, Min-Sun (Department of Children's Library and Information Science, Daelim University)
  • Received : 2018.08.19
  • Accepted : 2018.09.17
  • Published : 2018.09.30

Abstract

This study proposes a sentence-semantics analysis method that would allow a system to automatically identify sentences in research papers and assign them to the appropriate elements of logical semantic structure metadata, based on how those sentences are composed. To this end, sentences corresponding to the 'Research Objectives' and 'Research Outcomes' elements were analyzed by word (eojeol) count, types of linking words, the sentence roles of frequently occurring words, and frequently occurring verb endings. The average number of words in the sentences was 38 for 'Research Objectives' and 212 for 'Research Outcomes'. Linking words in 'Research Objectives' occurred most often in the order Causality, Sequence, Equivalence, In-other-words/Summary, whereas in 'Research Outcomes' the order was Causality, Equivalence, Sequence, In-other-words/Summary. The frequently occurring target words served as subjects, objects, or predicates in their sentences. '역할(Role)', '요인(Factor)', and '관계(Relation)' played similar roles in both the objectives and outcomes parts, but '연구(Study)' was used differently in the two. Finally, the most frequent verb endings were '~고자' and '~였다' in 'Research Objectives', and '~었다', '~있다', and '~였다' in 'Research Outcomes'. This study is significant as foundational research that can be used to develop methods for automatically identifying and entering metadata elements that reflect the common logical semantics of research papers, in support of researchers' scholarly sensemaking.

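The verb-ending and word-count criteria described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the study's actual procedure: the function names `count_eojeol` and `guess_element` are hypothetical, the rule order is an assumption, and '~였다' is deliberately left out because the abstract reports it as frequent in both elements. A real system would likely need a morphological analyzer rather than plain suffix matching.

```python
import re

# Endings the abstract reports as frequent in each metadata element.
# '~였다' occurs in both elements, so this sketch treats it as
# non-discriminating and omits it (an assumption, not a rule from the study).
OBJECTIVE_ENDINGS = ("고자",)
OUTCOME_ENDINGS = ("었다", "있다")


def count_eojeol(sentence: str) -> int:
    """Count eojeol (whitespace-delimited word units) in a sentence."""
    return len(sentence.split())


def guess_element(sentence: str) -> str:
    """Guess which metadata element a sentence belongs to from its endings."""
    # Strip trailing punctuation from each eojeol before suffix matching.
    tokens = [re.sub(r"[.,!?]+$", "", t) for t in sentence.split()]
    # '~고자' may appear mid-sentence (e.g. '분석하고자 한다'), so check every eojeol.
    if any(t.endswith(OBJECTIVE_ENDINGS) for t in tokens):
        return "Research Objectives"
    # Outcome endings are checked only on the sentence-final eojeol.
    if tokens and tokens[-1].endswith(OUTCOME_ENDINGS):
        return "Research Outcomes"
    return "Unknown"


if __name__ == "__main__":
    for s in ["본 연구는 문장 구조를 분석하고자 한다.", "유의한 차이가 있었다."]:
        print(count_eojeol(s), guess_element(s))
```

Suffix matching alone will misfire on many sentences, so the linking-word categories and word counts reported in the abstract would presumably be combined with it as additional features.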
