A Character Identification Method using Postpositions for Animate Nouns in Korean Novels

한국어 소설에서 유정명사용 조사 기반의 인물 추출 기법

  • 박태근 (Department of Applied Computer Engineering, Dankook University) ;
  • 김승훈 (Department of Applied Computer Engineering, Dankook University)
  • Received : 2016.06.20
  • Accepted : 2016.07.26
  • Published : 2016.09.30

Abstract

Novels include a wide variety of character names, which depend on the genre, the spatio-temporal setting of the novel, and the nationality of the characters. Moreover, characters and their names are products of the author's imagination. As a result, no proper noun dictionary can cover every character name that authors have created or will create. In addition, because Korean has no capitalization, character names are harder to detect in Korean than in English. Fortunately, Korean has postpositions, such as "-ege" and "-hante", that attach to nouns denoting sentient beings or animate objects. We call them animate postpositions in this paper. A previous study extracted social networks from three novels translated into or written in Korean; its authors manually selected character names by consulting both Wikipedia and dictionaries of well-known people, after applying a Korean morpheme analyzer, a proper noun dictionary, postpositions (e.g., "-ga", "-eun", "-neun", "-eui", and "-ege"), and titles (e.g., "buin"). However, that study does not report the precision, recall, or F-measure of its character identification. In this paper, we quantitatively evaluate the contribution of animate postpositions to character identification in novels, in terms of precision, recall, and F-measure. The results show that animate postpositions are a valuable and powerful tool for identifying characters in novels translated into or written in Korean, without a proper noun dictionary.
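The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain suffix matching on the Hangul forms 에게 ("-ege") and 한테 ("-hante") instead of a morpheme analyzer, and the function names, the sample sentences, and the gold set of characters are all hypothetical.

```python
import re
from collections import Counter

# Assumption: only the two animate postpositions named in the abstract,
# written in Hangul ("-ege" and "-hante").
ANIMATE_POSTPOSITIONS = ("에게", "한테")


def candidate_characters(text, min_count=1):
    """Return the stems of tokens that end in an animate postposition.

    A naive stand-in for morpheme analysis: a token such as "영희에게"
    yields the candidate "영희".
    """
    counts = Counter()
    for token in re.findall(r"[가-힣]+", text):  # runs of Hangul syllables
        for post in ANIMATE_POSTPOSITIONS:
            if token.endswith(post) and len(token) > len(post):
                counts[token[: -len(post)]] += 1
                break
    return {name for name, n in counts.items() if n >= min_count}


def precision_recall_f1(predicted, gold):
    """Compute precision, recall, and F-measure over two sets of names."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sample text and hand-labelled gold set of characters.
sample = "철수는 영희에게 편지를 썼다. 영희는 민수한테 책을 주었다. 선생님에게 물었다."
gold = {"철수", "영희", "민수"}

predicted = candidate_characters(sample)
scores = precision_recall_f1(predicted, gold)
```

Here `predicted` contains 영희, 민수, and 선생님. Two of the three match the gold set and one gold character (철수) is missed because it never appears with an animate postposition, so precision, recall, and F-measure all come out at 2/3. A real system would likely need a morpheme analyzer to avoid false matches on words that merely end in the same syllables.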

References

  1. Elson, D.K. and K.R. McKeown, "Automatic Attribution of Quoted Speech in Literary Narrative", Proceedings of the 24th AAAI Conference on Artificial Intelligence, 2010, 1013-1019.
  2. Elson, D.K., N. Dames, and K.R. McKeown, "Extracting Social Networks from Literary Fiction", Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 2010, 138-147.
  3. Iosif, E. and T. Mishra, "From Speaker Identification to Affective Analysis : A Multi-Step System for Analyzing Children's Stories", Proceedings of the 3rd Workshop on Computational Linguistics for Literature, 2014, 40-49.
  4. Jeong, H., "A Cognitive Semantic Approach to Korean Particle Eygey", Discourse and Cognition, Vol.19, No.2, 2012, 133-152. (정해권, "한국어 조사 '에게'의 인지의미론적 접근", 담화와인지, 제19권, 제2호, 2012, 133-152.)
  5. Kucuk, D. and A. Yazici, "A Hybrid Named Entity Recognizer for Turkish", Expert Systems with Applications, Vol.39, No.3, 2012, 2733-2742. https://doi.org/10.1016/j.eswa.2011.08.131
  6. Lee, D.J., J.H. Yeon, I.B. Hwang, and S.G. Lee, "KKMA : A Tool for Utilizing Sejong Corpus based on Relational Database", Journal of KIISE : Computing Practices and Letters, Vol.16, No.11, 2010, 1046-1050. (이동주, 연종흠, 황인범, 이상구, "꼬꼬마 : 관계형 데이터베이스를 활용한 세종 말뭉치 활용 도구", 정보과학회논문지 : 컴퓨팅의 실제 및 레터, 제16권, 제11호, 2010, 1046-1050.)
  7. Lee, E.Y., "Named Entity Detection and Relation Extraction in the Personal Chronology of the 19th Century", Journal of EONEIHAG, Vol.53, 2009, 141-162. (이은령, "19세기 문헌 국역본의 개체명 인식 및 관계 추출을 위한 기초 연구", 언어학, 제53권, 2009, 141-162.)
  8. Nadeau, D. and S. Sekine, "A Survey of Named Entity Recognition and Classification", Lingvisticae Investigationes, Vol.30, No.1, 2007, 3-26. https://doi.org/10.1075/li.30.1.03nad
  9. Park, G.M., S.H. Kim, and H.G. Cho, "Analysis of Social Network According to the Distance of Character Statements", Journal of the Korea Contents Association, Vol.13, No.4, 2013, 427-439. (박경미, 김성환, 조환규, "소설 등장인물의 텍스트 거리를 이용한 사회 구성망 분석", 한국콘텐츠학회논문지, 제13권, 제4호, 2013, 427-439.)
  10. Seon, C.N., Y. Ko, J.S. Kim, and J. Seo, "Named Entity Recognition using Machine Learning Methods and Pattern-Selection Rules", Proceedings of the 6th Natural Language Processing Pacific Rim Symposium, 2001, 229-236.
  11. Shaalan, K. and M. Oudah, "A Hybrid Approach to Arabic Named Entity Recognition", Journal of Information Science, Vol.40, No.1, 2014, 67-87. https://doi.org/10.1177/0165551513502417