Korean Named Entity Recognition and Classification using Word Embedding Features
  • Journal title : Journal of KIISE
  • Volume 43, Issue 6,  2016, pp.678-685
  • Publisher : Korean Institute of Information Scientists and Engineers
  • DOI : 10.5626/JOK.2016.43.6.678
 Title & Authors
Korean Named Entity Recognition and Classification using Word Embedding Features
Choi, Yunsu; Cha, Jeongwon
Named Entity Recognition and Classification (NERC) is the task of recognizing named entities, such as person names, locations, and organizations, and classifying them by type. Korean NERC has been studied extensively, but existing approaches have shortcomings; for example, they lack some of the features available to English NERC. In this paper, we propose a method that uses word embeddings as features for Korean NERC. We train word vectors with a Continuous Bag-of-Words (CBOW) model on a POS-tagged corpus, and derive word cluster symbols by applying the K-means algorithm to those word vectors. Both the word vectors and the word cluster symbols are used as word embedding features in Conditional Random Fields (CRFs). In our experiments, performance improved over the baseline system by 1.17%, 0.61%, and 1.19% in the TV, Sports, and IT domains, respectively. The proposed method also outperforms other NERC systems, which demonstrates its effectiveness and efficiency.
natural language processing; named entity recognition and classification; word embedding; continuous bag-of-words model
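The pipeline outlined in the abstract (word vectors, then K-means cluster symbols, then CRF features) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the two-dimensional vectors are made up rather than trained with CBOW, the tokens and POS tags are hypothetical, and the feature names (`word`, `cluster`, `vec0`, ...) are not taken from the paper. The abstract does not state the vector dimensionality, the number of clusters, or the feature templates actually used.

```python
import math

# Hypothetical POS-tagged tokens with made-up 2-d vectors.
# In the paper these vectors would come from a CBOW model.
WORD_VECTORS = {
    "Seoul/NNP": [0.9, 0.1],
    "Samsung/NNP": [0.1, 0.9],
    "Busan/NNP": [0.8, 0.2],
    "Google/NNP": [0.2, 0.8],
}


def kmeans(vectors, k, iterations=20):
    """Plain K-means; centroids start at the first k vectors for determinism."""
    centroids = [list(v) for v in vectors[:k]]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: pick the nearest centroid by Euclidean distance.
        assignments = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments


def build_clusters(word_vectors, k):
    """Map each word to its cluster id."""
    words = list(word_vectors)
    ids = kmeans([word_vectors[w] for w in words], k)
    return dict(zip(words, ids))


def token_features(token, word_vectors, clusters):
    """Feature dict for one token, in the style many CRF toolkits accept."""
    feats = {"word": token}
    if token in word_vectors:
        # Discrete cluster symbol plus the real-valued vector components.
        feats["cluster"] = f"C{clusters[token]}"
        for i, value in enumerate(word_vectors[token]):
            feats[f"vec{i}"] = value
    return feats


CLUSTERS = build_clusters(WORD_VECTORS, k=2)
FEATURES = token_features("Busan/NNP", WORD_VECTORS, CLUSTERS)
```

With these toy vectors, the two place names fall into one cluster and the two company names into the other, so a token unseen in the labeled NERC data can still share a cluster symbol with tokens that were seen. Out-of-vocabulary tokens simply receive no embedding features in this sketch.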
 Cited by
Lexical Feature Selection for Classifying Social Issue Risk Types, Oh, Hyo-Jung; Yun, Bo-Hyun; Kim, Chan-Young

KIPS Transactions on Software and Data Engineering, 2016, Vol. 5, No. 11, pp. 541-548