A Semi-automatic Construction Method of a Named Entity Dictionary Based on Wikipedia
  • Journal title : Journal of KIISE
  • Volume 42, Issue 11, 2015, pp. 1397-1403
  • Publisher : Korean Institute of Information Scientists and Engineers
  • DOI : 10.5626/JOK.2015.42.11.1397
Song, Yeongkil; Jeong, Seokwon; Kim, Harksoo
A named entity (NE) dictionary is an important resource for NE recognition performance. However, constructing an NE dictionary manually is difficult because human annotation is time-consuming and labor-intensive. To save construction time and reduce human labor, we propose a semi-automatic system for constructing an NE dictionary. The proposed system uses an active learning technique to construct a pseudo-document of Wiki-categories for each NE class. It then calculates the similarity between each Wiki entry and each pseudo-document using the BM25 model, a well-known information retrieval model. Finally, it classifies each Wiki entry into NE classes based on these similarities. In experiments with three different types of NE class sets, the proposed system showed high performance (macro-average F1-score of 0.9028 and micro-average F1-score of 0.9554).
named entity dictionary construction; Wikipedia; information retrieval method; active learning
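
The classification step described in the abstract can be illustrated with a minimal sketch. It assumes that each Wiki entry is represented by its list of category labels, that each NE class has a pseudo-document made of category labels, and that an entry is assigned to the class whose pseudo-document yields the highest BM25 score. The function names (`bm25_scores`, `classify`), the toy categories in `PSEUDO_DOCS`, the parameter values `k1=1.2` and `b=0.75`, and the non-negative IDF variant are all assumptions chosen for illustration; the abstract does not specify them, and the active learning step that builds the pseudo-documents is not shown.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score every document in `docs` (name -> list of terms) against the query.

    k1 and b are common default values, not values taken from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(terms) for terms in docs.values()) / n_docs

    # Document frequency: in how many pseudo-documents each term occurs.
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))

    scores = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant (an assumption); it avoids negative
            # weights when the collection has only a few documents.
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1.0 - b + b * len(terms) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + length_norm)
        scores[name] = score
    return scores


def classify(entry_categories, pseudo_docs):
    """Return the best-scoring NE class, or None if no category matches."""
    scores = bm25_scores(entry_categories, pseudo_docs)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0.0 else None


# Hypothetical pseudo-documents, one per NE class (illustrative data only).
PSEUDO_DOCS = {
    "PERSON": ["births", "living_people", "singers", "politicians"],
    "LOCATION": ["cities", "countries", "capitals", "rivers"],
    "ORGANIZATION": ["companies", "universities", "political_parties", "companies"],
}

if __name__ == "__main__":
    print(classify(["capitals", "cities"], PSEUDO_DOCS))
    print(classify(["singers", "living_people"], PSEUDO_DOCS))
```

In this sketch an entry whose categories match no pseudo-document is left unclassified (`None`), which is one plausible way to route such entries to a human annotator in a semi-automatic workflow.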
