Named Entity Recognition Using Distant Supervision and Active Bagging
  • Journal title : Journal of KIISE
  • Volume 43, Issue 2,  2016, pp.269-274
  • Publisher : Korean Institute of Information Scientists and Engineers
  • DOI : 10.5626/JOK.2016.43.2.269
Lee, Seong-hee; Song, Yeong-kil; Kim, Hark-soo
Named entity recognition is the task of extracting named entities from sentences and determining their categories. Previous studies on named entity recognition have relied primarily on supervised learning. Supervised learning requires a large training corpus manually annotated with named entity categories, and constructing such a corpus is time-consuming and labor-intensive. We propose a semi-supervised learning method that minimizes the cost of training corpus construction and rapidly improves the performance of named entity recognition. The proposed method uses distant supervision to construct the initial training corpus. It then removes noisy sentences from the initial training corpus through active bagging, an ensemble method that combines bagging and active learning. In the experiments, the proposed method improved the F1-score of named entity recognition from 67.36% to 76.42% after 15 rounds of active bagging.
named entity recognition; distant supervision; ensemble; active bagging
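
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the model, the knowledge source, or the selection criterion, so everything here is an assumption. The gazetteer `GAZETTEER`, the toy lookup tagger, and the disagreement score are hypothetical stand-ins. Distant supervision is approximated by dictionary lookup, and the active step is approximated by ranking sentences by how much a bagged committee disagrees on them, so that the most doubtful sentences can be sent for review or removal.

```python
import random
from collections import Counter, defaultdict

# Hypothetical gazetteer (surface form -> category); an assumption for illustration.
GAZETTEER = {"Seoul": "LOC", "Samsung": "ORG", "Kim": "PER"}

def distant_label(tokens, gazetteer=GAZETTEER):
    """Distant supervision by lookup: unmatched tokens get the 'O' tag."""
    return [gazetteer.get(tok, "O") for tok in tokens]

def train_lookup_tagger(sentences):
    """Toy stand-in for a real NER model: most frequent tag per token."""
    counts = defaultdict(Counter)
    for tokens, tags in sentences:
        for tok, tag in zip(tokens, tags):
            counts[tok][tag] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}

def predict(model, tokens):
    """Tag tokens with a trained lookup tagger; unseen tokens get 'O'."""
    return [model.get(tok, "O") for tok in tokens]

def bagging_committee(corpus, n_models=5, seed=0):
    """Train one tagger per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    return [
        train_lookup_tagger([rng.choice(corpus) for _ in corpus])
        for _ in range(n_models)
    ]

def disagreement(committee, tokens):
    """Fraction of tokens on which committee members do not all agree."""
    preds = [predict(model, tokens) for model in committee]
    split = sum(1 for column in zip(*preds) if len(set(column)) > 1)
    return split / len(tokens) if tokens else 0.0

def select_for_review(committee, corpus, k=1):
    """Return the k sentences with the highest committee disagreement."""
    ranked = sorted(corpus, key=lambda s: disagreement(committee, s[0]), reverse=True)
    return ranked[:k]
```

In one round of such a loop, the selected sentences would be corrected or dropped, the corpus updated, and the committee retrained. How the paper actually scores and removes noisy sentences is not stated in the abstract.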
