
Performance Comparison Analysis on Named Entity Recognition system with Bi-LSTM based Multi-task Learning

  • Kim, GyeongMin (Department of Computer Science and Engineering, Korea University) ;
  • Han, Seunggnyu (Department of Computer Science and Engineering, Korea University) ;
  • Oh, Dongsuk (Department of Computer Science and Engineering, Korea University) ;
  • Lim, HeuiSeok (Department of Computer Science and Engineering, Korea University)
  • Received : 2019.10.28
  • Accepted : 2019.12.20
  • Published : 2019.12.28

Abstract

Multi-task learning (MTL) is a training method in which a single neural network is trained on multiple tasks so that the tasks influence each other. In this paper, we compare the performance of an MTL-based named entity recognition (NER) model trained on a Korean traditional culture corpus against single-task NER models. During training, the task-specific Bi-LSTM layers for part-of-speech (POS) tagging and NER each receive the output of a shared Bi-LSTM layer, and the model is optimized with the joint loss of both tasks. As a result, the MTL-based Bi-LSTM model shows a 1.1%~4.6% performance improvement over the single-task Bi-LSTM models.
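The hard-parameter-sharing architecture described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the layer sizes, vocabulary and tag-set sizes, and the simple summed cross-entropy joint loss are all assumptions for the sake of the example.

```python
import torch
import torch.nn as nn

class MultiTaskBiLSTM(nn.Module):
    """Sketch of a multi-task tagger: a shared Bi-LSTM feeds two
    task-specific Bi-LSTM layers, one for NER and one for POS tagging.
    All dimensions and tag counts below are illustrative."""
    def __init__(self, vocab_size=1000, emb_dim=64, hidden=64,
                 n_ner_tags=9, n_pos_tags=17):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Shared encoder: both tasks backpropagate into this layer.
        self.shared = nn.LSTM(emb_dim, hidden, batch_first=True,
                              bidirectional=True)
        # Task-specific Bi-LSTM layers on top of the shared output.
        self.ner_lstm = nn.LSTM(2 * hidden, hidden, batch_first=True,
                                bidirectional=True)
        self.pos_lstm = nn.LSTM(2 * hidden, hidden, batch_first=True,
                                bidirectional=True)
        self.ner_out = nn.Linear(2 * hidden, n_ner_tags)
        self.pos_out = nn.Linear(2 * hidden, n_pos_tags)

    def forward(self, tokens):
        shared, _ = self.shared(self.emb(tokens))
        ner_h, _ = self.ner_lstm(shared)
        pos_h, _ = self.pos_lstm(shared)
        return self.ner_out(ner_h), self.pos_out(pos_h)

# Joint loss: here simply the sum of the per-token cross-entropies
# of the two tasks (one plausible choice; the paper's exact loss
# weighting is not specified in the abstract).
model = MultiTaskBiLSTM()
tokens = torch.randint(0, 1000, (2, 5))   # batch of 2 sentences, 5 tokens each
ner_gold = torch.randint(0, 9, (2, 5))
pos_gold = torch.randint(0, 17, (2, 5))
ner_logits, pos_logits = model(tokens)
loss = (nn.functional.cross_entropy(ner_logits.reshape(-1, 9), ner_gold.reshape(-1))
        + nn.functional.cross_entropy(pos_logits.reshape(-1, 17), pos_gold.reshape(-1)))
loss.backward()   # gradients from both tasks flow into the shared Bi-LSTM
```

Because both task losses backpropagate through `self.shared`, the shared layer learns a representation useful for both POS tagging and NER, which is the mechanism behind the reported improvement over single-task models.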

Acknowledgement

Supported by: Korea Creative Content Agency (KOCCA)
