Multi-task learning with contextual hierarchical attention for Korean coreference resolution

  • Received : 2021.08.25
  • Accepted : 2022.08.29
  • Published : 2023.02.20

Abstract

Coreference resolution is a discourse-analysis task that links the headwords in a document that refer to the same entity. We propose a pointer network-based coreference resolution model for Korean that uses multi-task learning (MTL) with a hierarchical attention mechanism. Because Korean is a head-final language, heads are easy to identify. Given an input headword, our model learns the distribution over the positions of mentions referring to the same entity and uses a pointer network to resolve coreference accordingly. Because the input is an entire document, the input sequence is very long; the core idea is therefore to learn word- and sentence-level distributions in parallel with MTL over a shared representation, which mitigates the long-sequence problem. The proposed method generates contextual word representations using pre-trained Korean language models. Under the same experimental conditions, our model achieved a CoNLL F1 score roughly 1.8% higher than that of previous work without a hierarchical structure.
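The following is a minimal, self-contained PyTorch sketch of the idea the abstract describes, not the authors' implementation: a shared encoder (a small BiGRU standing in for the pre-trained Korean language model) feeds two pointer-style attention heads trained in parallel, a word-level head that scores token positions of a coreferent mention and a sentence-level head that scores the sentence containing it. All module names, dimensions, and the loss weight alpha are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalPointerMTL(nn.Module):
    """Toy MTL pointer model: shared encoder, word- and sentence-level heads."""

    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Shared representation over the whole document (a stand-in here
        # for a pre-trained Korean language model).
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True,
                              bidirectional=True)
        # One additive-attention scorer per task.
        self.word_attn = nn.Linear(2 * hid_dim, 1)
        self.sent_attn = nn.Linear(2 * hid_dim, 1)

    def forward(self, tokens, sent_ids, head_pos):
        # tokens:   (B, T) token ids for the document
        # sent_ids: (B, T) sentence index of each token
        # head_pos: (B,)   position of the input headword
        h, _ = self.encoder(self.embed(tokens))            # (B, T, 2H)
        query = h[torch.arange(h.size(0)), head_pos]       # (B, 2H)
        fused = torch.tanh(h + query.unsqueeze(1))         # (B, T, 2H)

        # Word-level pointer distribution over token positions.
        word_logits = self.word_attn(fused).squeeze(-1)    # (B, T)

        # Sentence-level distribution: mean-pool token scores per sentence.
        sent_scores = self.sent_attn(fused).squeeze(-1)    # (B, T)
        n_sents = int(sent_ids.max()) + 1
        zeros = torch.zeros(tokens.size(0), n_sents)
        sent_sum = zeros.scatter_add(1, sent_ids, sent_scores)
        sent_cnt = zeros.scatter_add(1, sent_ids, torch.ones_like(sent_scores))
        sent_logits = sent_sum / sent_cnt.clamp(min=1.0)   # (B, S)
        return word_logits, sent_logits

def mtl_loss(word_logits, sent_logits, word_tgt, sent_tgt, alpha=0.5):
    # Joint objective: weighted sum of the two tasks' cross-entropy losses.
    return (alpha * F.cross_entropy(word_logits, word_tgt)
            + (1.0 - alpha) * F.cross_entropy(sent_logits, sent_tgt))

if __name__ == "__main__":
    B, T = 2, 12
    model = HierarchicalPointerMTL()
    tokens = torch.randint(0, 1000, (B, T))
    sent_ids = torch.tensor([[0] * 6 + [1] * 6] * B)   # two 6-token sentences
    head_pos = torch.tensor([3, 9])
    w, s = model(tokens, sent_ids, head_pos)
    loss = mtl_loss(w, s, torch.tensor([5, 1]), torch.tensor([0, 1]))
    loss.backward()
    print(w.shape, s.shape, float(loss))

Because both heads backpropagate through the same encoder output, the long document sequence is supervised at two granularities at once, which is the MTL shared-representation idea the abstract refers to.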
