Training Techniques for Data Bias Problem on Deep Learning Text Summarization

  • Cho, Jun Hee (Web Programming, Korea Digital Media High School);
  • Oh, Hayoung (College of Computing and Informatics, Sungkyunkwan University)
  • Received : 2022.05.03
  • Accepted : 2022.06.10
  • Published : 2022.07.31

Abstract

Deep learning-based text summarization models are not independent of their training datasets. For example, a summarization model trained on a news summarization dataset performs poorly on other types of text, such as internet posts and academic papers. In this study, we define this phenomenon as the Data Bias Problem (DBP) and propose two training methods to solve it. The first is 'proper noun masking', which replaces proper nouns with a mask token. The second is 'length variation', which randomly inflates or deflates the length of the input text. Our experiments show that these methods are effective at mitigating DBP. In addition, we analyze the experimental results and present directions for future work. Our contributions are as follows: (1) we are the first to identify, define, and quantify DBP; (2) we propose two effective training methods and validate them experimentally; (3) our methods can be applied to any summarization model and are easy to implement, making them highly practical.

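To make the two methods concrete, the following is a minimal Python sketch of both augmentations, assuming a spaCy pipeline (en_core_web_sm) for proper-noun and sentence detection; the [MASK] token, the sentence-level inflate/deflate strategy, and the 0.7-1.3 scaling range are illustrative assumptions, not the paper's exact implementation.

```python
import random
import spacy

# Assumed off-the-shelf pipeline for POS tagging and sentence splitting;
# the paper's own tokenizer/NER setup is not specified here.
nlp = spacy.load("en_core_web_sm")

MASK_TOKEN = "[MASK]"  # assumed placeholder; use the summarizer's own mask token


def mask_proper_nouns(text: str) -> str:
    """'Proper noun masking': replace every proper noun with a mask token."""
    doc = nlp(text)
    return " ".join(MASK_TOKEN if tok.pos_ == "PROPN" else tok.text for tok in doc)


def vary_length(text: str, low: float = 0.7, high: float = 1.3) -> str:
    """'Length variation': randomly inflate or deflate the text length.

    Deflation truncates trailing sentences; inflation appends randomly
    re-sampled sentences. The 0.7-1.3 range is an illustrative assumption.
    """
    sents = [s.text for s in nlp(text).sents]
    target = max(1, round(len(sents) * random.uniform(low, high)))
    if target <= len(sents):
        out = sents[:target]                                        # deflate
    else:
        out = sents + random.choices(sents, k=target - len(sents))  # inflate
    return " ".join(out)
```

Because both transforms operate on raw text before tokenization, they can be dropped into the data-loading step of any summarization model without changing the model itself, consistent with the practicality claim above.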

Acknowledgement

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2022R1F1A1074696).
