KI-HABS: Key Information Guided Hierarchical Abstractive Summarization

  • Zhang, Mengli (State Key Laboratory of Mathematical Engineering and Advanced Computing) ;
  • Zhou, Gang (State Key Laboratory of Mathematical Engineering and Advanced Computing) ;
  • Yu, Wanting (State Key Laboratory of Mathematical Engineering and Advanced Computing) ;
  • Liu, Wenfen (School of Computer Science and Information Security, Guangxi Key Laboratory of Cryptography and Information Security, Guilin University of Electronic Technology)
  • Received : 2020.11.02
  • Accepted : 2021.11.24
  • Published : 2021.12.31

Abstract

With the unprecedented growth of textual information on the Internet, efficient automatic summarization systems have become an urgent need. Recently, neural models based on the encoder-decoder framework with an attention mechanism have demonstrated strong performance on sentence summarization. However, for paragraph- or document-level summarization, these models fail to mine the core information in the input text, which leads to information loss and repetition. In this paper, we propose KI-HABS, an abstractive document summarization method that applies guidance signals from key sentences to the encoder of a hierarchical encoder-decoder architecture. Specifically, we first train an extractor built on a hierarchical bidirectional GRU to select key sentences from the input document. Then, we encode these key sentences into a sentence-level key information representation. Finally, we adopt a selective encoding strategy guided by the key information representation to filter the source information, which establishes a connection between the key sentences and the document. We evaluate our model on the CNN/Daily Mail and Gigaword datasets. The experimental results demonstrate that our method generates more informative and concise summaries, achieving better performance than competitive models.
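To make the selective-encoding step described above more concrete, the following is a minimal PyTorch sketch, not the authors' implementation: the extracted key sentences are pooled into a key-information vector, which then gates the word-level encoder states so that salient content is emphasized before decoding. The module names, dimensions, and mean-pooling choice are illustrative assumptions.

```python
# Illustrative sketch only; names, shapes, and pooling are assumptions,
# not the paper's released code.
import torch
import torch.nn as nn


class KeySentenceEncoder(nn.Module):
    """Encode the extracted key sentences into one key-information vector."""

    def __init__(self, sent_dim: int, key_dim: int):
        super().__init__()
        self.gru = nn.GRU(sent_dim, key_dim // 2, batch_first=True, bidirectional=True)

    def forward(self, key_sent_vecs: torch.Tensor) -> torch.Tensor:
        # key_sent_vecs: (batch, num_key_sents, sent_dim) sentence vectors
        outputs, _ = self.gru(key_sent_vecs)
        return outputs.mean(dim=1)  # (batch, key_dim) pooled key representation


class KeyGuidedSelectiveGate(nn.Module):
    """Gate word-level encoder states with the key-information vector."""

    def __init__(self, hidden_dim: int, key_dim: int):
        super().__init__()
        self.proj_h = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.proj_k = nn.Linear(key_dim, hidden_dim, bias=True)

    def forward(self, enc_states: torch.Tensor, key_repr: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, seq_len, hidden_dim); key_repr: (batch, key_dim)
        gate = torch.sigmoid(self.proj_h(enc_states) + self.proj_k(key_repr).unsqueeze(1))
        return gate * enc_states  # filtered states handed to the attention decoder


if __name__ == "__main__":
    # Toy shapes: 2 documents, 3 key sentences, 50 source tokens.
    encoder = KeySentenceEncoder(sent_dim=256, key_dim=128)
    gate = KeyGuidedSelectiveGate(hidden_dim=256, key_dim=128)
    key_repr = encoder(torch.randn(2, 3, 256))
    filtered = gate(torch.randn(2, 50, 256), key_repr)
    print(filtered.shape)  # torch.Size([2, 50, 256])
```

The gating design mirrors the selective-encoding idea in the abstract: each word-level state is scaled elementwise by a sigmoid gate computed from that state and the key-sentence summary, so content unrelated to the key sentences is attenuated before the decoder attends over it.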
