Document Summarization via Convex-Concave Programming

Kim, Minyoung

doi:10.5391/IJFIS.2016.16.4.293

International Journal of Fuzzy Logic and Intelligent Systems

Volume 16 Issue 4 / Pages 293-298 / 2016 / 1598-2645 (pISSN) / 2093-744X (eISSN)

Korean Institute of Intelligent Systems (한국지능시스템학회)

Kim, Minyoung (Department of Electronics & IT Media Engineering, Seoul National University of Science & Technology)

Received : 2016.11.21
Accepted : 2016.12.13
Published : 2016.12.12

https://doi.org/10.5391/IJFIS.2016.16.4.293

Abstract

Document summarization is an important task in many areas, where the goal is to select a few of the most descriptive sentences from a given document as a succinct summary. Even without training data of human-labeled summaries, several existing approaches in the literature yield reasonable performance. In this paper, within the same unsupervised learning setup, we propose a more principled learning framework for the document summarization task. Specifically, we formulate an optimization problem that expresses both requirements: faithful preservation of the document contents and the summary length constraint. We circumvent the difficult integer programming that originates from binary sentence selection via continuous relaxation and low-entropy penalization. We also suggest an efficient convex-concave optimization solver that is guaranteed to improve the original objective at every iteration. On several document datasets, we demonstrate that the proposed learning algorithm significantly outperforms the existing approaches.
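
The abstract does not state the exact objective, so the following is only a minimal sketch of the general idea under assumptions of our own, not the paper's formulation. It assumes sentences are bag-of-words vectors (rows of a hypothetical matrix `S`), a hypothetical target vector `d` stands in for the document contents, binary selection is relaxed to weights in (0, 1) that sum to the summary length `k`, and the objective to be minimized is a squared reconstruction error plus `lam` times the binary entropy of the weights. The entropy term is concave, so each outer step linearizes it at the current weights and decreases the resulting convex surrogate by projected gradient descent, which is the usual convex-concave procedure pattern. All names, the toy data, and the value of `lam` are illustrative.

```python
import math

EPS = 1e-6  # assumption: keep weights strictly inside (0, 1) so the entropy gradient is finite


def project(v, k):
    """Euclidean projection onto {w : EPS <= w_i <= 1 - EPS, sum(w) = k} via bisection."""
    lo, hi = min(v) - 1.0, max(v)
    for _ in range(100):
        tau = 0.5 * (lo + hi)
        if sum(min(max(x - tau, EPS), 1.0 - EPS) for x in v) > k:
            lo = tau
        else:
            hi = tau
    tau = 0.5 * (lo + hi)
    return [min(max(x - tau, EPS), 1.0 - EPS) for x in v]


def residual(S, d, w):
    """Weighted sum of sentence vectors minus the target vector."""
    return [sum(w[i] * S[i][j] for i in range(len(S))) - d[j] for j in range(len(d))]


def entropy(w):
    """Sum of binary entropies; zero only when every weight is 0 or 1."""
    return -sum(x * math.log(x) + (1 - x) * math.log(1 - x) for x in w)


def objective(S, d, w, lam):
    r = residual(S, d, w)
    return 0.5 * sum(x * x for x in r) + lam * entropy(w)


def cccp_select(S, d, k, lam=0.1, outer=30, inner=50):
    n = len(S)
    # Squared Frobenius norm upper-bounds the Lipschitz constant of the quadratic's gradient.
    lipschitz = sum(x * x for row in S for x in row)
    w = project([k / n] * n, k)
    history = [objective(S, d, w, lam)]
    for _ in range(outer):
        # Linearize the concave entropy at the current w: dH/dw_i = log((1 - w_i) / w_i).
        g_ent = [math.log((1 - x) / x) for x in w]
        for _ in range(inner):
            # Projected gradient step on the convex surrogate (quadratic + linear term).
            r = residual(S, d, w)
            grad = [sum(S[i][j] * r[j] for j in range(len(d))) + lam * g_ent[i]
                    for i in range(n)]
            w = project([w[i] - grad[i] / lipschitz for i in range(n)], k)
        history.append(objective(S, d, w, lam))
    return w, history


# Toy data (made up): four sentences over a three-term vocabulary.
S = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 1.0, 1.0], [0.0, 0.0, 0.2]]
d = [1.0, 1.0, 1.0]
w, history = cccp_select(S, d, k=2)
summary = sorted(range(len(S)), key=lambda i: -w[i])[:2]
print(summary, [round(x, 3) for x in w])
```

Because the tangent of a concave function lies above it, every decrease of the surrogate is also a decrease of the full objective, so `history` should be non-increasing up to floating-point error. The two largest weights are then read off as the selected sentences.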

