Recognition of Korean Implicit Citation Sentences Using Machine Learning with Lexical Features
Kang, In-Su
 Abstract
Implicit citation sentence recognition locates citation sentences that lack explicit citation markers in the full text of articles. State-of-the-art approaches exploit word n-grams, clue words, researchers' surnames, mentions of previous methods, and distance to the nearest explicit citation sentence, among other features, reaching performance above 50%. However, most previous work has been conducted on English. For Korean, a rule-based method using positive/negative clue patterns was reported to attain a performance of 42%, which leaves room for improvement. This study applies machine learning with Korean lexical features to recognize implicit citation sentences in the full text of Korean academic literature. Different lexical feature units, namely Eojeol (space-delimited word), morpheme, and Eumjeol (syllable), were evaluated to determine which are suitable for Korean implicit citation sentence recognition. In addition, the lexical features were combined with position features representing backward/forward proximity to explicit citation sentences, raising performance to over 50%.
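As an illustration only, the kinds of features named in the abstract could be sketched roughly as follows. This is not the author's implementation: the function names, the bracketed-number marker pattern, the Eumjeol n-gram size, and the use of -1 for "no explicit citation sentence on that side" are all assumptions made for the example. Morpheme features are omitted because they would require a Korean morphological analyzer.

```python
import re
from typing import Dict, List

# Assumption: explicit citation markers look like [3], [1, 2], or [2-4].
EXPLICIT_MARKER = re.compile(r"\[\d+(?:\s*[,-]\s*\d+)*\]")


def eojeol_features(sentence: str) -> List[str]:
    """Eojeol units: tokens separated by whitespace."""
    return sentence.split()


def eumjeol_features(sentence: str, n: int = 2) -> List[str]:
    """Eumjeol (syllable) n-grams over the non-space characters."""
    syllables = [ch for ch in sentence if not ch.isspace()]
    return ["".join(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]


def position_features(sentences: List[str]) -> List[Dict[str, int]]:
    """Distance, in sentences, to the nearest explicit citation sentence
    before (back_dist) and after (fwd_dist) each sentence; -1 if none."""
    explicit = [bool(EXPLICIT_MARKER.search(s)) for s in sentences]
    feats = []
    for i in range(len(sentences)):
        back = next((i - j for j in range(i - 1, -1, -1) if explicit[j]), -1)
        fwd = next((j - i for j in range(i + 1, len(sentences)) if explicit[j]), -1)
        feats.append({"back_dist": back, "fwd_dist": fwd})
    return feats
```

In a setup like the one described, the lexical units would typically be turned into a sparse bag-of-features vector, concatenated with the two distance values, and passed to a classifier; the choice of encoding and classifier here is left open.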
 Keywords
Citation Sentence; Korean Lexical Feature; Machine Learning
 Language
Korean