Email Extraction and Utilization for Author Disambiguation

저자 식별을 위한 전자메일의 추출 및 활용

  • In-Su Kang (School of Computer and Information Science, Kyungsung University)
  • Published: 2008.06.28

Abstract

In a bibliographic record, the author of a paper is represented by a personal name. However, using names to identify authors can lower the recall and precision of paper and author search, since many different individuals may share the same name and one person may write his or her name in different forms. To solve this problem, records that carry the same author name must be disambiguated into distinct persons. As features for author resolution, previous studies have exploited bibliographic attributes such as co-authors, titles, and publication information. This study applies authors' email addresses to author name disambiguation. To this end, we first address the extraction of email addresses from full-text papers, and then evaluate and analyze the effect of email addresses on author resolution using a large-scale test set.
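The two steps named in the abstract, extracting email addresses from full text and using them as a disambiguation feature, can be illustrated with a minimal Python sketch. This is not the method evaluated in the paper: the regular expression, the function names `extract_emails` and `group_by_email`, and the record layout are all assumptions made for illustration. In particular, the sketch assumes that each text snippet has already been associated with the named author, and it does not handle grouped forms such as `{a,b}@host` that appear in real papers.

```python
import re
from collections import defaultdict

# Simplified address pattern (an assumption, not a full RFC 5322 parser).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return lowercased email addresses found in text, in order, without duplicates."""
    found = []
    for match in EMAIL_RE.findall(text):
        email = match.lower()
        if email not in found:
            found.append(email)
    return found


def group_by_email(records):
    """Group same-name records that share an email address.

    records: iterable of (record_id, author_name, text) tuples, where text is
    assumed to be the part of the paper tied to that author (hypothetical layout).
    Returns a dict mapping (author_name, email) to a list of record ids.
    """
    groups = defaultdict(list)
    for record_id, name, text in records:
        for email in extract_emails(text):
            groups[(name, email)].append(record_id)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        ("p1", "J. Kim", "J. Kim (jkim@a.ac.kr)"),
        ("p2", "J. Kim", "Contact: JKim@a.ac.kr."),
        ("p3", "J. Kim", "jkim@b.com"),
    ]
    for key, ids in group_by_email(sample).items():
        print(key, ids)
```

In this sketch, records p1 and p2 fall into one group because they share an address, while p3 forms its own group. Records without any extractable address are left ungrouped and would still need the bibliographic features mentioned above.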

Keywords

Author Disambiguation; Same-Name Authors; E-mail

Cited by

  1. "A Comparative Study on Authority Records for Japanese Writers in Japan and the United States of America," vol.48, pp.1, 2014. https://doi.org/10.4275/KSLIS.2014.48.1.149