Author Entity Identification using Representative Properties in Linked Data

Korean title: 대표 속성을 이용한 저자 개체 식별 (Author Entity Identification Using Representative Properties)

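  • Taehong Kim (김태홍), University of Science and Technology
  • Hanmin Jung (정한민), Korea Institute of Science and Technology Information
  • Won-Kyung Sung (성원경), Korea Institute of Science and Technology Information
  • Pyung Kim (김평), Korea Institute of Science and Technology Information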
  • Received : 2011.12.13
  • Accepted : 2011.12.29
  • Published : 2012.01.28

Abstract

In recent years, Linked Data published under open licenses has grown rapidly and has attracted attention for its interoperability and openness, especially among the governments of developed countries. However, out-links make up a relatively small share of all links, and most of them point to a few hub datasets. This happens because technology for identifying entities in Linked Data is lacking. In this paper, we present an improved author entity resolution method that uses representative properties. Previous methods either rely on relations with other entities (owl:sameAs, owl:differentFrom, and so on) or depend on curation. To address their problems, we design and evaluate an automated, real-time resolution process based on multiple ontologies that takes each entity's type and logical characteristics into account in order to verify entity consistency. The evaluation of author entity resolution shows positive results (an average K measure of 0.8533) on 29 author records whose identities had been confirmed.
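
The abstract reports a K measure without defining it. The sketch below assumes it is the K metric commonly used in author name disambiguation, the geometric mean of average cluster purity (ACP) and average author purity (AAP). That reading is an assumption, not something the abstract confirms. The function name `k_measure` and the label lists are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def k_measure(predicted, truth):
    """Return (ACP, AAP, K) for two parallel lists of cluster labels.

    predicted[r] is the cluster the resolver assigned to record r,
    truth[r] is the confirmed author of record r.
    Assumption: K = sqrt(ACP * AAP), the common K metric.
    """
    if len(predicted) != len(truth) or not predicted:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(predicted)
    pred_sizes = Counter(predicted)            # n_i: size of predicted cluster i
    true_sizes = Counter(truth)                # n_j: size of true author cluster j
    overlap = Counter(zip(predicted, truth))   # n_ij: records in both i and j

    # ACP: how pure each predicted cluster is with respect to true authors.
    acp = sum(c * c / pred_sizes[i] for (i, _), c in overlap.items()) / n
    # AAP: how concentrated each true author is within predicted clusters.
    aap = sum(c * c / true_sizes[j] for (_, j), c in overlap.items()) / n
    return acp, aap, sqrt(acp * aap)


if __name__ == "__main__":
    # Four records: the resolver formed clusters a and b, while the
    # confirmed authors are x (three records) and y (one record).
    acp, aap, k = k_measure(["a", "a", "b", "b"], ["x", "x", "x", "y"])
    print(f"ACP={acp:.4f} AAP={aap:.4f} K={k:.4f}")
```

Under this definition, K equals 1.0 only when the predicted clusters match the confirmed authors exactly. Both mixing authors within a cluster and splitting one author across clusters lower the score.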

Keywords

Linked Data; Author Identification; Entity Resolution; OntoURIResolver
