An XML Schema-based Semantic Data Integration

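  • 김동광 (Department of Computer Engineering, Konkuk University) ;
  • 정갑주 (Division of Internet and Multimedia Engineering, Konkuk University) ;
  • 신효섭 (Division of Internet and Media Engineering, Konkuk University) ;
  • 황선태 (School of Computer Science, Kookmin University)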
  • Published : 2006.09.01

Abstract

Cyber-infrastructures for scientific and engineering applications must integrate heterogeneous legacy data that come in different formats and from various domains. Such data integration raises three challenging issues: (1) support for multiple independently managed schemas, (2) ease of schema evolution, and (3) simple schema mappings. To address these issues, we propose a novel approach to the semantic integration of scientific data that uses XML schemas and RDF-based schema mappings. In this approach, XML Schema allows scientists to manage data models intuitively and to use commodity XML DBMS tools. A simple RDF-based ontological representation scheme is used only to express structural relations among independently managed XML schemas from different institutes or domains. We present the design and implementation of a prototype system developed for KOCEDgrid (http://www.koced.net), the national cyber-environment for civil engineering research activities in Korea, which is similar to the NEES project in the USA.

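Cyber-infrastructures in science and engineering must integrate not only the data in diverse formats produced by research activities across many domains, but also the heterogeneous repositories that store and manage those data. Data integration is difficult for three reasons: (1) support for multiple system-independent data schemas, (2) easy management of schemas that change in diverse ways, and (3) intuitive schema mapping. To address these problems, we propose a new method that defines scientific data models with XML Schema and integrates them semantically through RDF-based schema mappings. Defining data models with XML Schema lets scientists express experimental data intuitively and simply, and it has the advantage that the XML DBMSs already in use in many systems can be employed as they are. For schema mapping, an ontology built in RDF defines the structural relations among the schemas defined in XML Schema, and integrated queries are executed using this mapping information. We applied a prototype of the proposed system to KOCED, a civil engineering project.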
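As an illustration only, the idea of answering one integrated query across independently managed XML sources through structural mappings can be sketched as follows. This is a minimal sketch, not the KOCEDgrid implementation: the site names, element names, the `map:sameAs` predicate, and the helper functions are all hypothetical, and plain Python tuples stand in for RDF triples.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping triples (subject, predicate, object). Plain tuples
# stand in for RDF statements; "map:sameAs" is an assumed predicate name.
MAPPINGS = [
    ("siteA:experiment/specimen/length", "map:sameAs", "siteB:test/sample/len"),
]

# Two independently managed XML documents with different (assumed) schemas.
SOURCES = {
    "siteA": "<experiment><specimen><length>3.5</length></specimen></experiment>",
    "siteB": "<test><sample><len>4.0</len></sample></test>",
}


def equivalent_paths(path):
    """Return the path plus every path directly linked to it by map:sameAs."""
    result = {path}
    for subject, predicate, obj in MAPPINGS:
        if predicate != "map:sameAs":
            continue
        if subject == path:
            result.add(obj)
        elif obj == path:
            result.add(subject)
    return result


def integrated_query(path):
    """Evaluate the path on every source that holds an equivalent element."""
    values = []
    for target in sorted(equivalent_paths(path)):
        site, element_path = target.split(":", 1)
        root = ET.fromstring(SOURCES[site])
        root_tag, _, rest = element_path.partition("/")
        if root.tag != root_tag:
            continue  # the mapped path does not match this document's root
        nodes = root.findall(rest) if rest else [root]
        values.extend((site, node.text) for node in nodes)
    return values


if __name__ == "__main__":
    print(integrated_query("siteA:experiment/specimen/length"))
```

The sketch follows only direct mappings and treats them as symmetric. A real system would presumably store the mappings in an RDF store and rewrite queries in an XML query language rather than resolve paths in memory.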
