A Tree-structured XPath Query Reduction Scheme for Enhancing XML Query Processing Performance

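  • 이민수 (Department of Computer Science, Ewha Womans University) ;
  • 김윤미 (Department of Computer Science, Ewha Womans University) ;
  • 송수경 (Department of Computer Science, Ewha Womans University)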
  • Published : 2007.10.31

Abstract

XML data generally has a hierarchical tree structure, and the mechanisms for storing and retrieving XML data reflect this. When XML data is stored in a database, it is therefore restructured so that the hierarchical relationships among the XML elements are preserved, and supporting user search queries requires a way to compute the hierarchical relationships between the element structures specified in a query. The structural join operation is one solution to this problem: it computes hierarchical relationships efficiently in an XML database based on a node numbering scheme. However, processing a tree-structured XML query that contains complex nested hierarchical relationships still requires multiple structural joins, which makes query execution costly. In this paper we therefore present a preprocessing mechanism that reduces the cost of these multiple nested structural joins. It builds on a query path reduction algorithm from prior work, which applies the concept of equivalence classes to shorten path queries written as regular expressions, and we propose in particular a reduction algorithm for path queries containing branch nodes. Experimental results show that the proposed algorithm reduces the time required to process path queries to 1/3 of the original execution time on average.
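To make the cost argument concrete, the following is a minimal sketch of a structural join over a region-based node numbering, assuming each element is labelled with a (start, end, level) triple so that an ancestor's interval strictly contains its descendants' intervals. The `Node` type, the tiny `INDEX` document, and the `evaluate_path` helper are hypothetical names introduced only for illustration. The join is a plain nested loop for readability, not one of the optimized merge-based algorithms, and the shortened query at the end is chosen by hand for this particular document; it is not the output of the paper's reduction algorithm.

```python
from typing import Dict, List, NamedTuple, Tuple


class Node(NamedTuple):
    """One XML element labelled by a region-based numbering (illustrative)."""
    tag: str
    start: int
    end: int
    level: int


def is_ancestor(a: Node, d: Node) -> bool:
    # a is an ancestor of d exactly when a's interval strictly contains d's.
    return a.start < d.start and d.end < a.end


def structural_join(ancestors: List[Node],
                    descendants: List[Node]) -> List[Tuple[Node, Node]]:
    """Return every (ancestor, descendant) pair; nested loop for clarity."""
    return [(a, d) for a in ancestors for d in descendants if is_ancestor(a, d)]


def evaluate_path(steps: List[str],
                  index: Dict[str, List[Node]]) -> Tuple[List[Node], int]:
    """Evaluate //s1//s2//...//sk with one structural join per adjacent pair.

    Returns the nodes matching the last step and the number of joins run.
    """
    current = index[steps[0]]
    joins = 0
    for tag in steps[1:]:
        pairs = structural_join(current, index[tag])
        # Keep the distinct descendants, in document order.
        current = sorted({d for _, d in pairs}, key=lambda n: n.start)
        joins += 1
    return current, joins


# Hypothetical document:
# <site><people><person><name/></person><person><name/></person></people></site>
INDEX: Dict[str, List[Node]] = {
    "site": [Node("site", 1, 12, 0)],
    "people": [Node("people", 2, 11, 1)],
    "person": [Node("person", 3, 6, 2), Node("person", 7, 10, 2)],
    "name": [Node("name", 4, 5, 3), Node("name", 8, 9, 3)],
}

full_result, full_joins = evaluate_path(["site", "people", "person", "name"], INDEX)
short_result, short_joins = evaluate_path(["site", "name"], INDEX)
print(full_joins, short_joins, full_result == short_result)
```

In this toy document every `name` element lies under `site/people/person`, so the four-step path and the hand-shortened two-step path return the same nodes, while the shortened form runs one structural join instead of three. A reduction scheme of the kind described in the abstract aims to find such equivalent shorter paths automatically.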
