Segment-Based Inverted Index for Querying Large XML Documents

Segment-Based Inverted Index for Efficient Query Processing of Large XML Documents

  • 정병수 (School of Electronics and Information, Kyung Hee University)
  • 이혜자 (Department of Medical Information Systems, Yong-in Songdam College)
  • Published : 2008.09.30

Abstract

Existing XML storage methods that use the relational data model usually store path information for every node type, including literal contents, in order to preserve the structural information of XML documents. Such path information is usually maintained in an inverted index so that XPath queries over large XML documents can be processed efficiently. In this study, we propose an improved approach that retrieves information from a large volume of XML documents stored in a relational database by using a segment-based inverted index for path searches. The new approach reduces the number of inverted-index searches needed to obtain the target path information. We show its effectiveness through several experiments that compare XPath query performance against existing methods.
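
The abstract does not describe the index layout, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that a "segment" is a run of `seg_len` consecutive labels in a root-to-node path, and it compares a per-label inverted index (`seg_len = 1`) with a segment-based one (`seg_len = 2`) on a hypothetical set of paths. The names `build_index`, `search`, and `contains_run`, as well as the sample data, are invented for illustration.

```python
from collections import defaultdict


def build_index(paths, seg_len):
    """Map each run of `seg_len` consecutive labels to the ids of paths containing it."""
    index = defaultdict(set)
    for pid, path in enumerate(paths):
        labels = path.strip("/").split("/")
        for i in range(len(labels) - seg_len + 1):
            index[tuple(labels[i:i + seg_len])].add(pid)
    return index


def contains_run(labels, run):
    """True if `run` occurs as a contiguous sub-list of `labels`."""
    n = len(run)
    return any(labels[i:i + n] == run for i in range(len(labels) - n + 1))


def search(paths, index, seg_len, query):
    """Return (matching path ids, number of index lookups) for a query such as '//a/b/c'."""
    q = query.strip("/").split("/")
    if len(q) < seg_len:
        raise ValueError("query is shorter than one segment")
    # Cover the query with segments; the last segment may overlap the previous one.
    starts = list(range(0, len(q) - seg_len + 1, seg_len))
    if starts[-1] + seg_len < len(q):
        starts.append(len(q) - seg_len)
    candidates = None
    for s in starts:
        hits = index.get(tuple(q[s:s + seg_len]), set())
        candidates = hits if candidates is None else candidates & hits
    # The intersection is only a filter; confirm each candidate against the full path.
    matches = sorted(
        pid for pid in candidates
        if contains_run(paths[pid].strip("/").split("/"), q)
    )
    return matches, len(starts)


# Hypothetical path table, standing in for paths stored in a relational database.
paths = [
    "/lib/book/chapter/section/title",
    "/lib/book/title",
    "/lib/journal/article/section/title",
    "/lib/book/chapter/title",
]
query = "//chapter/section/title"

by_label = search(paths, build_index(paths, 1), 1, query)
by_segment = search(paths, build_index(paths, 2), 2, query)
print("per-label index:", by_label)
print("segment index:  ", by_segment)
```

In this toy setup both indexes return the same matching path, but the segment-based index answers the three-label query with two lookups where the per-label index needs three. The trade-off in this sketch is a larger index, since every distinct label pair becomes a key.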
