Selectivity Estimation Using Compressed Spatial Histogram

A Selectivity Estimation Technique Using Compressed Spatial Histograms

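  • Jeong Hee Chi (Department of Computer Science, Graduate School, Chungbuk National University) ;
  • Jin Yul Lee (Point-I) ;
  • Sang Ho Kim (Department of Computer Science, Graduate School, Chungbuk National University) ;
  • Keun Ho Ryu (School of Electrical and Computer Engineering, Chungbuk National University)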
  • Published : 2004.04.01

Abstract

Selectivity estimation for spatial queries is an important step in finding the most efficient execution plan. Much work has been done on estimating selectivity accurately. Although existing methods address problems such as false counts and multi-counts, they cannot achieve these results in a small memory space. We therefore propose a new technique, the MW Histogram, which compresses summary data while still producing reasonable estimates and has a flexible structure that accommodates dynamic updates. Our method is based on two techniques: (a) the MinSkew partitioning algorithm, which handles skewed spatial datasets efficiently, and (b) the wavelet transform, whose compression effect is proven. The experimental results show that the MW Histogram whose ratio of buckets to wavelet coefficients is 0.3 has a lower relative error than the MinSkew Histogram for 5%-20% queries, demonstrating that the MW Histogram achieves good selectivity estimates in little memory.
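To illustrate the wavelet compression step named in the abstract, the following is a minimal sketch of a one-dimensional Haar transform applied to a list of bucket counts. It is not the authors' implementation: the function names (`haar_decompose`, `keep_largest`, `haar_reconstruct`), the use of the unnormalized Haar basis, and the rule of keeping the k largest-magnitude coefficients are all assumptions made for illustration.

```python
def haar_decompose(values):
    """Unnormalized Haar transform by repeated pairwise averaging and differencing.

    Returns [overall average, coarsest detail, ..., finest details].
    The input length must be a power of two (an assumption of this sketch).
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = [float(v) for v in values]
    details_all = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        details = [(current[i] - current[i + 1]) / 2 for i in pairs]
        # Coarser levels go in front of finer ones.
        details_all = details + details_all
        current = averages
    return current + details_all


def keep_largest(coeffs, k):
    """Zero out all but the k coefficients with the largest magnitude."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:k])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


def haar_reconstruct(coeffs):
    """Invert haar_decompose."""
    current = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(current)]
        expanded = []
        for avg, det in zip(current, details):
            expanded.extend([avg + det, avg - det])
        pos += len(current)
        current = expanded
    return current


# Hypothetical bucket counts, chosen only for illustration.
counts = [2, 2, 0, 2, 3, 5, 4, 4]
coeffs = haar_decompose(counts)
approx = haar_reconstruct(keep_largest(coeffs, 4))
```

Storing only the retained coefficients (with their positions) is what saves space; the reconstruction from fewer coefficients is an approximation of the original counts. Ranking by raw magnitude is a simplification here, since wavelet synopses are often built by ranking normalized coefficients instead.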

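Selectivity estimation for spatial queries is a very important step in finding the most efficient execution plan. When the spatial domain is large, the summary data used in existing work estimates selectivity from relatively little information, so it is difficult to maintain good selectivity estimates. This paper therefore proposes the MW histogram, a new technique that compresses spatial summary data into a small storage space. The histogram combines the MinSkew partitioning algorithm with the wavelet transform, so it obtains reasonable selectivity estimates and a compression effect even in little storage, and its structure copes efficiently with dynamic updates. The experimental results show that the MW histogram with 0.3M/6 buckets performs well on average for 5%-20% queries, confirming that the MW histogram obtains better selectivity estimates in a small storage space.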
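The abstract evaluates histograms by the relative error of their selectivity estimates. As a rough sketch of how a bucket-based estimate and its relative error can be computed, the code below assumes that objects are spread uniformly inside each rectangular bucket and treats them as points, ignoring object extent. The `Bucket` type, the function names, and the sample numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Axis-aligned rectangular bucket holding an object count (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float
    count: float


def overlap_area(b, q):
    """Area of the intersection of bucket b and query rectangle q = (x1, y1, x2, y2)."""
    width = min(b.x2, q[2]) - max(b.x1, q[0])
    height = min(b.y2, q[3]) - max(b.y1, q[1])
    return max(width, 0.0) * max(height, 0.0)


def estimate_count(buckets, query):
    """Estimated number of objects in the query window.

    Assumes a uniform distribution inside each bucket, so each bucket
    contributes its count scaled by the fraction of its area that is covered.
    """
    total = 0.0
    for b in buckets:
        area = (b.x2 - b.x1) * (b.y2 - b.y1)
        if area > 0:
            total += b.count * overlap_area(b, query) / area
    return total


def relative_error(estimated, actual):
    """Relative error |estimated - actual| / actual; actual must be non-zero."""
    return abs(estimated - actual) / actual


# Illustrative values only.
buckets = [Bucket(0, 0, 10, 10, 100), Bucket(10, 0, 20, 10, 20)]
estimated = estimate_count(buckets, (5, 0, 15, 10))
error = relative_error(estimated, 50)
```

Dividing the estimated count by the total number of objects gives the selectivity as a fraction. The uniformity assumption is the reason skew-aware partitioning matters: the estimate is only as good as the buckets are internally uniform.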
