Attribute-based Approach for Multiple Continuous Queries over Data Streams

  • Hyun-Ho Lee (Department of Computer Science, Yonsei University);
  • Won-Suk Lee (Department of Computer Science, Yonsei University)
  • Published: 2007.08.31

Abstract

A data stream is a massive, unbounded sequence of data elements continuously generated at a rapid rate. Query processing over such a stream must likewise be continuous and rapid, and it must operate under strict time and space constraints. To meet these constraints efficiently, most data stream management systems (DSMSs) group or index the selection predicates of the registered continuous queries. This paper proposes a new structure called an ASC (Attribute Selection Construct), which groups the selection predicates of multiple continuous queries by attribute and evaluates all predicates on the same attribute collectively. An ASC holds information useful for efficient query processing: whether its attribute is used in a given query condition, partially precalculated matching results, and selectivity statistics for its selection predicates. Because the order in which the ASCs for the attributes of a base data stream are processed can significantly influence the overall performance of multiple query evaluation, a strategy for determining an efficient ASC evaluation order is also proposed. Finally, the performance of the proposed method is analyzed through a series of experiments, including comparisons with existing methods, to identify its various characteristics.
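The abstract does not give the internal layout of an ASC or the ordering strategy, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes every selection predicate is a half-open range `low <= value < high` and that each query is a conjunction of such predicates. The names `ASC`, `filter`, `drop_rate`, and `evaluate` are hypothetical, and ordering ASCs by observed drop rate is a common heuristic swapped in for illustration, not the strategy proposed in the paper.

```python
from bisect import bisect_right


class ASC:
    """Hypothetical attribute selection construct for a single attribute."""

    def __init__(self, attribute, predicates):
        self.attribute = attribute
        # predicates: query id -> (low, high), meaning low <= value < high
        self.predicates = dict(predicates)
        # Usage status: which queries have a predicate on this attribute.
        self.users = frozenset(self.predicates)
        # Partially precalculated results: for each elementary interval that
        # starts at a sorted predicate bound, the queries whose predicate holds.
        self.bounds = sorted({b for rng in self.predicates.values() for b in rng})
        self.matches = [
            frozenset(q for q, (lo, hi) in self.predicates.items() if lo <= b < hi)
            for b in self.bounds
        ]
        # Selectivity statistics (assumed form): candidates seen vs. rejected.
        self.seen = 0
        self.dropped = 0

    def filter(self, value, candidates):
        """Remove from candidates every query whose predicate here fails."""
        i = bisect_right(self.bounds, value) - 1
        passing = self.matches[i] if i >= 0 else frozenset()
        using = candidates & self.users
        failed = using - passing
        self.seen += len(using)
        self.dropped += len(failed)
        return candidates - failed

    def drop_rate(self):
        return self.dropped / self.seen if self.seen else 0.0


def evaluate(record, ascs, query_ids):
    """Return the ids of the queries that the record satisfies."""
    candidates = set(query_ids)
    # Assumed heuristic: visit the ASC that has rejected the most first.
    for asc in sorted(ascs, key=ASC.drop_rate, reverse=True):
        if not candidates:
            break  # every query is already rejected, skip remaining ASCs
        candidates = asc.filter(record[asc.attribute], candidates)
    return candidates


# Three illustrative queries over a stream with attributes price and qty:
#   q1: 10 <= price < 50 AND 0 <= qty < 100
#   q2: 20 <= price < 30
#   q3: 5 <= qty < 10
ascs = [
    ASC("price", {"q1": (10, 50), "q2": (20, 30)}),
    ASC("qty", {"q1": (0, 100), "q3": (5, 10)}),
]
queries = ["q1", "q2", "q3"]
```

In this sketch the evaluation order changes only how much work is done before the candidate set empties, not which queries match, which is why an ordering strategy can be tuned independently of correctness.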
