Efficient Processing of Continuous Join Queries between a Data Stream and Multiple Relations for Real-Time Analysis of E-Commerce Data

전자상거래 데이터의 실시간 분석을 위한 데이터 스트림과 다수 릴레이션 간의 효율적인 연속 조인 처리 기법

  • Kim, Haeri (Division of Computer Science, Sookmyung Women's University)
  • Lee, Ki Yong (Division of Computer Science, Sookmyung Women's University)
  • Received : 2013.07.17
  • Accepted : 2013.08.09
  • Published : 2013.08.31

Abstract

As e-commerce data has become available in real time, demand for real-time analysis of that data has grown significantly. A key requirement for such analysis is the efficient processing of continuous join queries between an e-commerce data stream and large disk-based relations. In this paper, we propose an efficient method for processing a continuous join query between an e-commerce data stream and multiple disk-based relations. Compared with the previous method, the proposed method significantly improves the service rate while substantially reducing the amount of memory required. Through analysis and a range of experiments, we show that the proposed method is more efficient than the previous method in terms of both service rate and memory usage.
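To make the query type concrete, the following is a minimal sketch of a continuous join between a stream and multiple relations. It is not the method proposed in the paper: it only illustrates what such a query computes. The relations are modeled as in-memory dictionaries standing in for disk-based relations, and the names `continuous_join`, `products`, `customers`, and `orders` are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Optional

Row = Dict[str, object]


def continuous_join(
    stream: Iterable[Row],
    relations: List[Dict[object, Row]],
    keys: List[str],
) -> Iterator[Row]:
    """Join each arriving stream tuple with every relation on its join key.

    Assumption: each relation is a dict keyed by its join attribute. A real
    system would read the relations from disk, which is what makes the
    service rate (stream tuples processed per unit time) hard to keep high.
    """
    for tup in stream:
        joined: Optional[Row] = dict(tup)
        for relation, key in zip(relations, keys):
            match = relation.get(tup[key])
            if match is None:
                # No partner in this relation, so the tuple yields no result.
                joined = None
                break
            joined.update(match)
        if joined is not None:
            yield joined


# Hypothetical sample data: an order stream joined with two relations.
products = {1: {"product_name": "pen"}, 2: {"product_name": "ink"}}
customers = {10: {"customer_city": "Seoul"}}
orders = [
    {"order_id": 100, "product_id": 1, "customer_id": 10},
    {"order_id": 101, "product_id": 3, "customer_id": 10},  # no matching product
    {"order_id": 102, "product_id": 2, "customer_id": 10},
]

results = list(
    continuous_join(orders, [products, customers], ["product_id", "customer_id"])
)
```

Because the function is a generator, results are emitted as stream tuples arrive rather than after the whole input has been read, which is the sense in which the join is continuous.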
