In-Memory Based Incremental Processing Method for Stream Query Processing in Big Data Environments

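  • Kyoungsoo Bok (Department of Information and Communication Engineering, Chungbuk National University) ;
  • Misun Yook (Department of Information and Communication Engineering, Chungbuk National University) ;
  • Yeonwoo Noh (Department of Information and Communication Engineering, Chungbuk National University) ;
  • Jieun Han (Department of Information and Communication Engineering, Chungbuk National University) ;
  • Yeonwoo Kim (Department of Information and Communication Engineering, Chungbuk National University) ;
  • Jongtae Lim (Department of Information and Communication Engineering, Chungbuk National University) ;
  • Jaesoo Yoo (Department of Information and Communication Engineering, Chungbuk National University)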
  • Received : 2015.11.15
  • Accepted : 2015.12.08
  • Published : 2016.02.28


Recently, distributed processing of massive amounts of stream data has been actively studied. In this paper, we propose an in-memory incremental stream data processing method for big data environments. The proposed method stores input data in a temporary queue and compares them with the data registered in a master node. If the data are found in the master node, the proposed method reuses the previous processing results located in the node chosen by the master node. If there are no previous results for the data in the node, the proposed method processes the data and stores the result in a separate node. We also propose a job scheduling technique that considers the load and performance of each node. To show the superiority of the proposed method, we compare it with the existing method in terms of query processing time. The experimental results show that the proposed method outperforms the existing method.
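
The flow described in the abstract can be illustrated with a minimal, single-process Python sketch. It is not the authors' implementation: the names Worker, Master, submit, pick_worker, and drain are hypothetical, the nodes are simulated as in-memory objects, and the scheduling score (assigned jobs divided by a relative performance value) is only an assumed stand-in for the load- and performance-aware scheduling the paper proposes.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, Hashable, List


@dataclass
class Worker:
    """Hypothetical worker node that keeps processing results in memory."""
    name: str
    performance: float  # assumed relative speed; higher means faster
    load: int = 0       # number of jobs assigned so far
    cache: Dict[Hashable, object] = field(default_factory=dict)


class Master:
    """Hypothetical master node: records which worker holds each result."""

    def __init__(self, workers: List[Worker], compute: Callable):
        self.workers = workers
        self.compute = compute
        self.index: Dict[Hashable, Worker] = {}  # data item -> worker with its result
        self.queue: Deque[Hashable] = deque()    # temporary queue for incoming data
        self.reused = 0
        self.computed = 0

    def submit(self, item: Hashable) -> None:
        """Buffer an incoming stream item in the temporary queue."""
        self.queue.append(item)

    def pick_worker(self) -> Worker:
        """Assumed scheduling rule: lowest (load + 1) / performance wins."""
        return min(self.workers, key=lambda w: (w.load + 1) / w.performance)

    def drain(self) -> list:
        """Process queued items, reusing stored results where they exist."""
        results = []
        while self.queue:
            item = self.queue.popleft()
            worker = self.index.get(item)
            if worker is not None:
                # The item is known to the master: reuse the stored result.
                results.append(worker.cache[item])
                self.reused += 1
            else:
                # No previous result: compute it, store it on a chosen worker,
                # and register that worker in the master's index.
                worker = self.pick_worker()
                worker.load += 1
                value = self.compute(item)
                worker.cache[item] = value
                self.index[item] = worker
                results.append(value)
                self.computed += 1
        return results


if __name__ == "__main__":
    master = Master([Worker("fast", 2.0), Worker("slow", 1.0)], lambda x: x * x)
    for item in [3, 4, 3, 5, 4]:
        master.submit(item)
    print(master.drain(), master.computed, master.reused)
```

In this sketch, repeated items are answered from a worker's in-memory cache instead of being recomputed, which is the kind of reuse the abstract credits for the shorter query processing time. A real deployment would also need distributed storage, result eviction, and fault handling, none of which are modeled here.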


Big Data; In-memory; Distributed Processing; Real-time Processing; Streaming Data


Supported by : Institute for Information & Communications Technology Promotion (IITP), National Research Foundation of Korea (NRF)

