Dynamic Load Management Method for Spatial Data Stream Processing on MapReduce Online Frameworks

  • Jeong, Weonil (정원일) (Division of Computer and Information Engineering, Hoseo University)
  • Received : 2018.04.13
  • Accepted : 2018.08.03
  • Published : 2018.08.31

Abstract

As mobile devices equipped with various sensors and high-quality wireless networking become widespread, the amount of spatio-temporal data they generate across service domains is increasing rapidly. Existing Hadoop-based spatial big data systems are designed as batch processing platforms, which makes them very difficult to apply to real-time services over large volumes of spatial data streams. This paper extends the MapReduce Online framework to support real-time query processing over continuously arriving spatial data streams, and proposes a load management method that distributes the overload that can arise during query processing. The proposed method balances load dynamically across nodes using the inflow rate and load factor of the input data, based on a partitioning of space into regions. Experiments show that, when a specific spatial region requires load management, distributing that region's spatial data stream to shared resources supports efficient query processing.
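
The abstract gives no formulas, so the following Python sketch only illustrates the general idea of region-based load management. Everything in it is an assumption rather than the paper's method: the uniform grid partitioning, the definition of load factor as inflow rate divided by node capacity, the threshold value, round-robin redistribution, and all names (`GridLoadManager`, `shared-0`, `node-2-0`).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class GridLoadManager:
    """Hypothetical sketch: per-region inflow tracking with overflow to shared nodes."""
    cell_size: float        # side length of a square grid cell (assumed partitioning)
    capacity: float         # tuples per second one node can process (assumed)
    threshold: float = 0.8  # load factor above which a region is overloaded (assumed)
    shared_nodes: list = field(default_factory=lambda: ["shared-0", "shared-1"])
    counts: Counter = field(default_factory=Counter)
    _next_shared: int = 0

    def cell_of(self, x, y):
        # Map a point to the grid cell (spatial region) that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def observe(self, x, y):
        # Count one arriving tuple for its region within the current window.
        self.counts[self.cell_of(x, y)] += 1

    def load_factor(self, cell, window_seconds):
        # Assumed definition: inflow rate of the region relative to node capacity.
        inflow_rate = self.counts[cell] / window_seconds
        return inflow_rate / self.capacity

    def route(self, x, y, window_seconds):
        # Regions under the threshold stay on their own node; tuples from
        # overloaded regions are spread round-robin over the shared nodes.
        cell = self.cell_of(x, y)
        if self.load_factor(cell, window_seconds) <= self.threshold:
            return f"node-{cell[0]}-{cell[1]}"
        node = self.shared_nodes[self._next_shared % len(self.shared_nodes)]
        self._next_shared += 1
        return node


manager = GridLoadManager(cell_size=10.0, capacity=5.0)
for _ in range(50):
    manager.observe(3.0, 4.0)   # 50 tuples in region (0, 0): overloaded
manager.observe(25.0, 4.0)      # 1 tuple in region (2, 0): lightly loaded

hot = [manager.route(3.0, 4.0, window_seconds=1.0) for _ in range(3)]
quiet = manager.route(25.0, 4.0, window_seconds=1.0)
print(hot, quiet)
```

In this sketch only the overloaded region is diverted to shared nodes, while lightly loaded regions keep their original assignment.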
