Research of Performance Interference Control Technique for Heterogeneous Services in Bigdata Platform
Jin, Kisung; Lee, Sangmin; Kim, Youngkyun
In the Hadoop-based Big Data analysis model, moving data between the legacy system and the analysis system is difficult to avoid. To overcome this problem, a unified Big Data file system has been introduced so that a single platform can support both the legacy service and the analysis service. However, a major challenge remains: avoiding the performance degradation caused by interference between the two services. To address it, we first performed a real-life simulation and observed resource utilization, workload characteristics, and the I/O balance level. Based on this analysis, we propose two solutions, one at the system level and one at the technical level. At the system level, we divide the I/O path into a legacy I/O path and an analysis I/O path. At the technical level, we introduce an aggressive prefetch method for the analysis service, which requires sequential reads. We also present experimental results that show an outstanding performance gain compared with the previous system.
bigdata; filesystem; unified bigdata filesystem; I/O interference
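
The abstract does not describe how the aggressive prefetch method works internally. As a rough illustration only, the following Python sketch shows one generic readahead policy for a sequential reader: the prefetch window doubles while reads stay sequential and resets on a random access. The class name, block size, and window limits are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SequentialPrefetcher:
    """Toy readahead policy (an assumption, not the paper's algorithm)."""

    block_size: int = 64 * 1024   # hypothetical block size in bytes
    initial_window: int = 1       # blocks prefetched on the first sequential hit
    max_window: int = 32          # hypothetical cap on the prefetch window
    _next_expected: int = field(default=-1, init=False)
    _window: int = field(default=0, init=False)

    def on_read(self, offset: int, length: int) -> List[int]:
        """Return the block indices to prefetch after serving this read."""
        if offset == self._next_expected:
            # Sequential access: double the window, bounded by max_window.
            grown = max(self._window * 2, self.initial_window)
            self._window = min(grown, self.max_window)
        else:
            # Random access: stop prefetching until a sequence is seen again.
            self._window = 0
        end = offset + length
        self._next_expected = end
        # First block that starts at or after the end of this read (ceiling division).
        first_block = -(-end // self.block_size)
        return list(range(first_block, first_block + self._window))


if __name__ == "__main__":
    prefetcher = SequentialPrefetcher(block_size=4, max_window=4)
    for offset in (0, 4, 8, 12, 100):
        print(offset, prefetcher.on_read(offset, 4))
```

In this sketch a legacy-style random workload would keep the window at zero, while an analysis-style sequential scan would quickly reach the cap. A real file system would also track which blocks are already cached so they are not fetched twice.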