Parallel Rabin Fingerprinting on GPGPU for Efficient Data Deduplication

  • Received : 2013.11.26
  • Accepted : 2014.06.17
  • Published : 2014.09.15

Abstract

Rabin fingerprinting, which is used for chunking, accounts for the largest share of computation time in data deduplication. In this paper, we therefore propose parallel Rabin fingerprinting on GPGPU for efficient data deduplication. To parallelize Rabin fingerprinting efficiently, we consider four issues. First, when dividing the input data stream into fixed-size data sections for parallel processing, we handle the data located near the boundaries between sections so that Rabin fingerprints are also computed across those boundaries. Second, we exploit the computational characteristics of Rabin fingerprinting for efficient operation. Third, we account for chunk boundaries that may differ from those produced by sequential Rabin fingerprinting when the parallel method is applied. Finally, we optimize GPGPU memory access. Parallel Rabin fingerprinting on GPGPU performs about 16 times better than sequential Rabin fingerprinting on CPU and about 5.3 times better than parallel Rabin fingerprinting on CPU. This improvement in Rabin fingerprinting throughput can improve the overall performance of data deduplication.
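The first issue above, computing fingerprints across section boundaries, can be illustrated with a minimal Python sketch. It is not the paper's GPGPU implementation. It assumes a Rabin-Karp style polynomial rolling hash modulo a prime as a stand-in for true Rabin fingerprinting over GF(2), and every name and constant in it (`WINDOW`, `BASE`, `MOD`, `MASK`, `window_fingerprints`, `sectioned_fingerprints`, `candidate_boundaries`) is hypothetical. Each section is extended by `window - 1` bytes into its successor, so the windows that straddle a boundary are still fingerprinted, and the per-section results match a single sequential pass.

```python
from typing import List

# All constants are assumed values chosen for illustration only.
WINDOW = 48            # sliding-window length in bytes
BASE = 257             # polynomial base of the stand-in rolling hash
MOD = (1 << 61) - 1    # prime modulus of the stand-in rolling hash
MASK = (1 << 6) - 1    # a boundary is declared when fp & MASK == 0


def window_fingerprints(data: bytes, window: int = WINDOW) -> List[int]:
    """Sequential pass: result[i] is the fingerprint of data[i:i + window]."""
    if len(data) < window:
        return []
    top = pow(BASE, window - 1, MOD)  # weight of the byte leaving the window
    fp = 0
    for b in data[:window]:
        fp = (fp * BASE + b) % MOD
    out = [fp]
    for i in range(window, len(data)):
        # Remove the oldest byte, shift, then append the newest byte.
        fp = ((fp - data[i - window] * top) * BASE + data[i]) % MOD
        out.append(fp)
    return out


def sectioned_fingerprints(data: bytes, section_size: int,
                           window: int = WINDOW) -> List[int]:
    """Section-wise pass: each section owns `section_size` window start
    positions and reads window - 1 extra bytes past its own end."""
    n_windows = len(data) - window + 1
    out: List[int] = []
    for start in range(0, max(n_windows, 0), section_size):
        end = min(start + section_size, n_windows)
        piece = data[start:end + window - 1]  # overlap into the next section
        # On a GPGPU, each piece would be handled by a separate thread.
        out.extend(window_fingerprints(piece, window))
    return out


def candidate_boundaries(fps: List[int], window: int = WINDOW,
                         mask: int = MASK) -> List[int]:
    """Offsets (end of window) where the fingerprint matches the mask."""
    return [i + window for i, fp in enumerate(fps) if fp & mask == 0]


# Deterministic sample input; both passes must agree on every fingerprint.
data = bytes((i * i * 31 + i * 7 + 3) % 256 for i in range(4096))
sequential = window_fingerprints(data)
parallel = sectioned_fingerprints(data, section_size=512)
assert parallel == sequential
assert candidate_boundaries(parallel) == candidate_boundaries(sequential)
```

The loop over sections runs serially here and only models the data partitioning. The sketch does not address the third and fourth issues (chunk boundaries that differ from the sequential result, and GPGPU memory access).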

Acknowledgement

Grant : Human Friendly Devices (Skin Patch, Multi-modal Surface) and Device Social Framework Technology

Supported by : KEIT
