Data De-duplication and Recycling Technique in SSD-based Storage System for Increasing De-duplication Rate and I/O Performance

  • Submitted: 2012.10.25
  • Published: 2012.12.25

Abstract

An SSD (Solid State Disk) is a storage device built from many NAND flash memory chips, together with a high-performance controller and a cache buffer. Because NAND flash memory does not support in-place overwrites, a valid page becomes an invalid page when the file system updates or deletes it, and the invalid page is physically erased only through garbage collection. Garbage collection, however, includes erase operations with long latency, so it lowers the I/O performance of the SSD and increases wear. In this paper, we propose a technique that checks incoming data for duplicates against both valid data and invalid data. Incoming data first goes through de-duplication against valid data and then through recycling of invalid data, which raises the de-duplication rate. This reduces the number of writes and garbage collections in the SSD, improving both wear and I/O performance. In our experiments, compared with the general case that performs neither valid-data de-duplication nor invalid-data recycling, the proposed technique reduced the number of garbage collections by up to 20% and I/O latency by 9%.
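The two-stage check described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `ToySSD`, its fields, the use of SHA-1 fingerprints over whole pages, and the outcome labels are all assumptions made for illustration, and garbage collection itself is not modeled. The sketch only shows the order of the checks: an incoming page is first compared against valid data (de-duplication), then against invalid data (recycling), and is programmed to flash only if both checks miss.

```python
import hashlib


class ToySSD:
    """Hypothetical page-level model of the two-stage duplicate check."""

    def __init__(self):
        self.mapping = {}      # logical page number -> content fingerprint
        self.valid = {}        # fingerprint -> reference count of a valid page
        self.invalid = set()   # fingerprints of invalid pages not yet erased
        self.flash_writes = 0  # physical page programs actually issued

    def trim(self, lpn):
        """Drop a logical page; its physical page turns invalid at refcount 0."""
        old = self.mapping.pop(lpn, None)
        if old is None:
            return
        self.valid[old] -= 1
        if self.valid[old] == 0:
            del self.valid[old]
            self.invalid.add(old)

    def write(self, lpn, data: bytes) -> str:
        fp = hashlib.sha1(data).hexdigest()  # assumed fingerprint function
        if self.mapping.get(lpn) == fp:
            return "dedup"                   # same content rewritten in place
        self.trim(lpn)                       # old content is no longer referenced
        if fp in self.valid:                 # stage 1: de-duplicate against valid data
            self.valid[fp] += 1
            outcome = "dedup"
        elif fp in self.invalid:             # stage 2: recycle a matching invalid page
            self.invalid.remove(fp)
            self.valid[fp] = 1
            outcome = "recycle"
        else:                                # both checks missed: program a new page
            self.valid[fp] = 1
            self.flash_writes += 1
            outcome = "write"
        self.mapping[lpn] = fp
        return outcome


if __name__ == "__main__":
    ssd = ToySSD()
    for lpn, data in [(0, b"A"), (1, b"A"), (0, b"B"), (1, b"C"), (2, b"A")]:
        print(lpn, data, ssd.write(lpn, data))
    print("flash writes:", ssd.flash_writes)
```

In this toy model, a page whose content matches an invalid page is revalidated instead of being programmed again, which is why both the write count and the pool of pages waiting for garbage collection shrink. A real flash translation layer would also have to track physical addresses and remove fingerprints from the invalid set when blocks are erased.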
