• Title/Summary/Keyword: Write amplification


Improvement of RocksDB Performance via Large-Scale Parameter Analysis and Optimization

  • Jin, Huijun;Choi, Won Gi;Choi, Jonghwan;Sung, Hanseung;Park, Sanghyun
    • Journal of Information Processing Systems / v.18 no.3 / pp.374-388 / 2022
  • Database systems usually have many parameters that must be configured by database administrators and users. RocksDB achieves fast write performance using a log-structured merge tree. This database has many parameters associated with write and space amplification. Write amplification degrades database performance, and space amplification increases storage consumption by retaining unwanted data. Previous work has shown that significant performance improvements can be achieved by tuning database parameters. However, tuning the multiple parameters of a database is a laborious task owing to the large number of potential configuration combinations. To address this problem, we selected the parameters that most affect the performance of RocksDB using random forest. We then analyzed the effects of the selected parameters on write and space amplification using analysis of variance. We used a genetic algorithm to obtain optimized values of the major parameters. The experimental results indicate an insignificant reduction (-5.64%) in execution time when using these optimized values; however, write amplification, space amplification, and the data processing rate improved considerably, by 20.65%, 54.50%, and 89.68%, respectively, compared to the default settings.
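
The parameter-selection step described in this abstract can be illustrated with a short, hedged sketch: rank candidate options by random-forest feature importance before running ANOVA and the genetic algorithm. This is a minimal illustration under assumed inputs, not the authors' implementation; the file "runs.csv", its columns, and the particular option names (write_buffer_size, max_write_buffer_number, level0_file_num_compaction_trigger, max_bytes_for_level_base are real RocksDB options used here only as examples) are assumptions.

    # Minimal sketch: rank RocksDB parameters by random-forest importance.
    # Assumes a hypothetical runs.csv with one benchmark run per row: sampled
    # option values plus a measured "write_amplification" column.
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor

    PARAMS = ["write_buffer_size", "max_write_buffer_number",
              "level0_file_num_compaction_trigger", "max_bytes_for_level_base"]

    runs = pd.read_csv("runs.csv")
    X, y = runs[PARAMS], runs["write_amplification"]

    forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

    # Higher importance -> the option explains more of the variation in write
    # amplification, so it is a better candidate for further tuning.
    for name, score in sorted(zip(PARAMS, forest.feature_importances_),
                              key=lambda p: p[1], reverse=True):
        print(f"{name:40s} {score:.3f}")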

NVM-based Write Amplification Reduction to Avoid Performance Fluctuation of Flash Storage (플래시 스토리지의 성능 지연 방지를 위한 비휘발성램 기반 쓰기 증폭 감소 기법)

  • Lee, Eunji;Jeong, Minseong;Bahn, Hyokyung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.16 no.4 / pp.15-20 / 2016
  • Write amplification is a critical factor that limits the stable performance of flash-based storage systems. To reduce write amplification, this paper presents a new technique that cooperatively manages data in flash storage and nonvolatile memory (NVM). Our scheme basically treats NVM as a cache for flash storage, but allows the original data in flash storage to be invalidated if there is a cached copy in NVM, which can temporarily serve as the original data. This scheme eliminates the copy-out operation for a substantial amount of cached data, thereby enhancing garbage collection efficiency. Experimental results show that the proposed scheme reduces the copy-out overhead of garbage collection by 51.4% and decreases the standard deviation of response time by 35.4% on average.
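
A rough sketch of the core idea: during garbage collection, a valid page in the victim block normally has to be copied out, but if an up-to-date copy already resides in the NVM cache the flash original can simply be invalidated. The structures below are hypothetical simplifications for illustration, not the paper's implementation.

    # Toy model of garbage collection with an NVM cache (hypothetical classes).
    # A valid flash page is copied out only if the NVM cache does not already
    # hold a usable copy of that logical page.
    from dataclasses import dataclass

    @dataclass
    class Page:
        lpn: int        # logical page number
        valid: bool

    def collect_victim(victim_pages, nvm_cached_lpns):
        """Return (pages copied out, copy-outs avoided thanks to NVM copies)."""
        copied, skipped = 0, 0
        for page in victim_pages:
            if not page.valid:
                continue
            if page.lpn in nvm_cached_lpns:
                skipped += 1    # the NVM copy temporarily serves as the original
            else:
                copied += 1     # must be rewritten elsewhere in flash
        return copied, skipped

    # Example: five pages in the victim block, three valid ones cached in NVM.
    victim = [Page(0, True), Page(1, True), Page(2, False), Page(3, True), Page(4, True)]
    print(collect_victim(victim, nvm_cached_lpns={0, 1, 3}))   # (1, 3)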

Demand-based FTL Cache Partitioning for Large Capacity SSDs (대용량 SSD를 위한 요구 기반 FTL 캐시 분리 기법)

  • Bae, Jinwook;Kim, Hanbyeol;Im, Junsu;Lee, Sungjin
    • IEMEK Journal of Embedded Systems and Applications / v.14 no.2 / pp.71-78 / 2019
  • As the capacity of SSDs rapidly increases, the amount of DRAM required to keep the mapping table inside an SSD becomes very large. To address this, a demand-based FTL (DFTL) scheme that caches part of the mapping entries in DRAM is considered a feasible alternative. However, owing to its unpredictable behavior, DFTL fails to provide consistent I/O response times. In this paper, we a) analyze a root cause of fluctuation in read latency and b) propose a new demand-based FTL scheme that ensures guaranteed read response time with low write amplification. By preventing mapping evictions while serving reads, the proposed technique guarantees that every host read request is completed within two NAND read operations. Moreover, with only a 25% cache ratio, the proposed scheme improves random write performance and random mixed performance by 1.65x and 1.15x, respectively, over the traditional DFTL.
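
The two-NAND-read guarantee can be sketched as follows: a mapping-cache miss costs one NAND read for the translation page plus one for the data page, and dirty victims are only queued for later write-back, so eviction never adds work to the read path. This is a hedged toy model with assumed structure, not the paper's FTL code.

    # Toy model of the "at most two NAND reads per host read" idea.
    from collections import OrderedDict

    class DemandMappingCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = OrderedDict()    # lpn -> ppn (LRU order)
            self.deferred_evictions = []    # dirty victims flushed outside reads

        def read(self, lpn, read_translation_page):
            nand_reads = 0
            if lpn not in self.entries:
                nand_reads += 1             # fetch the translation page
                self.entries[lpn] = read_translation_page(lpn)
                if len(self.entries) > self.capacity:
                    self.deferred_evictions.append(self.entries.popitem(last=False))
            self.entries.move_to_end(lpn)
            nand_reads += 1                 # read the data page itself
            return nand_reads               # always 1 or 2

    cache = DemandMappingCache(capacity=2)
    print([cache.read(lpn, read_translation_page=lambda l: 1000 + l)
           for lpn in (7, 7, 8, 9, 7)])    # [2, 1, 2, 2, 2]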

2R++: Enhancing 2R FTL to Identify Warm Pages (2R++: Warm Page 식별을 통한 2R FTL 개선)

  • An, Hyojun;Lee, Sangwon
    • KIPS Transactions on Computer and Communication Systems / v.11 no.12 / pp.419-428 / 2022
  • Since in-place updates of pages are not allowed in flash memory, every new page write must be performed out-of-place, and the overwritten old pages are invalidated. Such invalidated pages eventually trigger the costly garbage collection process. Because garbage collection causes numerous read and write operations, it is one of flash memory's major performance issues. 2R modified the garbage collection algorithm by exploiting the I/O characteristics of On-Line Transaction Processing (OLTP) workloads to improve the Write Amplification Factor. However, this algorithm suffers from a region pollution problem. Therefore, in this paper, we develop 2R++, which additionally separates pages with long access intervals to solve the region pollution problem. 2R++ introduces an extra bit per block to separate warm pages based on a second-chance mechanism, preventing warm pages from being misidentified as cold pages. We conducted experiments on TPC-C and Linkbench for performance comparison. The experiments showed that 2R++ achieved Write Amplification Factor improvements of 57.8% and 13.8% over 2R on the two workloads, respectively.
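
The per-block extra bit works like a classic second-chance scheme; the following toy sketch (a hypothetical simplification, not the 2R++ implementation) shows how a page re-accessed before a bit-clearing sweep is promoted to warm, while pages with long access intervals stay cold.

    # Second-chance warm/cold split: "SWEEP" clears the chance bits, so only
    # pages re-accessed within the window count as warm.
    def classify(events):
        bit, warm = {}, set()
        for ev in events:
            if ev == "SWEEP":
                bit.clear()         # pages not re-accessed lose their chance
            elif bit.get(ev):
                warm.add(ev)        # re-accessed before the sweep -> warm
            else:
                bit[ev] = True
        return warm

    print(classify(["a", "b", "a", "SWEEP", "c", "b", "c"]))   # {'a', 'c'}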

Write Request Handling for Static Wear Leveling in Flash Memory (SSD) Controller

  • Choo, Chang;Gajipara, Pooja;Moon, Il-Young
    • Journal of information and communication convergence engineering / v.12 no.3 / pp.181-185 / 2014
  • The lifetime of a solid-state drive (SSD) is limited by the number of program and erase cycles allowed on its NAND flash blocks. Data cannot be overwritten in an SSD, leading to an out-of-place update every time the data are modified. This results in two copies of the data: the original copy and a modified copy. This phenomenon is known as write amplification and adversely affects the endurance of the memory. In this study, we address wear leveling through efficient handling of write requests. This results in even wearing of all the blocks, thereby extending the endurance period. The focus of our work is to logically divert write requests that are concentrated on a few blocks to less-worn blocks, and then measure the maximum number of write requests that the memory can handle. A memory without the proposed algorithm wears out prematurely compared to one with the algorithm. The main feature of the proposed algorithm is to delay out-of-place updates until a threshold is reached, which results in low overhead. Further, the algorithm increases endurance by a factor of the threshold level multiplied by the number of blocks in the memory.
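
The diversion policy can be pictured with a small sketch: a write stays on its current block until that block's erase count exceeds the least-worn block by a threshold, at which point it is redirected. The threshold value and block names below are hypothetical; this is an illustration, not the paper's controller algorithm.

    # Threshold-based write diversion for static wear leveling (toy model).
    THRESHOLD = 4

    def choose_block(erase_counts, current_block):
        coolest = min(erase_counts, key=erase_counts.get)
        if erase_counts[current_block] - erase_counts[coolest] > THRESHOLD:
            return coolest          # divert the hot write to a less-worn block
        return current_block

    erase_counts = {"blk0": 12, "blk1": 3, "blk2": 5}
    print(choose_block(erase_counts, "blk0"))   # blk1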

Analysis Model of Write Amplification Factor on Flash Memory (플래시 메모리에서 쓰기 증폭 인자 분석 모델)

  • Lee, Sang-Yup;Kim, Se-Woog;Jeon, Jeong-Ho;Choi, Jong-Moo;Yang, Joong-Seob;Mo, Yeon-Jin;Shin, Young-Kyun
    • Proceedings of the Korean Information Science Society Conference / 2011.06a / pp.551-554 / 2011
  • Owing to the characteristics of flash memory, namely the overwrite limitation and the limited number of erase cycles, flash memory performs more write operations than the write requests issued by the system. The ratio of the actual write operations to the write requests issued by the system is called the Write Amplification Factor (WAF). WAF is an important factor for both performance and reliability, and this paper proposes an analytical model that can predict it. The proposed model can predict WAF for various FTLs, including page-mapping, block-mapping, and hybrid-mapping FTLs, and it is simple in that it uses only utilization, randomness, and associativity as input parameters. Comparing the model against WAF measured in a real Linux environment, we found that the proposed model predicts WAF accurately.
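
For reference, the quantity being modeled is conventionally defined as the ratio of writes actually performed on the flash medium to the writes requested by the host, which is always at least 1:

    \mathrm{WAF} \;=\; \frac{\text{pages physically written to flash}}{\text{pages requested by the host}} \;\geq\; 1

The abstract's model takes utilization, randomness, and associativity as inputs when predicting this ratio; its exact functional form is not reproduced here.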

Metadata Log Management for Full Stripe Parity in Flash Storage Systems (플래시 저장 시스템의 Full Stripe Parity를 위한 메타데이터 로그 관리 방법)

  • Lim, Seung-Ho
    • The Journal of Korean Institute of Information Technology / v.17 no.11 / pp.17-26 / 2019
  • RAID-5 technology is one of the choices for enhancing the reliability of flash storage devices. However, RAID-5 has an inherent parity update overhead; in particular, the parity overhead for partial stripe writes is one of the crucial issues for flash-based RAID-5 technologies. In this paper, we design an efficient parity log architecture for RAID-5 to eliminate the runtime partial parity overhead. During runtime, partial parity is retained in buffer memory until the full stripe write is completed, and the parity is written together with the full stripe write. In addition, a parity log is maintained in memory until the whole stripe group has been used for data writes. With this parity log, partial parity can be recovered after a power loss. In the experiments, the parity log method eliminates the partial parity write overhead at the cost of a small number of parity log writes. Hence, it can reduce write amplification while providing the same reliability.
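
The deferred-parity idea can be sketched briefly: partial parity is accumulated in memory as the data chunks of a stripe arrive, and the parity page is programmed only once, together with the full stripe. The stripe width and data below are hypothetical; this is an illustration, not the paper's FTL code.

    # Defer parity to the full-stripe write: XOR-accumulate in memory.
    STRIPE_WIDTH = 4   # data chunks per stripe (parity excluded)

    def stripe_parity(chunks):
        """chunks: STRIPE_WIDTH equally sized byte strings of one full stripe."""
        assert len(chunks) == STRIPE_WIDTH
        parity = bytearray(len(chunks[0]))
        for chunk in chunks:                 # partial parity stays in memory
            for i, b in enumerate(chunk):
                parity[i] ^= b
        return bytes(parity)                 # written once, with the full stripe

    data = [b"\x01\x02", b"\x04\x08", b"\x10\x20", b"\x40\x80"]
    print(stripe_parity(data).hex())         # 55aa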

An Empirical Study on Linux I/O stack for the Lifetime of SSD Perspective (SSD 수명 관점에서 리눅스 I/O 스택에 대한 실험적 분석)

  • Jeong, Nam Ki;Han, Tae Hee
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.9 / pp.54-62 / 2015
  • Although NAND flash-based SSDs (Solid-State Drives) provide superior performance compared to HDDs (Hard Disk Drives), they have a major drawback in write endurance. As a result, the lifetime of an SSD is determined by the workload, which becomes a serious challenge under the current technology trend of shifting from SLC (Single-Level Cell) to MLC (Multi-Level Cell) and even TLC (Triple-Level Cell). Most previous studies have dealt with wear leveling or with improving SSD lifetime through hardware architecture. In this paper, we propose the optimal configuration of the host I/O stack, focusing on the file system, I/O scheduler, and link power management, using JEDEC enterprise workloads, in terms of the WAF (Write Amplification Factor), which captures how efficiently host writes are processed into flash memory and thus reflects SSD lifetime. Experimental analysis shows that the optimal I/O stack configuration from the perspective of SSD lifetime is MinPower-Dead-XFS, which prolongs the lifetime of the SSD approximately 2.6 times compared with MaxPower-Cfq-Ext4, the best-performing combination. Although performance is reduced by 13%, this contribution demonstrates a considerable aspect of SSD lifetime in relation to I/O stack optimization.
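
The three knobs studied here (file system, I/O scheduler, SATA link power management) are all host-side settings; the sketch below only reads the scheduler and link-power policy from the usual Linux sysfs paths. Device names are examples, paths may vary by kernel and platform, and this is an inspection aid rather than the paper's experimental setup.

    # Inspect the I/O scheduler and SATA link power policy via sysfs (Linux).
    from pathlib import Path

    def show_io_stack(dev="sda"):
        sched = Path(f"/sys/block/{dev}/queue/scheduler")
        print("I/O scheduler:", sched.read_text().strip() if sched.exists() else "n/a")
        for host in sorted(Path("/sys/class/scsi_host").glob("host*")):
            policy = host / "link_power_management_policy"
            if policy.exists():
                # e.g. "min_power" vs "max_performance" (the MinPower/MaxPower axis)
                print(host.name, "link power:", policy.read_text().strip())

    show_io_stack()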

Block Allocation Method for Efficiently Managing Temporary Files of Hash Joins on SSDs (SSD상에서 해시조인 임시 파일의 효과적인 관리를 위한 블록 할당 방법)

  • Kim, Joontae;Lee, Sangwon
    • KIPS Transactions on Computer and Communication Systems / v.11 no.12 / pp.429-436 / 2022
  • Temporary files are generated when a hash join is performed on tables larger than memory. During the join process, each temporary file is deleted sequentially after its I/O operations are complete. This paper reveals that the fallocate system call and the trim options related to file deletion significantly impact hash join performance when temporary files are managed on SSDs rather than hard disks. The experiments were conducted on various commercial and research SSDs using PostgreSQL, a representative open-source database. We find that join performance can be improved by up to 3 to 5 times compared to the default combination, depending on whether the fallocate and trim options are used for temporary files. In addition, we investigate the write amplification and trim command overhead inside the SSD according to the combination of the two options for temporary files.
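
The two options examined above can be exercised from any program that creates temporary files; the minimal sketch below preallocates space with fallocate through Python's standard os.posix_fallocate binding (Linux). The path and size are hypothetical, and whether the final deletion issues TRIM commands depends on the filesystem's discard settings, which is the second knob the paper studies.

    # Preallocate a hash-join temporary file, then delete it.
    import os

    path = "/tmp/hashjoin_batch_0.tmp"       # hypothetical temporary file
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.posix_fallocate(fd, 0, 64 * 1024 * 1024)   # reserve 64 MiB up front
    finally:
        os.close(fd)
    os.unlink(path)                           # deletion may trigger TRIM/discard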

Optimizing Garbage Collection Overhead of Host-level Flash Translation Layer for Journaling Filesystems

  • Son, Sehee;Ahn, Sungyong
    • International Journal of Internet, Broadcasting and Communication / v.13 no.2 / pp.27-35 / 2021
  • NAND flash memory-based SSDs need internal software, the Flash Translation Layer (FTL), to provide the traditional block device interface to the host because of physical constraints such as erase-before-write and the large erase block. However, because useful host-side information cannot be delivered to the FTL through the narrow block device interface, SSDs suffer from a variety of problems such as increased garbage collection overhead, large tail latency, and unpredictable I/O latency. In contrast, a new type of SSD, the open-channel SSD, exposes the internal structure of the SSD to the host so that the underlying NAND flash memory can be managed directly by a host-level FTL. In particular, classifying I/O data using host-side information can reduce garbage collection overhead. In this paper, we propose a new scheme that reduces the garbage collection overhead of open-channel SSDs by separating the journal from other file data for journaling file systems. Because the journal has a different lifespan from other file data, the Write Amplification Factor (WAF) caused by garbage collection can be reduced. The proposed scheme is implemented by modifying the host-level FTL of Linux and evaluated with both Fio and Filebench. According to the experimental results, the proposed scheme improves I/O performance by 46%~50% while reducing the WAF of open-channel SSDs by more than 33% compared to the previous scheme.
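
The journal/data separation can be sketched with a toy stream-separating allocator: writes tagged as journal go to their own open flash block, so short-lived journal pages are not mixed with longer-lived file data and tend to become invalid together. The class and tags below are hypothetical, not the paper's modified Linux host-level FTL.

    # Toy stream separation: journal writes and data writes fill different blocks.
    from collections import defaultdict

    class StreamSeparatingFTL:
        def __init__(self):
            self.open_blocks = defaultdict(list)   # stream -> pages in its block

        def write(self, lpn, is_journal):
            stream = "journal" if is_journal else "data"
            self.open_blocks[stream].append(lpn)
            return stream

    ftl = StreamSeparatingFTL()
    for lpn, is_journal in [(10, True), (11, False), (12, True), (13, False)]:
        ftl.write(lpn, is_journal)
    print(dict(ftl.open_blocks))   # {'journal': [10, 12], 'data': [11, 13]}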