• Title/Summary/Keyword: Hardware Compression

Energy Efficient and Low-Cost Server Architecture for Hadoop Storage Appliance

  • Choi, Do Young;Oh, Jung Hwan;Kim, Ji Kwang;Lee, Seung Eun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.12 / pp.4648-4663 / 2020
  • This paper proposes a Lempel-Ziv 4 (LZ4) compression accelerator optimized for scale-out servers in data centers. To reduce the CPU load caused by compression, we propose an accelerator solution and implement it on a field-programmable gate array (FPGA) as a form of heterogeneous computing. The LZ4 compression hardware accelerator is fully pipelined and uses 16 dictionaries to increase parallelism and throughput. It is based on a 20-stage pipeline and a dictionary architecture, both highly customized to the LZ4 compression algorithm and to parallel hardware implementation. The proposed dictionary architecture achieves higher throughput than a single dictionary by comparing input sequences against multiple dictionaries simultaneously. The experimental results show high throughput from a design intensively optimized for the FPGA. Additionally, we compare our implementation with CPU implementations of LZ4 to provide insights for FPGA-based data centers. The proposed accelerator achieves a compression throughput of 639 MB/s with fine-grained parallelism, making it suitable for deployment in scale-out servers. This approach enables a low-power Intel Atom processor, together with the compression accelerator, to serve as Hadoop storage.

Multi-threaded system to support reconfigurable hardware accelerators on Zynq SoC (Zynq SoC에서 재구성 가능한 하드웨어 가속기를 지원하는 멀티쓰레딩 시스템 설계)

  • Shin, Hyeon-Jun;Lee, Joo-Heung
    • Journal of IKEEE / v.24 no.1 / pp.186-193 / 2020
  • In this paper, we propose a multi-threading system to support reconfigurable hardware accelerators on the Zynq SoC. We implement a high-performance JPEG decoder with reconfigurable 2D IDCT hardware accelerators to achieve the maximum performance available on the platform. In this system, up to four reconfigurable hardware accelerators, synchronized with software threads, can be dynamically reconfigured to provide adaptive computing capabilities according to the given image resolution and compression ratio. JPEG decoding is evaluated on images with resolutions of 480p, 720p, and 1080p at compression ratios from 7:1 to 109:1. We show that significant performance improvements are achieved as the image resolution or the compression ratio increases. For 1080p resolution, the performance improvement is up to 79.11 times, with a throughput of 99 fps at a compression ratio of 17:1.

A Consistent Quality Bit Rate Control for the Line-Based Compression

  • Ham, Jung-Sik;Kim, Ho-Young;Lee, Seong-Won
    • IEIE Transactions on Smart Processing and Computing / v.5 no.5 / pp.310-318 / 2016
  • Emerging technologies such as the Internet of Things (IoT) and the Advanced Driver Assistant System (ADAS) often have image transmission functions with tough constraints, like low power and/or low delay, which require that they adopt line-based, low memory compression methods instead of existing frame-based image compression standards. Bit rate control in the conventional frame-based compression systems requires a lot of hardware resources when the scope of handled data falls at the frame level. On the other hand, attempts to reduce the heavy hardware resource requirement by focusing on line-level processing yield uneven image quality through the frame. In this paper, we propose a bit rate control that maintains consistency in image quality through the frame and improves the legibility of text regions. To find the line characteristics, the proposed bit rate control tests each line for ease of compression and the existence of text. Experiments on the proposed bit rate control show peak signal-to-noise ratios (PSNRs) similar to those of conventional bit rate controls, but with the use of significantly fewer hardware resources.

Optimal Selection of Wavelet Coefficients for Electrocardiograph Compression

  • Del Mar Elena, Maria;Quero, Jose Manuel;Borrego, Inmaculada
    • ETRI Journal / v.29 no.4 / pp.530-532 / 2007
  • This paper presents a simple method to implement a complete on-line portable wireless Holter, including electrocardiogram (ECG) monitoring, processing, and a communication protocol. The proposed algorithm significantly reduces the hardware resources needed for threshold estimation in ECG compression by using a standard deviation that is updated with each new input signal sample. The new method achieves superior performance in terms of hardware complexity, channel occupation, and memory requirements, while keeping the ECG quality at a clinically acceptable level.
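
As a rough illustration of the thresholding idea in the abstract above, the sketch below keeps a standard deviation that is updated with every new sample and derives a threshold from it. It uses Welford's online update, which is a technique chosen here for illustration and not necessarily the paper's estimator. The class name `RunningThreshold` and the scale factor `k` are hypothetical.

```python
import math


class RunningThreshold:
    """Threshold = k * (running population standard deviation).

    Assumption: Welford's online algorithm stands in for the paper's
    estimator, which the abstract does not specify.
    """

    def __init__(self, k: float = 1.0) -> None:
        self.k = k        # hypothetical scale factor
        self.n = 0        # number of samples seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations

    def update(self, x: float) -> float:
        """Consume one sample and return the updated threshold."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        std = math.sqrt(self.m2 / self.n)
        return self.k * std


def final_threshold(samples, k: float = 1.0) -> float:
    """Feed all samples through the estimator and return the last threshold."""
    estimator = RunningThreshold(k)
    threshold = 0.0
    for sample in samples:
        threshold = estimator.update(sample)
    return threshold
```

Each update needs only a few additions, multiplications, and one division, with no buffer of past samples, which is the kind of property that keeps memory requirements low.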

Improvement of Image Sensor Performance through Implementation of JPEG2000 H/W for Optimal DWT Decomposition Level

  • Lee, Choel;Kim, BeomSu;Jeon, ByungKook
    • International journal of advanced smart convergence / v.6 no.1 / pp.68-75 / 2017
  • This paper presents the design and implementation of JPEG2000 hardware for applications that require high resolution and high compression, such as digital photography, remote sensing, remote aerial imaging, and medical imaging. A software implementation of JPEG2000 image compression has the disadvantage of being much slower than conventional JPEG, and processing speed degrades further when the level of the JPEG2000 DWT (discrete wavelet transform) is raised to improve compression of the image data. To solve these problems, JPEG2000 compression and decompression units were designed and applied. By changing the DWT level in the JPEG2000 compression/decompression hardware for still image compression, the paper shows improvements in computation speed and quality.

The Cooperative Parallel X-Match Data Compression Algorithm (협동 병렬 X-Match 데이타 압축 알고리즘)

  • 윤상균
    • Journal of KIISE:Computer Systems and Theory / v.30 no.10 / pp.586-594 / 2003
  • The X-Match algorithm is a lossless compression algorithm that is well suited to hardware implementation owing to its simplicity. It can compress 32 bits per clock cycle and is suitable for real-time compression. However, as the bus width increases to 64 bits, the compression unit also needs to increase. This paper proposes the cooperative parallel X-Match (X-MatchCP) algorithm, which improves compression speed by running two X-Match algorithms in parallel. It searches the entire dictionary for two words, combines the compression codes generated for the two words by parallel X-Match compression, and outputs the combined code, whereas the previous parallel X-Match algorithm searches an individual dictionary. The compression ratio of X-MatchCP is almost the same as that of X-Match. The X-MatchCP algorithm is described and simulated in the Verilog hardware description language.

High-Speed Intra Prediction VLSI Implementation for HEVC (HEVC 용 고속 인트라 예측 VLSI 구현)

  • Jo, Hyeonsu;Hong, Youpyo;Jang, Hanbeyoul
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.11 / pp.1502-1506 / 2016
  • HEVC (High Efficiency Video Coding) is a recently proposed video compression standard with twice the coding efficiency of previous video compression standards. The various types of block partitions and intra prediction modes in HEVC are the key factors behind both its high compression performance and its increased computational complexity. This paper presents an intra prediction hardware architecture for HEVC that uses pipelining and interleaving techniques to increase efficiency and performance while reducing the hardware resources required.

Data compression algorithm with two-byte codeword representation (2바이트 코드워드 표현방법에 의한 자료압축 알고리듬)

  • 양영일;김도현
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.3 / pp.23-36 / 1997
  • In this paper, a new data model for the hardware implementation of the Lempel-Ziv compression algorithm is proposed. The traditional model generates a codeword consisting of 3 bytes: the last symbol, the position, and the matched length. The MSB (most significant bit) of the last symbol is the compression flag, and the remaining seven bits represent the character. We confine the value of the matched length to 128 instead of 256, so that it can be coded with only seven bits. In the proposed model, the codeword consists of 2 bytes: the merged symbol and the position. The MSB of the merged symbol is the compression flag. The remaining seven bits represent either the character or the matched length, according to the value of the compression flag. The proposed model reduces the compression ratio by 5% compared with the traditional model. The proposed model can be adopted in existing hardware architectures. The factors that increase the compression ratio are also analyzed in this paper.
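
The two-byte codeword layout described in the abstract above can be sketched as simple bit packing. This is only an illustration under stated assumptions: a flag value of 1 is taken to mean "matched length", the position is taken to fit in one byte, and the matched length is stored directly in seven bits (0 to 127). The abstract does not state any of these details, and the function names `pack_codeword` and `unpack_codeword` are hypothetical.

```python
def pack_codeword(is_match: bool, value: int, position: int) -> bytes:
    """Pack one codeword into 2 bytes: merged symbol, then position.

    Merged symbol: MSB = compression flag, low 7 bits = character
    (flag clear) or matched length (flag set). The flag polarity and
    the 8-bit position width are assumptions made for this sketch.
    """
    if not 0 <= value < 128:
        raise ValueError("character or matched length must fit in 7 bits")
    if not 0 <= position < 256:
        raise ValueError("position must fit in one byte")
    merged = (0x80 if is_match else 0x00) | value
    return bytes([merged, position])


def unpack_codeword(codeword: bytes) -> tuple[bool, int, int]:
    """Return (is_match, value, position) from a 2-byte codeword."""
    if len(codeword) != 2:
        raise ValueError("codeword must be exactly 2 bytes")
    merged, position = codeword[0], codeword[1]
    return bool(merged & 0x80), merged & 0x7F, position
```

For example, `pack_codeword(True, 5, 17)` yields the bytes `0x85 0x11`, and `unpack_codeword` recovers the three fields from them.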

A Twin Symbol Encoding Technique Based on Run-Length for Efficient Test Data Compression

  • Park, Jae-Seok;Kang, Sung-Ho
    • ETRI Journal / v.33 no.1 / pp.140-143 / 2011
  • Recent test data compression techniques raise concerns regarding power dissipation and compression efficiency. This letter proposes a new test data compression scheme, twin symbol encoding, that supports block division techniques that can reduce hardware overhead. Our experimental results show that the proposed technique achieves both a high compression ratio and low power dissipation. Therefore, the proposed scheme is an attractive solution for efficient test data compression.

Sparse Matrix Compression Technique and Hardware Design for Lightweight Deep Learning Accelerators (경량 딥러닝 가속기를 위한 희소 행렬 압축 기법 및 하드웨어 설계)

  • Kim, Sunhee;Shin, Dongyeob;Lim, Yong-Seok
    • Journal of Korea Society of Digital Industry and Information Management / v.17 no.4 / pp.53-62 / 2021
  • Deep learning models such as convolutional neural networks and recurrent neural networks process huge amounts of data, so they require a lot of storage and consume a lot of time and power due to memory access. Recently, research has been conducted on reducing memory usage and access by compressing data, exploiting the fact that much deep learning data is highly sparse and localized. In this paper, we propose a compression-decompression method that stores only the non-zero data and the location information of the non-zero data, excluding zero data. To generate the location information, the matrix data is divided uniformly into sections, and whether each section contains non-zero data is indicated. Section division is not executed only once but repeatedly, and location information is stored at each step. Therefore, the data can be compressed appropriately according to the ratio and distribution of zero data. In addition, we propose a hardware structure that enables compression and decompression without complex operations. It was designed and verified in Verilog, and it was confirmed that it can be used in hardware deep learning accelerators.
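
The repeated section division described in the abstract above can be sketched in software as a hierarchical bitmap. This is a minimal illustration and not the paper's hardware design. It assumes the matrix has been flattened to a one-dimensional list whose length is a power of `fanout`, and it emits flags in depth-first order. The names `compress`, `decompress`, and `fanout` are hypothetical.

```python
def _check_length(n: int, fanout: int) -> None:
    """Require n to be a positive power of fanout (assumption of this sketch)."""
    if n < 1 or fanout < 2:
        raise ValueError("need length >= 1 and fanout >= 2")
    while n > 1:
        if n % fanout:
            raise ValueError("length must be a power of fanout")
        n //= fanout


def compress(data, fanout=4):
    """Return (flags, values): per-section non-zero flags and non-zero data."""
    _check_length(len(data), fanout)
    flags, values = [], []

    def visit(lo, hi):
        if hi - lo == 1:            # single element reached: store its value
            values.append(data[lo])
            return
        step = (hi - lo) // fanout  # divide the section uniformly
        bounds = [(lo + i * step, lo + (i + 1) * step) for i in range(fanout)]
        present = [any(data[a:b]) for a, b in bounds]
        flags.extend(int(p) for p in present)  # location info for this step
        for (a, b), p in zip(bounds, present):
            if p:                   # only non-zero sections are divided again
                visit(a, b)

    visit(0, len(data))
    return flags, values


def decompress(flags, values, length, fanout=4):
    """Rebuild the original list by replaying the same traversal."""
    _check_length(length, fanout)
    out = [0] * length
    flag_iter, value_iter = iter(flags), iter(values)

    def visit(lo, hi):
        if hi - lo == 1:
            out[lo] = next(value_iter)
            return
        step = (hi - lo) // fanout
        present = [next(flag_iter) for _ in range(fanout)]
        for i, p in enumerate(present):
            if p:
                visit(lo + i * step, lo + (i + 1) * step)

    visit(0, length)
    return out
```

With this layout an all-zero section costs a single flag bit regardless of its size, so the output shrinks as zeros become more frequent and more clustered.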