Design of High Performance Multi-mode 2D Transform Block for HEVC

HEVC를 위한 고성능 다중 모드 2D 변환 블록의 설계

  • Kim, Ki-Hyun (Department of Information and Communication Engineering, Hanbat National University) ;
  • Ryoo, Kwang-Ki (Department of Information and Communication Engineering, Hanbat National University)
  • Received : 2013.12.03
  • Accepted : 2014.01.23
  • Published : 2014.02.28

Abstract

This paper proposes a hardware architecture for a high-performance multi-mode 2D forward transform for HEVC that processes each of the four TU sizes in the same number of cycles and achieves high throughput. To compress high-resolution source images efficiently, the HEVC transform supports four transform unit (TU) sizes and selects the optimal mode after performing the transform computation for each of them. Because the proposed transform engine uses common computation operators derived by analyzing the relationships among the transform matrix coefficients, it processes the matrix operation of every one of the four TU modes in the same 35 cycles. The proposed transform block was designed in Verilog HDL and synthesized using a TSMC 0.18um CMOS process library. According to the logic synthesis results, for a $4k(3840{\times}2160)@30fps$ image the maximum operating frequency is 400MHz, the total gate count is 214k gates, and the throughput is 10-Gpels/cycle.
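The figures quoted above can be related with some simple arithmetic. The sketch below is only an estimate, not the authors' calculation: it assumes that one TU is completed every 35 cycles with no overlap between consecutive TUs, that only luma samples are counted, and that the clock runs at the reported 400MHz. The function names are hypothetical. It reports pixels per second, since the abstract does not state how its 10-Gpels figure was derived.

```python
# Rough throughput estimate from the numbers quoted in the abstract.
# Assumptions (not from the paper): one TU finishes every 35 cycles with no
# pipelining overlap between TUs, and only luma samples are counted.

CLOCK_HZ = 400_000_000   # maximum operating frequency reported after synthesis
CYCLES_PER_TU = 35       # cycles per TU, identical for all four TU sizes
TU_SIZES = (4, 8, 16, 32)


def required_pel_rate(width=3840, height=2160, fps=30):
    """Luma pixels per second that a 4K@30fps source delivers."""
    return width * height * fps


def tu_pel_rate(n, clock_hz=CLOCK_HZ, cycles=CYCLES_PER_TU):
    """Pixels per second when one n x n TU completes every `cycles` cycles."""
    return n * n * clock_hz / cycles


if __name__ == "__main__":
    print(f"4K@30fps source rate: {required_pel_rate() / 1e6:.1f} Mpels/s")
    for n in TU_SIZES:
        print(f"{n}x{n} TU: {tu_pel_rate(n) / 1e9:.2f} Gpels/s")
```

Under these assumptions the 32x32 mode is the one that lands in the 10-Gpels range, because it carries 1024 pixels through the same 35 cycles that a 4x4 TU needs for 16 pixels.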

본 논문에서는 4가지의 TU를 동일한 사이클에 처리하는 고성능 다중모드 2D 변환기의 하드웨어 구조를 제안한다. HEVC의 변환 기술은 고해상도, 고화소의 영상을 높은 효율로 압축하기 위해 4가지의 화소 단위 TU를 지원하여 각각의 변환 연산을 수행한 후 최적의 모드를 찾는다. 제안하는 변환기는 변환 행렬 계수들 간의 관계를 분석하여 공통 연산기를 사용한 구조로 설계하여 4가지의 TU 모드 행렬 연산을 처리하는 사이클 수가 동일하게 35cycle로 처리된다. TSMC 0.18um CMOS 공정 라이브러리를 사용해 합성한 결과 $4k(3840{\times}2160)@30fps$의 영상을 기준으로 최대 동작주파수는 400MHz이고 총 게이트 수는 214k가 소요되었으며, 10-Gpels/cycle의 처리량을 갖는다.
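To illustrate what sharing operators across transform matrix coefficients can look like, here is a minimal sketch of the well-known even-odd (butterfly) decomposition for the 4-point HEVC core transform. It is not the architecture proposed in the paper, whose common operator is not described in the abstract. The coefficient matrix is the standard HEVC 4x4 one, the function names are hypothetical, and the scaling and rounding shifts that follow each transform stage in a real encoder are omitted.

```python
# Even-odd (butterfly) decomposition of the 4-point HEVC core transform.
# Illustrative only: this is the textbook decomposition, not necessarily the
# common operator used in the paper. Stage scaling/rounding is omitted.

HEVC_DCT4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def dct4_matrix(x):
    """Direct matrix-vector product: 16 multiplications."""
    return [sum(c * v for c, v in zip(row, x)) for row in HEVC_DCT4]


def dct4_butterfly(x):
    """Same result with shared sums/differences: 6 multiplications."""
    e0, e1 = x[0] + x[3], x[1] + x[2]   # even part (symmetric rows 0 and 2)
    o0, o1 = x[0] - x[3], x[1] - x[2]   # odd part (antisymmetric rows 1 and 3)
    return [
        64 * (e0 + e1),
        83 * o0 + 36 * o1,
        64 * (e0 - e1),
        36 * o0 - 83 * o1,
    ]


def transform_2d(block, transform_1d=dct4_butterfly):
    """Separable 2D transform: 1D transform on rows, then on columns."""
    rows = [transform_1d(r) for r in block]
    cols = [transform_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    print(dct4_matrix(sample))
    print(dct4_butterfly(sample))
```

The symmetry that makes this work (even rows symmetric, odd rows antisymmetric) also holds for the 8-, 16- and 32-point HEVC matrices, which is why a single set of add/subtract and multiply operators can plausibly serve several TU sizes.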
