Integer-Pel Motion Estimation for HEVC on Compute Unified Device Architecture (CUDA)

  • Lee, Dongkyu (Department of Electronic Engineering, Kwangwoon University)
  • Sim, Donggyu (Department of Computer Engineering, Kwangwoon University)
  • Oh, Seoung-Jun (Department of Electronic Engineering, Kwangwoon University)
  • Received : 2014.02.20
  • Accepted : 2014.08.28
  • Published : 2014.12.31

Abstract

High Efficiency Video Coding (HEVC), a new video compression standard, has recently been released. HEVC provides higher coding performance than previous standards, but at the cost of a significant increase in encoding complexity, particularly in motion estimation (ME). At the same time, Graphics Processing Units (GPUs) have become considerably more powerful. This paper proposes a parallel integer-pel ME (IME) algorithm for HEVC on GPUs using the Compute Unified Device Architecture (CUDA). The proposed IME introduces concurrent parallel reduction (CPR), which performs several parallel reduction (PR) operations concurrently to solve two problems of conventional PR: low thread utilization and high thread synchronization latency. The proposed encoder reduces IME's share of the encoding workload to almost zero, at the cost of a 2.3% increase in bitrate. Considering IME alone, the proposed method is up to 172.6 times faster than the IME in the HEVC reference model.
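The two PR problems named in the abstract can be made concrete with a small CPU-side sketch. It is not the authors' CUDA kernel: it is a Python simulation in which each loop iteration stands in for one GPU thread and each halving of the stride stands in for one synchronization barrier. The function names `parallel_reduce_min` and `concurrent_reduce_min` are hypothetical, and reducing matching costs to a minimum is an assumption made for illustration, since the abstract does not say which quantity the reduction operates on.

```python
def _check_power_of_two(n):
    # Tree reduction as sketched here assumes a power-of-two length.
    if n < 1 or n & (n - 1) != 0:
        raise ValueError("length must be a power of two")


def parallel_reduce_min(values):
    """Conventional tree reduction over one list.

    Returns (minimum, active_threads_per_step). The number of active
    'threads' halves at every step, which is the utilization problem.
    """
    data = list(values)
    _check_power_of_two(len(data))
    active_per_step = []
    stride = len(data) // 2
    while stride >= 1:
        # 'Thread' t compares element t with element t + stride.
        for t in range(stride):
            data[t] = min(data[t], data[t + stride])
        active_per_step.append(stride)
        stride //= 2  # a barrier would separate the steps on a GPU
    return data[0], active_per_step


def concurrent_reduce_min(rows):
    """Reduce several equal-length lists in lockstep.

    All rows share the same barrier at each step, so the number of
    barriers stays at log2(n) while more 'threads' are busy per step.
    """
    data = [list(r) for r in rows]
    n = len(data[0])
    _check_power_of_two(n)
    if any(len(r) != n for r in data):
        raise ValueError("all rows must have the same length")
    active_per_step = []
    stride = n // 2
    while stride >= 1:
        for row in data:
            for t in range(stride):
                row[t] = min(row[t], row[t + stride])
        active_per_step.append(stride * len(data))
        stride //= 2  # one shared barrier per step for all rows
    return [row[0] for row in data], active_per_step


# Four hypothetical blocks, each with eight candidate matching costs.
costs = [
    [9, 4, 7, 1, 8, 3, 6, 5],
    [2, 9, 9, 9, 9, 9, 9, 9],
    [5, 5, 5, 5, 5, 5, 5, 0],
    [7, 6, 5, 4, 3, 2, 1, 8],
]
best, active = concurrent_reduce_min(costs)
```

Reducing the four rows one after another would need 4 × 3 = 12 barrier steps with at most 4 active threads each. The lockstep version needs 3 barrier steps with 16, 8, and 4 active threads, which is the kind of gain in utilization and synchronization cost that the abstract attributes to CPR.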
