MPEG-H 3D Audio Decoder Structure and Complexity Analysis

  • Moon, Hyeongi (School of Electrical and Electronic Engineering, Yonsei University) ;
  • Park, Young-cheol (Computer and Telecommunications Engineering Division, Yonsei University) ;
  • Lee, Yong Ju (Audio & Acoustics Research Section, Broadcasting & Media Research Laboratory, ETRI) ;
  • Whang, Young-soo (Department of Electronic and Communication Engineering, Kwandong University)
  • Received : 2016.07.25
  • Accepted : 2016.12.28
  • Published : 2017.02.28

Abstract

The primary goal of the MPEG-H 3D Audio standard is to provide immersive audio for high-resolution broadcasting services such as UHDTV. To this end, the standard integrates a wide range of technologies: encoding/decoding of multi-channel, object, and scene-based signals; rendering for 3D audio playback in various reproduction environments; and post-processing. The reference software decoder of the standard combines several modules and can operate in various modes. However, because each module is an independent executable file and the modules are executed sequentially, real-time decoding is impossible. In this paper, we build dynamic-link libraries (DLLs) from the functions of the core decoder, format converter, object renderer, and binaural renderer of the standard and integrate them to enable frame-based decoding. In addition, by measuring the computational complexity of each mode of the MPEG-H 3D Audio decoder, we provide a reference for selecting an appropriate decoding mode for various hardware platforms. The measurements show that the low complexity profile included in the Korean broadcasting standard has a computational complexity of 2.8 to 12.4 times that of the QMF synthesis operation when rendering to channel signals, and 4.1 to 15.3 times that of the QMF synthesis operation when rendering to binaural signals.
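The difference between sequential, file-based execution and frame-based decoding can be illustrated with a minimal sketch. It is not the reference software or the authors' implementation: the stage functions, the `Frame` type, and both driver functions are hypothetical stand-ins, and each "module" here is a simple gain so that the example stays self-contained.

```python
from typing import Callable, Iterable, Iterator, List

Frame = List[float]              # hypothetical: one frame of audio samples
Stage = Callable[[Frame], Frame]  # hypothetical: one decoder module


def make_gain_stage(gain: float) -> Stage:
    """Hypothetical stand-in for a decoder module; scales every sample."""
    def stage(frame: Frame) -> Frame:
        return [gain * sample for sample in frame]
    return stage


def decode_file_based(frames: Iterable[Frame], stages: List[Stage]) -> List[Frame]:
    """Each stage processes the whole signal before the next stage starts,
    as when modules run one after another as separate executables."""
    signal = list(frames)
    for stage in stages:
        signal = [stage(frame) for frame in signal]
    return signal


def decode_frame_based(frames: Iterable[Frame], stages: List[Stage]) -> Iterator[Frame]:
    """Each frame passes through every stage before the next frame is read,
    so output is available one frame at a time."""
    for frame in frames:
        for stage in stages:
            frame = stage(frame)
        yield frame


# Two placeholder stages standing in for, e.g., a core decoder and a renderer.
stages = [make_gain_stage(2.0), make_gain_stage(3.0)]
frames = [[1.0, 2.0], [3.0, 4.0]]

file_result = decode_file_based(frames, stages)
frame_result = list(decode_frame_based(frames, stages))
```

Both drivers produce the same samples; the frame-based version differs only in that the first output frame is ready after a single input frame has been consumed, which is the property real-time decoding depends on.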

Acknowledgement

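Grant : Development of UHD Realistic Broadcasting, Digital Cinema, and Signage Convergence Service Technology Supporting Ultra-High-Quality Content

Supported by : Institute for Information & communications Technology Promotion (IITP)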
