Quality Improvement of Low-Bitrate HE-AAC Encoder

Sound Quality Improvement Techniques for HE-AAC Coding at Low Bitrates

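  • 김정근 (Digital Signal Processing Laboratory, School of Electrical and Electronic Engineering, Yonsei University) ;
  • 이재성 (Digital Signal Processing Laboratory, School of Electrical and Electronic Engineering, Yonsei University) ;
  • 이태진 (Broadcasting Media Research Group, Electronics and Telecommunications Research Institute) ;
  • 강경옥 (Broadcasting Media Research Group, Electronics and Telecommunications Research Institute) ;
  • 박영철 (Computer and Telecommunications Engineering Division, Yonsei University)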
  • Published : 2008.02.29

Abstract

In this paper, we propose new techniques that improve the quality of the AAC and SBR encoders that make up a low-bitrate HE-AAC encoder. To reduce the pre-echo artifacts that often occur in transient blocks in AAC, we propose an extended Temporal Noise Shaping (eTNS) method in which the frequency range is selectively extended down to the low-frequency region. For the high-frequency region coded by the SBR encoder, tones are identified through sinusoidal modeling and their frequencies are moved to the center of the QMF band in order to reduce the noise floor caused by aliasing. To evaluate the proposed algorithm, spectrograms of the decoded signals were compared and listening tests were conducted. The results showed improved quality over standard HE-AAC.
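TNS rests on the fact that a transient in the time domain makes the spectral coefficients predictable along the frequency axis, so a linear predictor run over frequency yields a high prediction gain. The following is a minimal sketch of how an encoder might measure that gain over a candidate frequency range and decide whether to extend the TNS range downward. The abstract does not state the selection criterion, so the function names, the predictor order, and the threshold of 1.4 are illustrative assumptions, not the authors' method.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation of x for lags 0..max_lag."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]


def prediction_gain(coeffs, order=4):
    """Prediction gain of an LPC fit along frequency (Levinson-Durbin).

    Returns signal energy divided by residual energy; values well above 1
    mean the coefficients are predictable, as they are for a transient.
    """
    r = autocorrelation(coeffs, order)
    if r[0] <= 0.0:
        return 1.0  # silent range: nothing to predict
    err = r[0]
    a = [0.0] * (order + 1)  # a[1..m] hold the predictor coefficients
    for m in range(1, order + 1):
        acc = r[m] - sum(a[j] * r[m - j] for j in range(1, m))
        k = acc / err  # reflection coefficient of stage m
        new_a = a[:]
        new_a[m] = k
        for j in range(1, m):
            new_a[j] = a[j] - k * a[m - j]
        a = new_a
        err *= (1.0 - k * k)
        if err <= 0.0:
            return float("inf")  # perfectly predictable
    return r[0] / err


def choose_tns_start(spectrum, default_start, extended_start, stop,
                     threshold=1.4, order=4):
    """Hypothetical rule: extend the range down to extended_start only if
    the extended range is still predictable enough."""
    gain = prediction_gain(spectrum[extended_start:stop], order)
    return extended_start if gain > threshold else default_start
```

With a rule of this kind, stationary frames keep the default start bin, and only frames whose low-frequency coefficients are also predictable receive the extended range.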

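This paper proposes effective low-bitrate improvements for the AAC encoder and the SBR (Spectral Band Replication) encoder, which handle the low-frequency and high-frequency bands, respectively, of the HE-AAC (High Efficiency Advanced Audio Coding) audio encoder. To reduce pre-echo where transient signals occur in the low-frequency band handled by the AAC encoder, we devised eTNS (extended Temporal Noise Shaping), in which the frequency range of application is selectively extended toward the low-frequency band. In addition, so that no additional noise floor is generated when tonal components are reconstructed in the high-frequency band coded by SBR, tones are identified in advance through a sinusoidal model and the frequencies of the identified tones are relocated to the center of the QMF band, which improves performance. Subjective and objective quality evaluations of sample audio decoded with the proposed methods showed improved results compared with standard HE-AAC.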
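The relocation of a detected tone to the center of its QMF band can be illustrated with a short calculation. This is a sketch only: it assumes a uniform 64-band QMF bank spanning 0 Hz to the Nyquist frequency, as is typical for SBR, and the function name and interface are hypothetical rather than taken from the paper.

```python
def qmf_band_center(freq_hz, sample_rate_hz, num_bands=64):
    """Return (band index, center frequency in Hz) of the QMF band that
    contains freq_hz, assuming num_bands uniform bands over 0..Nyquist."""
    nyquist = sample_rate_hz / 2.0
    if not 0.0 <= freq_hz <= nyquist:
        raise ValueError("frequency must lie between 0 and the Nyquist frequency")
    band_width = nyquist / num_bands
    # A tone exactly at Nyquist is assigned to the last band.
    band = min(int(freq_hz // band_width), num_bands - 1)
    return band, (band + 0.5) * band_width


# A 10 kHz tone at a 48 kHz sampling rate falls in band 26 (375 Hz wide),
# whose center is 9937.5 Hz.
band, center = qmf_band_center(10000.0, 48000.0)
```

A tone near a band edge leaks into the neighboring band, so moving it to the band center keeps its energy in a single band. The frequency shift is at most half a band width (187.5 Hz in this example).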
