High-Quality Multi-Channel Audio System for Karaoke Using DSP

  • Published: 2009.01.31

Abstract

This paper describes the implementation of a multi-channel live karaoke system. Using the TMS320C6713, a 32-bit floating-point DSP from Texas Instruments, six-channel MP3 decoding and tempo/key scaling were implemented in real time. The six channels consist of front L/R instruments, rear L/R instruments, melody, and woofer; in 4-channel operation, drum L/R channels can be added in place of the rear L/R channels. The final output is mapped to a 5.1-channel speaker layout. Tempo scaling uses the SOLA (Synchronized Overlap and Add) algorithm, and key scaling is performed in the time domain by interpolation and decimation. When drum channels are added, they are separated from the other instruments: the drum channels are excluded from key scaling, and the frame size, which is the unit of SOLA processing, is set differently so that high-quality tempo scaling is possible. The implementation was optimized for real-time processing. The six channels allow a variety of channel configurations, and the multi-channel audio system presented here can be applied effectively wherever high-quality live accompaniment is needed.
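The tempo and key scaling steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's DSP implementation: the function names (`resample_linear`, `sola`, `key_shift`), the frame, hop, and search sizes, the linear interpolator, and the choice to combine SOLA with resampling so that a key change keeps the original duration are all assumptions made for illustration. The sketch also omits the anti-aliasing filter that decimation normally requires.

```python
import math


def resample_linear(x, ratio):
    """Read x at a step of `ratio` samples using linear interpolation.

    ratio > 1 drops samples (decimation-like); ratio < 1 adds samples
    (interpolation). No anti-aliasing filter is applied in this sketch.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if not x:
        return []
    n_out = int((len(x) - 1) / ratio) + 1
    out = []
    for i in range(n_out):
        pos = i * ratio
        k = min(int(pos), len(x) - 1)
        frac = pos - k
        nxt = x[k + 1] if k + 1 < len(x) else x[k]
        out.append((1.0 - frac) * x[k] + frac * nxt)
    return out


def _ncc(a, b):
    """Normalized cross-correlation of two equal-length sequences."""
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0.0 else 0.0


def sola(x, alpha, frame=400, hop=200, search=50):
    """Time-stretch x so its duration becomes roughly alpha * len(x).

    Frames are taken every `hop` input samples and placed about every
    alpha * hop output samples. Each frame is shifted by up to `search`
    samples to the position where it best matches the existing output,
    then cross-faded over the overlapping region.
    """
    ss = int(round(hop * alpha))  # synthesis hop (hypothetical parameterization)
    if ss <= 0 or ss + 2 * search >= frame:
        raise ValueError("alpha out of range for this frame/hop/search")
    y = list(x[:frame])
    m = 1
    while m * hop + frame <= len(x):
        seg = x[m * hop:m * hop + frame]
        best_k, best_score = None, -2.0
        for k in range(-search, search + 1):
            p = m * ss + k          # candidate output position
            ov = len(y) - p         # overlap with existing output
            if p < 0 or ov <= 0 or ov > frame:
                continue
            score = _ncc(y[p:], seg[:ov])
            if score > best_score:
                best_k, best_score = k, score
        p = m * ss + best_k
        ov = len(y) - p
        for i in range(ov):         # linear cross-fade over the overlap
            w = (i + 1) / (ov + 1)
            y[p + i] = (1.0 - w) * y[p + i] + w * seg[i]
        y.extend(seg[ov:])
        m += 1
    return y


def key_shift(x, semitones, **sola_args):
    """Shift pitch by `semitones` while keeping the duration roughly fixed.

    Assumption: stretch with SOLA by the pitch ratio, then resample by
    the same ratio; the paper's exact pipeline may differ.
    """
    ratio = 2.0 ** (semitones / 12.0)
    return resample_linear(sola(x, ratio, **sola_args), ratio)
```

Under these assumptions, `sola(x, 1.25)` lengthens a channel by about 25 percent, and `key_shift(x, 2)` raises it by two semitones at roughly the original length. A drum channel would skip `key_shift` and could pass a different `frame` value to `sola`, in line with the abstract; the specific sizes used here are placeholders rather than values from the paper.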
