Non-uniform Linear Microphone Array Based Source Separation for Conversion from Channel-based to Object-based Audio Content
  • Journal title: Journal of Broadcast Engineering
  • Volume 21, Issue 2, 2016, pp. 169-179
  • Publisher: The Korean Institute of Broadcast and Media Engineers
  • DOI: 10.5909/JBE.2016.21.2.169
 Authors
Chun, Chan Jun; Kim, Hong Kook
Recently, MPEG-H has been standardized as a multimedia coding framework for UHDTV (Ultra-High-Definition TV). Consequently, demand is growing not only for channel-based audio content but also for object-based audio content, which motivates the development of techniques for converting channel-based audio content into object-based content. In this paper, a source separation method based on a non-uniform linear microphone array is proposed to realize such conversion. The proposed method first analyzes the arrival time differences of the input audio sources at each microphone, and then estimates the spectral magnitude of each sound source in each horizontal direction based on the analyzed time differences. To demonstrate the effectiveness of the proposed method, its objective performance measures are compared with those of conventional methods, namely an MVDR (Minimum Variance Distortionless Response) beamformer and an ICA (Independent Component Analysis) method. The results show that the proposed method achieves better separation performance than the conventional methods.
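The abstract does not detail the separation algorithm itself, but the described pipeline (analysis of inter-microphone arrival time differences followed by per-direction spectral magnitude estimation) can be illustrated with a minimal direction-based time-frequency masking sketch. The array geometry in MIC_POS, the STFT parameters, and the binary masking rule in separate() below are illustrative assumptions, not the authors' implementation.

    # Minimal sketch of direction-based time-frequency masking with a
    # non-uniform linear microphone array. Geometry, STFT settings, and
    # the masking rule are assumptions made for illustration only.
    import numpy as np

    C = 343.0            # speed of sound (m/s)
    FS = 48000           # sampling rate (Hz), assumed
    N_FFT = 1024         # STFT size, assumed
    HOP = N_FFT // 2
    MIC_POS = np.array([0.00, 0.05, 0.15, 0.30])   # non-uniform positions (m), assumed

    def stft(x):
        """Hann-windowed STFT; returns an array of shape (frames, bins)."""
        win = np.hanning(N_FFT)
        n_frames = 1 + (len(x) - N_FFT) // HOP
        frames = np.stack([x[i * HOP:i * HOP + N_FFT] * win for i in range(n_frames)])
        return np.fft.rfft(frames, axis=1)

    def doa_per_bin(X, f_axis, ref=0, other=1):
        """Estimate a horizontal direction for every time-frequency bin from the
        phase difference of one microphone pair (far-field assumption). Above the
        spatial-aliasing frequency c/(2*d) the phase wraps, which is why a real
        system would combine several pairs or use wideband TDOA estimation."""
        d = MIC_POS[other] - MIC_POS[ref]
        phase = np.angle(X[other] * np.conj(X[ref]))        # (frames, bins)
        tau = phase / (2.0 * np.pi * f_axis)                # per-bin time difference (s)
        sin_theta = np.clip(C * tau / d, -1.0, 1.0)
        return np.degrees(np.arcsin(sin_theta))             # (frames, bins)

    def separate(mics, target_deg, width_deg=15.0):
        """Binary-mask the reference channel toward one horizontal direction.
        `mics` is a list of equal-length 1-D signals, one per microphone."""
        X = [stft(m) for m in mics]
        f_axis = np.fft.rfftfreq(N_FFT, 1.0 / FS)
        f_axis[0] = 1e-9                                    # avoid division by zero at DC
        theta = doa_per_bin(X, f_axis)
        mask = np.abs(theta - target_deg) < width_deg       # keep bins near the target DOA
        return X[0] * mask                                  # masked spectrogram of one source

Each separated spectrogram can then be inverse-transformed and treated as an audio object with its estimated direction as metadata, which is the conversion from channel-based to object-based content that the paper targets.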
Sound source separation; channel-based audio content; object-based audio content; non-uniform linear microphone array; frequency-dependent source separation
 References
J. Herre, J. Hilpert, A. Kuntz, and J. Plogsties, “MPEG-H 3D audio—the new standard for coding of immersive spatial audio,” IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 5, pp. 770-779, Aug. 2015.

J. Herre, J. Hilpert, A. Kuntz, and J. Plogsties, “MPEG-H audio—the new standard for universal spatial/3D audio coding,” Journal of the Audio Engineering Society, vol. 62, no. 12, pp. 821-830, Dec. 2014.

S. Makino, T.-W. Lee, and H. Sawada, Blind Speech Separation, Springer, Netherlands, 2007.

A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis, John Wiley & Sons, Inc., Canada, 2001.

D. F. Rosenthal and H. G. Okuno, Computational Auditory Scene Analysis, LEA Publishers, Mahwah, NJ, 1998.

H. Cox, R. M. Zeskind, and M. M. Owen, “Robust adaptive beamforming,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 35, no. 10, pp. 1365-1375, Oct. 1987.

O. Yilmaz and S. Rickard, “Blind separation of speech mixtures via time-frequency masking,” IEEE Transactions on Signal Processing, vol. 52, no. 7, pp. 1830-1847, July 2004.

H. Adel, M. Souad, A. Alaqeeli, and A. Hamid, “Beamforming techniques for multichannel audio signal separation,” International Journal of Digital Content Technology and its Applications, vol. 6, no. 20, pp. 659-667, Nov. 2012.

D. Barry, B. Lawlor, and E. Coyle, "Sound source separation: azimuth discrimination and resynthesis," in Proceedings of International Conference on Digital Audio Effects (DAFX-04), pp. 1-5, Naples, Italy, Oct. 2004.

C. J. Chun and H. K. Kim, "Sound source separation using interaural intensity difference in real environments," in Proceedings of 135th Audio Engineering Society (AES) Convention, Preprint 8976, New York, NY, Oct. 2013.

E. Vincent, R. Gribonval, and C. Fevotte, “Performance measurement in blind audio source separation,” IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 4, pp. 1462-1469, July 2006.

A. L. Casanovas, G. Monaci, P. Vandergheynst, and R. Gribonval, “Blind audiovisual source separation based on sparse redundant representations,” IEEE Transactions on Multimedia, vol. 12, no. 5, pp. 358-371, Aug. 2010.

J. Benesty, J. Chen, and Y. Huang, Microphone Array Signal Processing, Springer, Berlin, Germany, 2008.

M. Brandstein and D. Ward, Microphone Arrays: Signal Processing Techniques and Applications, Springer, Berlin, Germany, 2001.

A. Hyvärinen and E. Oja, “A fast fixed-point algorithm for independent component analysis,” Neural Computation, vol. 9, no. 7, pp. 1483-1492, Oct. 1997.

J. Breebaart and C. Faller, Spatial Audio Processing: MPEG Surround and Other Applications, John Wiley & Sons, Ltd., Chichester, UK, 2007.

J. Dmochowski, J. Benesty, and S. Affes, “On spatial aliasing in microphone arrays,” IEEE Transactions on Signal Processing, vol. 57, no. 4, pp. 1383-1395, Apr. 2009.