Vocal separation method using weighted β-order minimum mean square error estimation based on kernel back-fitting

Cho, Hye-Seung; Kim, Hyoung-Gook

Abstract
In this paper, we propose a vocal separation method that uses weighted $\beta$-order minimum mean square error estimation (WbE) based on the kernel back-fitting algorithm. In speech enhancement, WbE is well known to outperform existing Bayesian estimators, such as the minimum mean square error (MMSE) estimator of the short-time spectral amplitude (STSA) and the MMSE estimator of the logarithm of the STSA (LSA), in terms of both objective and subjective measures. In the proposed method, WbE is applied to a basic iterative kernel back-fitting algorithm to improve the performance of vocal separation from monaural music signals. The experimental results show that the proposed method achieves better separation performance than other existing methods.
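As a rough illustration of the iterative back-fitting loop mentioned above, the following Python sketch alternates a separation step and a kernel-based re-estimation step on a power spectrogram. It is not the authors' implementation: the function name `kernel_backfit`, the kernel shapes, the number of iterations, and the use of median filtering are assumptions made for illustration, and a simple Wiener-like gain stands in for the WbE gain, whose closed form is not given in this abstract.

```python
import numpy as np
from scipy.ndimage import median_filter


def kernel_backfit(mix_power, footprints, n_iter=3, eps=1e-12):
    """Toy kernel back-fitting on a power spectrogram (frequency x time).

    footprints: one boolean array per source describing its proximity
    kernel (hypothetical shapes chosen by the caller).
    Returns one soft mask per source; the masks sum to 1 in every bin.
    """
    n_src = len(footprints)
    # Start by splitting the mixture energy evenly between the sources.
    psd = [mix_power / n_src for _ in range(n_src)]

    def gains(psd_list):
        # Wiener-like gain: an assumed stand-in for the WbE gain.
        total = sum(psd_list) + eps
        return [(p + eps / n_src) / total for p in psd_list]

    for _ in range(n_iter):
        masks = gains(psd)
        # Back-fitting step: re-estimate each source as the median of its
        # current estimate over that source's kernel.
        psd = [
            median_filter(m * mix_power, footprint=fp, mode="nearest")
            for m, fp in zip(masks, footprints)
        ]
    return gains(psd)


# Example: a time-stable kernel and a frequency-smooth kernel (assumed shapes).
rng = np.random.default_rng(0)
V = rng.random((32, 40))  # placeholder for |STFT|^2 of a mixture
kernels = [np.ones((1, 9), dtype=bool), np.ones((9, 1), dtype=bool)]
masks = kernel_backfit(V, kernels)
```

Multiplying the complex mixture spectrogram by one of the returned masks and inverting the transform would give the corresponding source estimate. In the proposed method, the gain computed inside `gains` is where the WbE estimator would take the place of the Wiener-like ratio.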
Keywords
Vocal separation; Kernel back-fitting; Weighted $\beta$-order MMSE estimation
Language
Korean