Vocal separation method using weighted β-order minimum mean square error estimation based on kernel back-fitting

Vocal separation technique applying weighted β-order minimum mean square error estimation based on the kernel back-fitting algorithm

  • Received : 2015.08.13
  • Accepted : 2015.09.17
  • Published : 2016.01.31

Abstract

In this paper, we propose a vocal separation method using weighted $\beta$-order minimum mean square error estimation (WbE) based on the kernel back-fitting algorithm. In speech enhancement, it is well known that the WbE outperforms existing Bayesian estimators such as the minimum mean square error (MMSE) estimator of the short-time spectral amplitude (STSA) and the MMSE estimator of the logarithm of the STSA (LSA), in terms of both objective and subjective measures. In the proposed method, the WbE is applied to a basic iterative kernel back-fitting algorithm to improve vocal separation performance on monaural music signals. The experimental results show that the proposed method achieves better separation performance than other existing methods.
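As a rough illustration of the kind of estimator the abstract refers to, the sketch below computes a weighted $\beta$-order MMSE spectral gain. It is not taken from the paper. It assumes the complex Gaussian model common to STSA estimators and a cost of the form $((X^{\beta} - \hat{X}^{\beta}) / X^{\alpha})^2$, for which the estimate is a ratio of conditional moments of the amplitude $X$. The names `wbe_gain`, `kummer_m_neg`, `xi` (a priori SNR), `gamma` (a posteriori SNR), `alpha`, and `v_max` are hypothetical, and how these quantities map onto the vocal and accompaniment estimates inside kernel back-fitting is not specified here.

```python
import math


def kummer_m_neg(a, v, tol=1e-12, max_terms=5000):
    """Confluent hypergeometric function M(a, 1, -v) for v >= 0 and a < 1.

    Uses Kummer's transformation M(a, 1, -v) = exp(-v) * M(1 - a, 1, v),
    so that every term of the power series is non-negative.
    """
    c = 1.0 - a
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        # Ratio of consecutive series terms of M(c, 1, v).
        term *= (c + k) * v / ((k + 1.0) * (k + 1.0))
        total += term
        if term < tol * total:
            break
    return math.exp(-v) * total


def wbe_gain(xi, gamma, alpha=0.0, beta=1.0, v_max=500.0):
    """Weighted beta-order MMSE gain under the assumptions stated above.

    xi    : a priori SNR (assumed given), xi >= 0
    gamma : a posteriori SNR (assumed given), gamma > 0
    alpha : weighting exponent (illustrative parameterization), alpha < 1
    beta  : order of the amplitude power, beta != 0 and beta/2 - alpha > -1
    """
    if not (alpha < 1.0 and beta != 0.0 and beta / 2.0 - alpha > -1.0):
        raise ValueError("need alpha < 1, beta != 0 and beta/2 - alpha > -1")
    v = xi / (1.0 + xi) * gamma
    if v > v_max:
        # High-SNR limit of the gain (Wiener gain); avoids overflow in the series.
        return xi / (1.0 + xi)
    m = beta / 2.0 - alpha
    num = math.gamma(m + 1.0) * kummer_m_neg(-m, v)
    den = math.gamma(1.0 - alpha) * kummer_m_neg(alpha, v)
    return math.sqrt(v) / gamma * (num / den) ** (1.0 / beta)
```

With `alpha=0` and `beta=1` this reduces to the MMSE-STSA gain, and with `alpha=0` and `beta=2` to the square root of the MMSE power-spectrum gain, which gives a simple sanity check on the implementation.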

Keywords

Vocal separation; Kernel back-fitting; Weighted $\beta$-order MMSE estimation

Acknowledgement

Supported by: Institute for Information & Communications Technology Promotion (IITP), National Research Foundation of Korea (NRF)