A NMF-Based Speech Enhancement Method Using a Prior Time Varying Information and Gain Function

- Journal title : The Journal of Korean Institute of Communications and Information Sciences
- Volume 38C, Issue 6, 2013, pp.503-511
- Publisher : The Korean Institute of Communications and Information Sciences
- DOI : 10.7840/kics.2013.38C.6.503

Title & Authors

A NMF-Based Speech Enhancement Method Using a Prior Time Varying Information and Gain Function

Kwon, Kisoo; Jin, Yu Gwang; Bae, Soo Hyun; Kim, Nam Soo

Abstract

This paper presents a speech enhancement method based on non-negative matrix factorization (NMF). In the training phase, a basis matrix is learned for speech and another for a specific noise type from their respective databases. In the enhancement phase, the noisy signal is separated into speech and noise estimates using these basis matrices. To improve performance, we model the change in the encoding matrix between the training and enhancement phases with independent Gaussian distributions, and add to the objective function a constraint term that closely follows these Gaussian models. We also smooth the encoding matrix over time by taking its previous value into account. Finally, we apply a Log-Spectral Amplitude (LSA) type algorithm as the gain function.
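The two-phase structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes magnitude spectrograms as input, uses the standard KL-divergence multiplicative updates of Lee and Seung, omits the Gaussian prior on the encoding matrix, replaces the unspecified smoothing rule with a simple first-order recursive average (weight `alpha` is a hypothetical parameter), and swaps the LSA-type gain for a plain Wiener-like ratio gain. The function names `nmf` and `enhance` are illustrative only.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """KL-divergence NMF, V ~= W @ H, with multiplicative updates.

    If W is supplied it is held fixed and only the encoding matrix H
    is estimated (the enhancement-phase situation); `rank` is then ignored.
    """
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        WH = W @ H + EPS
        # H <- H * (W^T (V / WH)) / (W^T 1)
        H *= (W.T @ (V / WH)) / (W.T.sum(axis=1, keepdims=True) + EPS)
        if update_W:
            WH = W @ H + EPS
            # W <- W * ((V / WH) H^T) / (1 H^T)
            W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + EPS)
    return W, H


def enhance(V_noisy, W_speech, W_noise, alpha=0.7, n_iter=200):
    """Separate a noisy magnitude spectrogram using fixed, pre-trained bases."""
    W = np.hstack([W_speech, W_noise])           # concatenated basis matrix
    _, H = nmf(V_noisy, W.shape[1], n_iter=n_iter, W=W)

    # Assumed smoothing rule: blend each frame's encoding with the previous one.
    for t in range(1, H.shape[1]):
        H[:, t] = alpha * H[:, t] + (1.0 - alpha) * H[:, t - 1]

    k = W_speech.shape[1]
    S = W_speech @ H[:k]                         # speech estimate
    N = W_noise @ H[k:]                          # noise estimate

    # Wiener-like ratio gain, used here in place of the paper's LSA-type gain.
    gain = S / (S + N + EPS)
    return gain * V_noisy
```

In this sketch, `nmf` would be called once on a speech spectrogram and once on a noise spectrogram to obtain `W_speech` and `W_noise`, which are then passed to `enhance` together with the noisy spectrogram. Because the gain lies between 0 and 1, the output never exceeds the noisy magnitude in any time-frequency bin.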

Keywords

speech enhancement; NMF; Gaussian distribution model; smoothing; Log-Spectral Amplitude

Language

Korean
