- Volume 31 Issue 2
This paper proposes a robust feature compensation method for handling environmental mismatch. The method applies energy-based weights, set according to the degree of speech presence, to Mean subtraction, Variance normalization, and ARMA filtering (MVA) processing. The weights are further smoothed by moving-average and maximum filters. The algorithm is evaluated on the AURORA 2 task and on a distant-talking experiment using a robot platform. Compared with MVA processing, it achieves error rate reductions of 14.4 % on the AURORA 2 task and 44.9 % on the distant-talking experiment.
Feature compensation; Temporal modulation filter; ARMA filter
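The processing chain summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the following are assumptions made only for illustration: the weights are a min-max normalized per-frame log energy, they enter the mean and variance estimates as frame weights, and the ARMA stage follows the standard MVA form. The function names `energy_weights` and `weighted_mva` and the parameters `avg_len`, `max_len`, and `order` are hypothetical.

```python
import numpy as np


def energy_weights(log_energy, avg_len=5, max_len=5):
    """Map per-frame log energy to speech-presence weights in [0, 1].

    Assumption: min-max normalization stands in for the paper's
    energy-based weighting. avg_len and max_len must be odd.
    """
    e = np.asarray(log_energy, dtype=float)
    span = e.max() - e.min()
    w = (e - e.min()) / span if span > 0 else np.ones_like(e)

    # Moving-average smoothing with edge padding to keep the frame count.
    pad = avg_len // 2
    kernel = np.ones(avg_len) / avg_len
    w = np.convolve(np.pad(w, pad, mode="edge"), kernel, mode="valid")

    # Maximum filter: each frame takes the largest weight in its window.
    pad = max_len // 2
    padded = np.pad(w, pad, mode="edge")
    w = np.array([padded[t:t + max_len].max() for t in range(len(e))])
    return np.clip(w, 0.0, 1.0)


def weighted_mva(features, weights, order=2):
    """Weighted mean/variance normalization followed by ARMA filtering.

    features: array of shape (frames, dims); weights: shape (frames,).
    Assumption: the weights act as frame weights in the statistics.
    """
    x = np.asarray(features, dtype=float)
    w = np.asarray(weights, dtype=float)[:, None]
    total = w.sum()

    mean = (w * x).sum(axis=0) / total
    var = (w * (x - mean) ** 2).sum(axis=0) / total
    z = (x - mean) / np.sqrt(var + 1e-12)

    # ARMA filter of the given order: average of the previous `order`
    # outputs and the current plus next `order` inputs. Edge frames
    # are left unfiltered.
    y = z.copy()
    for t in range(order, len(z) - order):
        past = y[t - order:t].sum(axis=0)
        ahead = z[t:t + order + 1].sum(axis=0)
        y[t] = (past + ahead) / (2 * order + 1)
    return y


# Usage on synthetic data: 200 frames of 13-dimensional features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 13))
log_e = rng.normal(size=200)
compensated = weighted_mva(feats, energy_weights(log_e), order=2)
```

With all weights equal to one, the normalization stage reduces to ordinary mean and variance normalization, so the sketch falls back to plain MVA processing in that case.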