Isolated Digit and Command Recognition in Car Environment

  • Published: 1999.02.01

Abstract

This paper proposes an observation probability smoothing method that improves the noise robustness of a speech recognizer based on the discrete hidden Markov model (DHMM), and identifies through experiments the noise processing techniques suited to speech recognition under car noise. When the feature vector of the input speech is corrupted by noise, vector quantization may assign it to a codeword other than the appropriate one, which degrades recognition performance. To reduce the effect of such mislabeling, the proposed method raises the observation probabilities of the codewords that lie close to each codeword. As further measures against car noise, the paper evaluates liftering in the distance measure between feature vectors, high-pass filtering, and spectral subtraction. Recognition experiments were performed on 14 isolated Korean words, consisting of digits and command words, recorded in two conditions: in a stationary car and in a moving car. The baseline recognizer achieved recognition rates of 97.4% in the stationary condition and 59.1% in the moving condition. Adding the proposed observation probability smoothing together with liftering, high-pass filtering, and spectral subtraction raised the recognition rates to 98.3% and 88.6%, respectively.
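The abstract does not give the smoothing formula, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each codeword shares its observation probability with its nearest codewords using weights that decay exponentially with squared Euclidean distance. The function names and the `num_neighbours` and `scale` parameters are hypothetical and chosen for illustration.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two codeword vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def neighbour_weights(codebook, num_neighbours=2, scale=1.0):
    """For each codeword m, weight m and its nearest codewords.

    Assumption: weights decay as exp(-distance / scale) and are normalised
    to sum to 1, so the smoothed rows remain probability distributions.
    Distinct codewords are assumed, so m itself is always included.
    """
    weights = []
    for centre in codebook:
        ranked = sorted(
            (squared_distance(centre, other), k)
            for k, other in enumerate(codebook)
        )
        nearest = ranked[: num_neighbours + 1]  # the codeword itself has distance 0
        raw = {k: math.exp(-d / scale) for d, k in nearest}
        total = sum(raw.values())
        weights.append({k: w / total for k, w in raw.items()})
    return weights


def smooth_observation_probs(emission, codebook, num_neighbours=2, scale=1.0):
    """Spread each state's observation probability over nearby codewords.

    emission[j][m] is the probability that state j emits codeword m.
    """
    weights = neighbour_weights(codebook, num_neighbours, scale)
    smoothed = []
    for row in emission:
        new_row = [0.0] * len(row)
        for m, prob in enumerate(row):
            for k, w in weights[m].items():
                new_row[k] += prob * w
        smoothed.append(new_row)
    return smoothed


# Toy example: three 1-D codewords and a two-state model (made-up numbers).
codebook = [[0.0], [1.0], [5.0]]
emission = [[1.0, 0.0, 0.0], [0.1, 0.2, 0.7]]
for row in smooth_observation_probs(emission, codebook, num_neighbours=1):
    print([round(p, 3) for p in row])
```

In this sketch a state that was trained to emit only one codeword also assigns some probability to that codeword's neighbours, so a noisy frame quantized to a nearby codeword is penalized less heavily during decoding.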
