Pre-Processing for Performance Enhancement of Speech Recognition in Digital Communication Systems

Seo, Jin-Ho; Park, Ho-Chong

한국음향학회지 (The Journal of the Acoustical Society of Korea)

Vol. 24, No. 7 / pp. 416-422 / 2005 / 1225-4428 (pISSN) / 2287-3775 (eISSN)

한국음향학회 (The Acoustical Society of Korea)

디지털 통신 시스템에서의 음성 인식 성능 향상을 위한 전처리 기술


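Seo, Jin-Ho (Department of Electronic Engineering, Kwangwoon University);
Park, Ho-Chong (Department of Electronic Engineering, Kwangwoon University)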

Published: 2005.10.01


Abstract

Speech recognition performance in digital communication systems degrades substantially because speech codecs distort the speech signal. This paper analyzes the spectral distortion introduced by speech codecs and proposes a pre-processing method that compensates for the distorted spectral information to improve recognition performance. Three widely used standard speech codecs, IS-127 EVRC, ITU G.729 CS-ACELP, and IS-96 QCELP, are used to analyze the coding distortion, and a single compensation method that applies to all three codecs is developed. The proposed method is evaluated on each of the three codecs; using speech features extracted from the compensated spectrum improves the recognition rate by up to $15.6\%$ compared with using features from the degraded speech.
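The abstract does not say how the compensation is computed, so the following is only an illustrative sketch of the general idea of codec-independent spectral compensation, not the authors' method. It assumes that a codec's effect can be approximated as a fixed additive bias on per-band log energies, estimated from paired clean and coded frames. The function names `estimate_band_bias` and `compensate`, and all numbers, are hypothetical.

```python
from statistics import fmean

def estimate_band_bias(clean_frames, coded_frames):
    """Mean per-band difference (coded - clean) of log-band energies.

    Assumption: the codec distortion is roughly a constant offset per band.
    """
    n_bands = len(clean_frames[0])
    return [
        fmean(coded[b] - clean[b] for clean, coded in zip(clean_frames, coded_frames))
        for b in range(n_bands)
    ]

def compensate(coded_frame, bias):
    """Subtract the estimated per-band bias from one frame of log-band energies."""
    return [x - d for x, d in zip(coded_frame, bias)]

# Toy data: 3 frames x 4 bands of log energies (made-up values).
clean = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0], [0.0, 1.0, 0.0, 1.0]]
# Hypothetical codec effect: the upper two bands are attenuated.
coded = [[c[0], c[1], c[2] - 0.5, c[3] - 1.0] for c in clean]

bias = estimate_band_bias(clean, coded)
restored = compensate(coded[0], bias)
```

In this toy case the estimated bias exactly undoes the simulated attenuation. A real codec introduces signal-dependent distortion, so a constant per-band offset would at best be a rough first-order correction applied before feature extraction.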


