Integrated Search | Korea Science

Speech Recognition in Car Noise Environments Using Multiple Models Based on a Hybrid Method of Spectral Subtraction and Residual Noise Masking

Song, Myung-Gyu;Jung, Hoi-In;Shim, Kab-Jong;Kim, Hyung-Soon
- The Journal of the Acoustical Society of Korea
- /
- Vol. 18, No. 3E
- /
- pp.3-8
- /
- 1999
In speech recognition for real-world applications, the performance degradation caused by the mismatch between training and testing environments must be overcome. To reduce this mismatch, this paper proposes a hybrid method of spectral subtraction and residual noise masking. We also employ a multiple-model approach to improve robustness across various noise environments. In this approach, multiple model sets are built for several noise masking levels, and the model set appropriate for the estimated noise level is selected automatically in the recognition phase. In speaker-independent isolated-word recognition experiments in car noise environments, the proposed method using model sets with only two masking levels reduced the average word error rate by 60% compared with the spectral subtraction method.
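As a rough illustration of the two ideas in this abstract, the sketch below applies plain power spectral subtraction, lifts every bin to a masking level, and picks a model set by nearest masking level. It is a generic sketch, not the authors' method: the function names, the zero clipping, and the "nearest level" selection rule are all assumptions made for the example.

```python
def subtract_and_mask(noisy_power, noise_power, mask_level):
    """Subtract a per-bin noise estimate, then lift every bin to at least mask_level.

    Hypothetical helper: the abstract does not give the exact hybrid rule.
    """
    cleaned = []
    for y, n in zip(noisy_power, noise_power):
        residual = max(y - n, 0.0)                 # power spectral subtraction, clipped at zero
        cleaned.append(max(residual, mask_level))  # masking: bins below the level are raised to it
    return cleaned


def select_model_set(estimated_noise_level, masking_levels):
    """Assumed selection rule: choose the masking level closest to the noise estimate."""
    return min(masking_levels, key=lambda m: abs(m - estimated_noise_level))


if __name__ == "__main__":
    print(subtract_and_mask([10.0, 3.0, 1.0], [2.0, 2.0, 2.0], 1.5))
    print(select_model_set(7.0, [2.0, 10.0]))
```

Masking after subtraction hides the residual noise fluctuations under a constant floor, which is why models trained at a matching masking level can see more consistent features.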

Implementation of a Robust Speech Recognizer in Noisy Car Environment Using a DSP

정익주
- 음성과학
- /
- Vol. 15, No. 2
- /
- pp.67-77
- /
- 2008
In this paper, we implement a robust speech recognizer on the TMS320VC33 DSP. For this implementation, we built a speech and noise database suited to a recognizer that uses spectral subtraction for noise removal. The recognizer has an explicit structure in that the speech signal is enhanced by spectral subtraction before endpoint detection and feature extraction. This makes the operation of the recognizer clear and helps build HMMs with minimal model mismatch. Because the recognizer was developed for controlling car facilities and for voice dialing, it has two recognition engines: a speaker-independent engine for controlling car facilities and a speaker-dependent engine for voice dialing. We adopted a continuous HMM for the former and a conventional DTW algorithm for the latter. Through various off-line recognition tests, we selected optimal values of several recognition parameters for a resource-limited embedded recognizer, which led to HMMs with three mixtures per state. The speech database with added car noise is enhanced using spectral subtraction before HMM parameter estimation, to reduce the model mismatch caused by the nonlinear distortion of spectral subtraction. The hardware module includes a microcontroller for the host interface, which handles the protocol between the DSP and a host.

A Feed-forward Method for Reducing Current Mismatch in Charge Pumps

이재환;정항근
- 전자공학회논문지SC
- /
- Vol. 46, No. 1
- /
- pp.63-67
- /
- 2009
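Current mismatch in a charge pump degrades the performance of a phase-locked loop by generating reference spurs in its frequency spectrum. Current mismatch can be reduced by raising the output resistance of the charge pump, for example with a cascode output stage. However, as supply voltages fall, stacking transistors becomes difficult. This paper proposes a new method for reducing current mismatch. The proposed method compensates, in a feed-forward manner, for the current variation caused by channel-length modulation in the output stage. The new method was simulated using a $0.18{\mu}m$ CMOS process.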

A Spectral Smoothing Algorithm for Unit Concatenating Speech Synthesis

김상진;장경애;한민수
- 대한음성학회지:말소리
- /
- No. 56
- /
- pp.225-235
- /
- 2005
Speech unit concatenation with a large database is presently the most popular method for speech synthesis. In this approach, mismatches at the unit boundaries are unavoidable and are one of the causes of quality degradation. This paper proposes an algorithm to reduce undesired discontinuities between consecutive units. Optimal matching points are calculated in two steps. First, the Kullback-Leibler distance is used for spectral matching; then unit sliding and overlap windowing are used for waveform matching. The proposed algorithm is implemented in a corpus-based unit-concatenating Korean text-to-speech system that has an automatically labeled database. Experimental results show that our algorithm performs better than raw concatenation or the overlap smoothing method.
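The spectral-matching step can be pictured with a small sketch: treat each magnitude spectrum as a probability distribution, measure a symmetric Kullback-Leibler distance, and pick the candidate frame with the smallest distance. This is an illustrative assumption, not the paper's procedure; the symmetric form, the normalization, the `eps` guard, and the helper names are all chosen for the example.

```python
import math


def normalize(spectrum):
    """Scale a non-negative spectrum so that its bins sum to 1."""
    total = sum(spectrum)
    return [s / total for s in spectrum]


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL distance, KL(p||q) + KL(q||p), after normalization.

    eps is an assumed guard against log(0) for empty bins.
    """
    p, q = normalize(p), normalize(q)
    return sum((a - b) * math.log((a + eps) / (b + eps)) for a, b in zip(p, q))


def best_join_frame(left_end_spectrum, right_candidate_spectra):
    """Return the index of the candidate frame spectrally closest to the left unit's end."""
    distances = [symmetric_kl(left_end_spectrum, c) for c in right_candidate_spectra]
    return min(range(len(distances)), key=distances.__getitem__)


if __name__ == "__main__":
    candidates = [[3.0, 2.0, 1.0], [1.0, 2.0, 3.1], [1.0, 1.0, 1.0]]
    print(best_join_frame([1.0, 2.0, 3.0], candidates))
```

Because the spectra are normalized first, the distance compares spectral shape only, so a pure level difference between two units does not count as a mismatch.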

Spectral Subtraction Using Spectral Harmonics for Robust Speech Recognition in Car Environments

Beh, Jounghoon;Ko, Hanseok
- The Journal of the Acoustical Society of Korea
- /
- Vol. 22, No. 2E
- /
- pp.62-68
- /
- 2003
This paper presents a novel noise-compensation scheme to solve the mismatch problem between training and testing conditions for automatic speech recognition (ASR) systems, specifically in car environments. Conventional spectral subtraction schemes rely on the signal-to-noise ratio (SNR): the parts of the spectrum that appear to have low SNR are attenuated, and the parts with high SNR are accentuated. However, these schemes are based on the postulation that the power spectrum of noise is in general lower in magnitude than that of speech. Therefore, while such a postulation is adequate for high-SNR environments, it is grossly inadequate for low-SNR scenarios such as car environments. This paper proposes an efficient spectral subtraction scheme aimed specifically at low-SNR noisy environments, which works by extracting the harmonics in the speech spectrum. Representative experiments confirm the superior performance of the proposed method over conventional methods. The experiments are conducted using car-noise-corrupted utterances from the Aurora2 corpus.

The Hybrid Bandwidth Extension Method Using Spectral Folding and GMM Transformation

최무열;김형순
- 대한음성학회: Conference Proceedings
- /
- 대한음성학회 2006 Spring Conference Proceedings
- /
- pp.131-134
- /
- 2006
Narrowband speech transmitted over the telephone network lacks the information in the low band (0-300 Hz) and high band (3400-8000 Hz) that is found in wideband speech (0-8000 Hz). As a result, narrowband speech is characterized by reduced intelligibility, a muffled quality, and degraded speaker identification. Spectral folding is the easiest way to reconstruct the missing high band; however, the reconstructed speech still sounds band-limited because of the absence of low-band and mid-band frequency components. To compensate for what the extended speech lacks, we propose to combine the spectral folding method with the GMM transformation method, which is a statistical method for reconstructing wideband speech. In the reconstructed wideband speech, the absent frequency components were filled in with relatively low spectral mismatch. According to subjective speech quality evaluations, the proposed method was preferred to other methods.
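Spectral folding is commonly realized by inserting a zero after every sample and skipping the anti-imaging filter, so that the narrowband spectrum reappears as an image in the new upper band. The sketch below checks that property on a toy signal with a naive DFT. It assumes this common textbook form of folding rather than the paper's exact implementation, and the function names are hypothetical.

```python
import cmath


def zero_stuff(narrowband):
    """Double the sampling rate by inserting a zero after every sample (no filtering)."""
    out = []
    for s in narrowband:
        out.extend([s, 0.0])
    return out


def dft_magnitudes(x):
    """Naive O(n^2) DFT magnitudes; adequate for a toy check."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5, -1.0]
    original = dft_magnitudes(x)
    folded = dft_magnitudes(zero_stuff(x))
    # The folded spectrum is the original spectrum repeated twice, so the new
    # upper band carries an image of the narrowband content.
    print([round(m, 6) for m in original])
    print([round(m, 6) for m in folded])
```

The image fills the high band cheaply, but it is only a copy of existing content, which is consistent with the abstract's point that folding alone still leaves the speech sounding band-limited.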

Effective frequency doubling of fs-pulse with simultaneous group velocity matching and quasi-phase matching in periodically poled lithium niobate

Lee, Yu-Nan;S. Kurimura;K. Kitamura;Hun, No-Jeong;Sik, Cha-Myeong
- 한국광학회: Conference Proceedings
- /
- 한국광학회 14th Annual General Meeting and 2003 Winter Conference
- /
- pp.224-225
- /
- 2003
Since group velocity (GV) mismatch significantly limits the efficiency of nonlinear interactions such as second harmonic generation (SHG), several techniques have been developed to compensate for it. The simplest way to avoid the GV mismatch problem is to reduce the device length. However, this results in a poor trade-off between the SHG spectral bandwidth and the conversion efficiency. (omitted)

Noise-Robust Speech Recognition Using Histogram-Based Over-estimation Technique

권영욱;김형순
- 한국음향학회지
- /
- Vol. 19, No. 6
- /
- pp.53-61
- /
- 2000
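To improve speech recognition performance in noisy environments, it is important to reduce the mismatch caused by differing noise environments. Spectral subtraction is widely used for this purpose because it is computationally simple and performs relatively well in noise. In this paper, we introduce histogram processing as a method of estimating the noise spectrum for spectral subtraction. Histogram processing does not require detection of non-speech segments and can also be applied to time-varying noise. However, even when histogram processing yields a reliable estimate of the mean noise spectrum, residual noise remains after spectral subtraction. To address this, we propose a new way of applying over-estimation that takes into account the distribution characteristics of the histogram used in noise estimation. Because the proposed method determines the degree of over-estimation adaptively according to the measured noise distribution, it is less affected by SNR variation. In speaker-independent isolated-word recognition experiments in car noise environments, the proposed method improved recognition performance over the conventional over-estimation factor.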
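The histogram idea can be sketched for a single frequency bin: collect that bin's power over many frames, build a histogram, and take the most populated bin as the noise level, on the assumption that noise-only frames outnumber loud speech frames at that frequency. This is a minimal illustration, not the paper's estimator; the bin count, the use of the histogram mode, and the function name are assumptions, and the adaptive over-estimation step is not reproduced.

```python
def histogram_noise_estimate(frame_values, num_bins=20):
    """Estimate the noise level of one frequency bin as the histogram mode over time.

    frame_values: power (or log power) of the same frequency bin across frames.
    No speech/non-speech detection is needed; speech frames land in sparsely
    populated upper histogram bins and do not move the mode.
    """
    lo, hi = min(frame_values), max(frame_values)
    if hi == lo:
        return lo
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in frame_values:
        idx = min(int((v - lo) / width), num_bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    peak = counts.index(max(counts))
    return lo + (peak + 0.5) * width                    # center of the most populated bin


if __name__ == "__main__":
    values = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 9.0, 10.0, 11.0]
    print(histogram_noise_estimate(values, num_bins=10))
    print(sum(values) / len(values))  # the plain mean is pulled up by the speech frames
```

In this toy run the mode stays near the cluster of low values while the mean is pulled toward the three loud frames, which is the practical reason a histogram can replace explicit non-speech detection.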

Harmonics-based Spectral Subtraction and Feature Vector Normalization for Robust Speech Recognition

Beh, Joung-Hoon;Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
- 음성과학
- /
- Vol. 11, No. 1
- /
- pp.7-20
- /
- 2004
In this paper, we propose a two-step noise compensation algorithm in feature extraction for robust speech recognition. The proposed method requires no a priori information about the noisy environment and is simple to implement. First, in the frequency domain, Harmonics-based Spectral Subtraction (HSS) is applied to reduce the additive background noise and make the shape of the harmonics in the speech spectrum more pronounced. We then apply a judiciously weighted variance Feature Vector Normalization (FVN) to compensate for both channel distortion and additive noise. The weighted variance FVN compensates for the variance mismatch separately in the speech and non-speech regions. A representative performance evaluation using the Aurora 2 database shows that the proposed method yields a 27.18% relative improvement in accuracy under a multi-noise training task and a 57.94% relative improvement under a clean training task.

Speech Recognition in Noisy Environments using Histogram-based Over-estimation

권영욱
- 한국음향학회: Conference Proceedings
- /
- 한국음향학회 15th Workshop on Speech Communication and Signal Processing, 1998 (KSCSP 98, Vol. 15, No. 1)
- /
- pp.262-266
- /
- 1998
In speech recognition under noisy environments, reducing the mismatch between training and testing environments is an important issue, and spectral subtraction is a widely used technique because of its simplicity and relatively good performance in noise. In this paper, we introduce the histogram method as a reliable noise estimation approach for spectral subtraction. To deal with the problem of residual noise after spectral subtraction, we propose a new over-estimation technique based on the distribution characteristics of the histogram used for noise estimation. Since the proposed technique decides the degree of over-estimation adaptively according to the measured noise distribution, it copes with SNR variations more effectively than the conventional over-estimation technique.
