Search | Korea Science

Improvement of Domain-specific Keyword Spotting Performance Using Hybrid Confidence Measure (하이브리드 신뢰도를 이용한 제한 영역 핵심어 검출 성능향상)

이경록;서현철;최승호;최승호;김진영
- The Journal of the Acoustical Society of Korea, v.21 no.7, pp.632-640, 2002
In this paper, we propose the ACM (anti-filler confidence measure) to compensate for the shortcomings of the conventional RLJ-CM and NCM (normalized CM), and we integrate the proposed ACM with the conventional NCM into an HCM (hybrid CM). The ACM starts from the observation that FA (false acceptance) is caused by the way the anti-phone model is constructed; to compensate, it uses a phoneme recognizer to estimate the phoneme sequence that was actually spoken. We define this sequence as the anti-phone model and use it in the confidence measure calculation. An analysis of the two confidence measures shows that the conventional NCM performs well with respect to FR (false rejection), whereas the proposed ACM performs well with respect to FA, so the two are complementary. Exploiting this, we integrate the two confidence measures using a weighting vector α and define the result as the HCM. At an MDR (missed detection rate) of around 10%, the HCM yields 0.219 FA/KW/HR (false alarms/keyword/hour), a 22% improvement over using the conventional NCM alone.
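The abstract does not give the exact fusion rule, so the following is only a minimal sketch of one plausible reading: a convex combination of the two scores with a scalar weight. The function names, the scalar form of α, and the acceptance threshold are assumptions made for illustration, not the authors' formulation.

```python
def hybrid_confidence(ncm: float, acm: float, alpha: float) -> float:
    """Combine two confidence scores as alpha * NCM + (1 - alpha) * ACM.

    Assumption: alpha is a single scalar in [0, 1]; the abstract only says
    that a "weighting vector" is used and does not state the formula.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * ncm + (1.0 - alpha) * acm


def accept_keyword(ncm: float, acm: float,
                   alpha: float = 0.5, threshold: float = 0.0) -> bool:
    """Accept a keyword hypothesis when the hybrid score reaches the threshold.

    Both default values are placeholders; in practice they would be tuned
    on held-out data to trade false alarms against missed detections.
    """
    return hybrid_confidence(ncm, acm, alpha) >= threshold
```

With `alpha = 1.0` the sketch reduces to the NCM alone and with `alpha = 0.0` to the ACM alone, which makes it easy to compare the combined score against each individual measure.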

Verification of Normalized Confidence Measure Using n-Phone Based Statistics

Kim, Byoung-Don;Kim, Jin-Young;Na, Seung-You;Choi, Seung-Ho
- Speech Sciences, v.12 no.1, pp.123-134, 2005
A confidence measure (CM) is used to reject mis-recognized words in an automatic speech recognition (ASR) system. Rahim, Lee, Juang and Cho's confidence measure (RLJC-CM) is one of the most widely used CMs [1]. The RLJC-CM is calculated by averaging phone-level CMs. Kim et al. extended the RLJC-CM [2]: they devised the normalized CM (NCM), a statistically normalized version of the RLJC-CM based on tri-phone CM normalization. In this paper we verify the NCM by generalizing the tri-phone to an n-phone unit. To compare different normalization units, mono-phone, tri-phone, quin-phone and $\infty$-phone units are tested. Through experiments on isolated word recognition we show that tri-phone based normalization is sufficient to enhance the rejection performance of the ASR system. We also explain the NCM in terms of a two-class pattern classification problem.

Improvement of Rejection Performance using the Lip Image and the PSO-NCM Optimization in Noisy Environment (잡음 환경 하에서의 입술 정보와 PSO-NCM 최적화를 통한 거절 기능 성능 향상)

Kim, Byoung-Don;Choi, Seung-Ho
- Phonetics and Speech Sciences, v.3 no.2, pp.65-70, 2011
Recently, audio-visual speech recognition (AVSR) has been studied as a way to cope with noise in speech recognition. In this paper we propose a novel method for choosing the weighting factors used in audio-visual information fusion. We adopt particle swarm optimization (PSO) to determine the weighting factors. The AVSR experiments show that PSO-based normalized confidence measures (NCM) improve the rejection performance for mis-recognized words by 33%.

Improvement of Keyword Spotting Performance Using Normalized Confidence Measure (정규화 신뢰도를 이용한 핵심어 검출 성능향상)

Kim, Cheol;Lee, Kyoung-Rok;Kim, Jin-Young;Choi, Seung-Ho;Choi, Seung-Ho
- The Journal of the Acoustical Society of Korea, v.21 no.4, pp.380-386, 2002
Conventional post-processing methods, such as the confidence measure (CM) proposed by Rahim, compute phone-level CMs from the likelihoods of the phoneme model and the anti-model, and then obtain the word-level CM by averaging the phone-level CMs [1]. With the conventional method, the CMs of some specific keywords are very low, so those keywords are usually rejected. The reason is that the statistics of phone-level CMs are not consistent. In other words, phone-level CMs have a different probability density function (pdf) for each phone, and especially for each tri-phone. To overcome this problem, we propose a normalized confidence measure (NCM). Our approach is to transform the CM pdf of each tri-phone to the same pdf, under the assumption that the CM pdfs are Gaussian. To evaluate our method we use a common keyword spotting system, in which context-dependent HMMs model keyword utterances and context-independent HMMs model non-keyword utterances. The experimental results show that the proposed NCM reduces the FAR (false alarm rate) from 0.44 to 0.33 FA/KW/HR (false alarms/keyword/hour) when the MDR is about 8%, a 25% improvement in FAR.
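One way to read "transform the CM pdf of each tri-phone to the same pdf" under the Gaussian assumption is a per-tri-phone z-score: subtract the tri-phone's mean CM and divide by its standard deviation, then average over the word as in the conventional method. The sketch below illustrates that reading only; the function names, the tri-phone label format, and the fallback for zero variance are hypothetical and not taken from the paper.

```python
from statistics import mean, pstdev


def fit_triphone_stats(training_scores):
    """Estimate (mean, std) of the phone-level CM for each tri-phone label.

    training_scores: dict mapping a tri-phone label to a list of raw CMs
    observed for that tri-phone. A zero standard deviation is replaced by
    1.0 (an assumption) so that normalization never divides by zero.
    """
    stats = {}
    for label, scores in training_scores.items():
        sigma = pstdev(scores)
        stats[label] = (mean(scores), sigma if sigma > 0 else 1.0)
    return stats


def normalized_word_cm(phone_cms, stats):
    """Average the z-scored phone-level CMs of one word.

    phone_cms: list of (triphone_label, raw_cm) pairs for the word.
    Every label must be present in stats; unseen tri-phones are not
    handled in this sketch.
    """
    z_scores = [(cm - stats[label][0]) / stats[label][1]
                for label, cm in phone_cms]
    return mean(z_scores)
```

After this transformation every tri-phone's CM has zero mean and unit variance on the training data, so a single rejection threshold treats all keywords on the same scale.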

Rejection Performance Analysis in Vocabulary Independent Speech Recognition Based on Normalized Confidence Measure (정규화신뢰도 기반 가변어휘 고립단어 인식기의 거절기능 성능 분석)

Choi, Seung-Ho
- The Journal of the Acoustical Society of Korea, v.25 no.2, pp.96-100, 2006
Kim et al. proposed the normalized confidence measure (NCM) [1-2], which was used successfully to reject mis-recognized words in isolated word recognition. However, their experiments were performed on fixed-vocabulary speech recognition. In this paper we apply the NCM to vocabulary independent speech recognition (VISP) and report its rejection performance in that domain. Specifically, we propose a vector quantization (VQ) based method to overcome the problem of unseen triphones, which arises because triphone-based normalization relies on the statistics of triphone confidences. In our speech recognition experiments, the phone-based normalization method gives better results than both RLJC [3] and the triphone-based normalization approach. These results differ from those of Kim et al. [1-2]. In conclusion, phone-based normalization shows robust performance in the VISP domain.

Enhancement of Rejection Performance using the PSO-NCM in Noisy Environment (잡음 환경하에서의 PSO-NCM을 이용한 거절기능 성능 향상)

Kim, Byoung-Don;Song, Min-Gyu;Choi, Seung-Ho;Kim, Jin-Young
- Speech Sciences, v.15 no.4, pp.85-96, 2008
Automatic speech recognition suffers severe performance degradation in noisy environments. Many methods have been proposed to cope with the noise problem, most of them focused on noise-robust features or model adaptation. However, researchers have overlooked utterance verification (UV) in noisy environments. In this paper we discuss UV problems based on the normalized confidence measure. First, we show through isolated word recognition experiments that UV performance also degrades in noisy environments. We then examine how the UV performance degrades. Based on the UV experiments, we propose a method that models the statistics of phone confidences using sigmoid functions. Particle swarm optimization (PSO) is adopted to obtain the parameters of the sigmoidal models. The proposed method improves rejection performance by 20%. Our experimental results show that the PSO-NCM can be applied successfully to noisy speech recognition.

In Out-of Vocabulary Rejection Algorithm by Measure of Normalized improvement using Optimization of Gaussian Model Confidence (미등록어 거절 알고리즘에서 가우시안 모델 최적화를 이용한 신뢰도 정규화 향상)

Ahn, Chan-Shik;Oh, Sang-Yeob
- Journal of the Korea Society of Computer and Information, v.15 no.12, pp.125-132, 2010
In vocabulary recognition, unseen tri-phones that did not appear during training can occur at recognition time. For such tri-phones the system has no initial estimates of the model parameters, so models for the phoneme data cannot be built and the accuracy of the Gaussian model cannot be ensured. To address this, we propose a method that optimizes the Gaussian model parameters using the probability distribution. Optimizing the Gaussian model in this way improves the reliability of the confidence measure and supports the search over phoneme data. In a performance comparison, the out-of-vocabulary rejection algorithm using normalized confidence improved recognition by 1.7%.
https://doi.org/10.9708/jksci.2010.15.12.125

Performance Comparison of Out-Of-Vocabulary Word Rejection Algorithms in Variable Vocabulary Word Recognition (가변어휘 단어 인식에서의 미등록어 거절 알고리즘 성능 비교)

김기태;문광식;김회린;이영직;정재호
- The Journal of the Acoustical Society of Korea, v.20 no.2, pp.27-34, 2001
Utterance verification is used in variable vocabulary word recognition to reject words that are out of vocabulary or incorrectly recognized. It is an important technology for designing a user-friendly speech recognition system. We propose a new utterance verification algorithm for a no-training utterance verification system based on the minimum verification error. First, using the PBW (Phonetically Balanced Words) DB (445 words), we create no-training anti-phoneme models that include many PLUs (Phoneme Like Units), so that the anti-phoneme models have the minimum verification error. Then, for OOV (Out-Of-Vocabulary) rejection, the phoneme-based confidence measure, which uses the likelihood between the phoneme model (null hypothesis) and the anti-phoneme model (alternative hypothesis), is normalized by the null hypothesis, which makes it more robust for OOV rejection. The word-based confidence measure, which is built on the phoneme-based confidence measure, provides improved detection of near-misses in speech recognition as well as better discrimination between in-vocabulary words and OOVs. Using the proposed anti-model and confidence measure, we achieve a significant performance improvement: CA (Correctly Accept for In-Vocabulary) is about 89% and CR (Correctly Reject for OOV) is about 90%, an improvement of about 15-21% in ERR (Error Reduction Rate).

Measurement and Decomposition of Socioeconomic Inequality in Metabolic Syndrome: A Cross-sectional Analysis of the RaNCD Cohort Study in the West of Iran

Moslem Soofi;Farid Najafi;Shahin Soltani;Behzad Karamimatin
- Journal of Preventive Medicine and Public Health, v.56 no.1, pp.50-58, 2023
Objectives: Socioeconomic inequality in metabolic syndrome (MetS) remains poorly understood in Iran. The present study examined the extent of the socioeconomic inequalities in MetS and quantified the contribution of its determinants to explain the observed inequality, with a focus on middle-aged adults in Iran. Methods: This cross-sectional study used data from the Ravansar Non-Communicable Disease cohort study. A sample of 9975 middle-aged adults aged 35-65 years was analyzed. MetS was assessed based on the International Diabetes Federation definition. Principal component analysis was used to construct socioeconomic status (SES). The Wagstaff normalized concentration index (CI_n) was employed to measure the magnitude of socioeconomic inequalities in MetS. Decomposition analysis was performed to identify and calculate the contribution of the MetS inequality determinants. Results: The proportion of MetS in the sample was 41.1%. The CI_n of having MetS was 0.043 (95% confidence interval, 0.020 to 0.066), indicating that MetS was more concentrated among individuals with high SES. The main contributors to the observed inequality in MetS were SES (72.0%), residence (rural or urban, 46.9%), and physical activity (31.5%). Conclusions: Our findings indicated a pro-poor inequality in MetS among Iranian middle-aged adults. These results highlight the importance of persuading middle-aged adults to be physically active, particularly those in an urban setting. In addition to targeting physically inactive individuals and those with low levels of education, policy interventions aimed at mitigating socioeconomic inequality in MetS should increase the focus on high-SES individuals and the urban population.
https://doi.org/10.3961/jpmph.22.373
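For readers unfamiliar with the index named in the abstract above, the following is a minimal sketch of the standard covariance form of the concentration index, CI = 2 cov(h, r) / μ, with the Wagstaff normalization CI_n = CI / (1 − μ) for a binary outcome. It assumes unweighted data with no ties in SES; the function name and inputs are illustrative, and it is not the study's actual estimation code.

```python
def wagstaff_normalized_ci(outcome, ses):
    """Wagstaff-normalized concentration index for a binary outcome.

    outcome: list of 0/1 values (1 = has the condition).
    ses:     list of socioeconomic scores, higher = better off.
    Assumptions: no sampling weights and no tied SES scores.
    """
    n = len(outcome)
    mu = sum(outcome) / n
    if mu in (0.0, 1.0):
        raise ValueError("index is undefined when everyone or no one has the outcome")

    # Fractional rank in the SES distribution: (position + 0.5) / n,
    # counted from the poorest individual; its mean is exactly 0.5.
    order = sorted(range(n), key=lambda i: ses[i])
    rank = [0.0] * n
    for position, i in enumerate(order):
        rank[i] = (position + 0.5) / n

    cov = sum((h - mu) * (r - 0.5) for h, r in zip(outcome, rank)) / n
    ci = 2.0 * cov / mu       # standard concentration index
    return ci / (1.0 - mu)    # Wagstaff normalization for binary outcomes
```

A positive value means the outcome is concentrated among individuals with higher SES and a negative value means it is concentrated among those with lower SES.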
