Phonetic Question Set Generation Algorithm

The Journal of the Acoustical Society of Korea (한국음향학회지)

Volume 23, Issue 2 / Pages 173-179 / 2004 / 1225-4428 (pISSN) / 2287-3775 (eISSN)

The Acoustical Society of Korea (한국음향학회)

음소 질의어 집합 생성 알고리즘

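김성아 (Speech Information Processing Laboratory, Department of Computer Science, Korea University);
육동석 (Speech Information Processing Laboratory, Department of Computer Science, Korea University);
권오일 (Hyundai Autonet Co., Ltd.)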

Published : 2004.02.01

Abstract

In large vocabulary continuous speech recognition, the training data are insufficient to model every context-dependent phone, so similar context-dependent phones are clustered with decision trees to share data. Building the decision trees, and using them to predict unseen triphones, requires a phonetic question set. The phonetic question set, which contains categories of phones with similar co-articulation effects, is usually generated by phonetic or linguistic experts. This knowledge-based approach, however, may reduce the homogeneity of the clusters. Moreover, the experts must adjust the question set whenever the language or the PLU (phone-like unit) of a recognition system changes. We therefore propose a data-driven method that generates the phonetic question set automatically. Because the proposed method derives the phone categories from the distribution of the speech data, it does not depend on the language or the PLU and may enhance the homogeneity of the clusters. In large vocabulary speech recognition experiments, the proposed algorithm reduced the error rate by 14.3%.
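To illustrate how a phonetic question set lets a decision tree assign a tied state to a triphone that never occurred in training, here is a minimal sketch. The phone categories, question names, tree shape, and state names are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import NamedTuple


class Triphone(NamedTuple):
    left: str
    center: str
    right: str


# Hypothetical question set: each question asks whether the left or right
# context phone belongs to a phone category.
QUESTIONS = {
    "L_Nasal": ("left", frozenset({"m", "n", "ng"})),
    "R_Vowel": ("right", frozenset({"a", "i", "u"})),
}

# Hypothetical tree for one state of center phone "a".
# Internal node: (question, yes_subtree, no_subtree); leaf: tied-state name.
TREE = ("L_Nasal", "state_A", ("R_Vowel", "state_B", "state_C"))


def answer(question: str, tri: Triphone) -> bool:
    """Return True if the triphone's context phone is in the question's category."""
    side, phones = QUESTIONS[question]
    return getattr(tri, side) in phones


def tied_state(tree, tri: Triphone) -> str:
    """Walk the tree from the root to a leaf by answering one question per node."""
    while not isinstance(tree, str):
        question, yes_branch, no_branch = tree
        tree = yes_branch if answer(question, tri) else no_branch
    return tree
```

Because the walk depends only on the context phones, a triphone such as `Triphone("ng", "a", "t")` receives a tied state even if it was absent from the training data, as long as its context phones appear in the categories.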

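A phonetic question set is a categorization of phones that show similar articulation effects in context; it is used to cluster the states of HMMs (hidden Markov models) with decision trees when training a speech recognition system. Until now, phonetic question sets have mostly been constructed by hand by phoneticians or linguists. Such knowledge-based questions have two drawbacks: they depend on the language or the phone-like unit (PLU), and they can lower the homogeneity within the generated clusters. To address these problems, this paper proposes an algorithm that automatically generates a phonetic question set, regardless of the language or the PLU, based on the similarity between phones measured from speech data. In experiments, the error rate of a recognizer using the question set generated by the proposed method decreased by about 14.3%, indicating that a data-driven phonetic question set is effective for state clustering.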
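The abstract does not specify the similarity measure or the grouping procedure, so the following is only a sketch of the general idea of deriving phone categories from data. It assumes each phone is summarized by a mean feature vector, uses Euclidean distance between cluster centroids as a stand-in similarity measure, and uses bottom-up (agglomerative) clustering as a stand-in grouping procedure. The function names and the toy feature values are hypothetical.

```python
from itertools import combinations


def centroid(phones, means):
    """Average the mean vectors of the phones in one cluster."""
    dim = len(next(iter(means.values())))
    return [sum(means[p][d] for p in phones) / len(phones) for d in range(dim)]


def distance(a, b, means):
    """Euclidean distance between two cluster centroids (assumed measure)."""
    ca, cb = centroid(a, means), centroid(b, means)
    return sum((x - y) ** 2 for x, y in zip(ca, cb)) ** 0.5


def generate_questions(means):
    """Merge the two closest clusters repeatedly; each merged cluster becomes
    one phone category, i.e. one candidate question."""
    clusters = [frozenset([p]) for p in sorted(means)]
    questions = []
    # Stop at two clusters: the final merge would contain every phone and
    # could not split any tree node.
    while len(clusters) > 2:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: distance(pair[0], pair[1], means),
        )
        merged = a | b
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        questions.append(merged)
    return questions


# Toy per-phone mean vectors (made-up numbers, not real acoustic data).
MEANS = {
    "m": (1.0, 0.0),
    "n": (1.1, 0.1),
    "a": (5.0, 5.0),
    "i": (5.2, 4.6),
    "s": (9.0, 1.0),
}

questions = generate_questions(MEANS)
```

Since the categories come only from the per-phone statistics, the same procedure applies unchanged to a different language or a different PLU inventory, which is the property the abstract emphasizes.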
