• Title/Summary/Keyword: Automatic Lipreading

9 search results

Improved Automatic Lipreading by Stochastic Optimization of Hidden Markov Models (은닉 마르코프 모델의 확률적 최적화를 통한 자동 독순의 성능 향상)

  • Lee, Jong-Seok;Park, Cheol-Hoon
    • The KIPS Transactions:PartB / v.14B no.7 / pp.523-530 / 2007
  • This paper proposes a new stochastic optimization algorithm for hidden Markov models (HMMs) used as the recognizer in automatic lipreading. The proposed method combines a global stochastic optimization method, simulated annealing, with a local optimization method, which produces fast convergence and good solution quality. We show mathematically that the proposed algorithm converges to the global optimum. Experimental results show that training HMMs with this method yields better lipreading performance than conventional training methods based on local optimization.
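
The global-plus-local idea described in the abstract can be illustrated with a generic sketch. A toy one-dimensional objective stands in for the HMM training criterion; the move sizes, cooling schedule, and refinement steps below are illustrative assumptions, not the paper's actual algorithm.

```python
import math
import random

def hybrid_sa(f, x0, n_iter=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing with a greedy local-refinement step applied to
    each candidate, sketching the global+local hybrid idea."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(n_iter):
        cand = x + rng.gauss(0, 0.5)          # global random move
        step = 0.1
        for _ in range(5):                    # local refinement: greedy
            for d in (-step, step):           # shrinking steps around cand
                if f(cand + d) < f(cand):
                    cand += d
            step *= 0.5
        fc = f(cand)
        if fc < fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc                  # Metropolis acceptance rule
        if fx < fbest:
            best, fbest = x, fx               # track best-so-far solution
        t *= cooling                          # geometric cooling schedule
    return best, fbest

def bumpy(x):
    # toy multimodal objective standing in for the HMM training criterion
    return x * x + 2.0 * math.sin(5.0 * x) + 2.0

xmin, fmin = hybrid_sa(bumpy, x0=3.0)
```

The local refinement pulls each candidate toward the nearest local optimum, while the annealed random moves allow escapes between basins, which is what gives hybrid schemes their combination of solution quality and convergence speed.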

Automatic Lipreading Using Color Lip Images and Principal Component Analysis (컬러 입술영상과 주성분분석을 이용한 자동 독순)

  • Lee, Jong-Seok;Park, Cheol-Hoon
    • The KIPS Transactions:PartB / v.15B no.3 / pp.229-236 / 2008
  • This paper examines the effectiveness of using color images instead of grayscale ones for automatic lipreading. First, we show the effect of color information on human lipreading performance. Then, we compare the performance of automatic lipreading using features obtained by applying principal component analysis to grayscale and color images. Experiments with various color representations show that color information is useful for improving automatic lipreading performance; the best performance is obtained with the RGB color components, where the average relative error reductions for clean and noisy conditions are 4.7% and 13.0%, respectively.
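
The feature-extraction step the abstract describes can be sketched as follows. PCA treats each image as one long vector, so a color image simply yields a vector three times longer than its grayscale counterpart; the data here are random placeholders, and the averaging-based grayscale conversion is a simplifying assumption.

```python
import numpy as np

def pca_features(images, n_components=8):
    """Project flattened images onto their top principal components."""
    X = images.reshape(len(images), -1).astype(float)
    X -= X.mean(axis=0)                       # center each pixel dimension
    # SVD of the centered data matrix gives the principal axes in Vt
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_components].T            # (n_samples, n_components)

# Hypothetical data: 20 RGB lip images of size 16x16.
rng = np.random.default_rng(0)
rgb = rng.random((20, 16, 16, 3))
gray = rgb.mean(axis=-1)                      # naive grayscale conversion
feat_rgb = pca_features(rgb)                  # features from color images
feat_gray = pca_features(gray)                # features from grayscale images
```

The same routine serves both cases; only the input dimensionality differs, which is why comparing color representations reduces to comparing which input space preserves the most discriminative variance.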

A New Temporal Filtering Method for Improved Automatic Lipreading (향상된 자동 독순을 위한 새로운 시간영역 필터링 기법)

  • Lee, Jong-Seok;Park, Cheol-Hoon
    • The KIPS Transactions:PartB / v.15B no.2 / pp.123-130 / 2008
  • Automatic lipreading recognizes speech by observing the movement of a speaker's lips. It has recently received attention as a way of compensating for the performance degradation of acoustic speech recognition in acoustically noisy environments. One of the important issues in automatic lipreading is defining and extracting salient features from the recorded images. In this paper, we propose a feature extraction method that uses a new filtering technique to obtain improved recognition performance. The proposed method eliminates frequency components that are too slow or too fast relative to the relevant speech information by applying a band-pass filter to the temporal trajectory of each pixel in the images containing the lip region; features are then extracted by principal component analysis. We show via speaker-independent recognition experiments that the proposed method produces improved performance in both clean and visually noisy conditions.
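
Band-pass filtering each pixel's temporal trajectory can be sketched with an FFT along the time axis. The cut-off frequencies and clip dimensions below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def temporal_bandpass(frames, fps, low_hz, high_hz):
    """Band-pass filter the temporal trajectory of every pixel, keeping
    only frequency components within [low_hz, high_hz]."""
    T = frames.shape[0]
    F = np.fft.rfft(frames, axis=0)           # per-pixel spectra over time
    freqs = np.fft.rfftfreq(T, d=1.0 / fps)   # frequency of each FFT bin
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    F[~mask] = 0                              # zero out-of-band components
    return np.fft.irfft(F, n=T, axis=0)

# Hypothetical data: 64 frames of a 16x16 lip region recorded at 30 fps.
rng = np.random.default_rng(1)
clip = rng.random((64, 16, 16))
filtered = temporal_bandpass(clip, fps=30.0, low_hz=1.0, high_hz=8.0)
```

Removing the slowest components discards static appearance and drift, while removing the fastest components suppresses frame-to-frame noise, leaving the motion band in which lip articulation lives.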

Automatic Lipreading Based on Image Transform and HMM (이미지 변환과 HMM에 기반한 자동 립리딩)

  • 김진범;김진영
    • Proceedings of the IEEK Conference / 1999.11a / pp.585-588 / 1999
  • This paper presents experimental results on visual-only recognition tasks using an image-transform approach and an HMM-based recognition system. There are two approaches to extracting lipreading features: a lip-contour-based approach and an image-transform-based one. The latter obtains a compressed representation of the pixel values of the image containing the speaker's mouth, which results in superior lipreading performance. In addition, PCA (principal component analysis) is used to speed up the algorithm. Finally, the HMM recognition results of the two approaches are compared.


Improved Automatic Lipreading by Multiobjective Optimization of Hidden Markov Models (은닉 마르코프 모델의 다목적함수 최적화를 통한 자동 독순의 성능 향상)

  • Lee, Jong-Seok;Park, Cheol-Hoon
    • The KIPS Transactions:PartB / v.15B no.1 / pp.53-60 / 2008
  • This paper proposes a new multiobjective optimization method for discriminative training of hidden Markov models (HMMs) used as the recognizer for automatic lipreading. While the conventional Baum-Welch algorithm for training HMMs aims at maximizing the probability of a class's data under the corresponding HMM, we define a new training criterion composed of two minimization objectives and develop a global optimization method for this criterion based on simulated annealing. The results of a speaker-dependent recognition experiment show that the proposed method improves performance by a relative error reduction of about 8% compared with the Baum-Welch algorithm.

An Efficient Lipreading Method Based on Lip's Symmetry (입술의 대칭성에 기반한 효율적인 립리딩 방법)

  • Kim, Jin-Bum;Kim, Jin-Young
    • Journal of the Institute of Electronics Engineers of Korea SP / v.37 no.5 / pp.105-114 / 2000
  • In this paper, we present an efficient method for reducing the large amount of pixel data to be processed in image-transform-based automatic lipreading. It has been reported that the image-transform-based approach, which obtains a compressed representation of the speaker's mouth, yields better lipreading performance than the lip-contour-based approach. However, this approach produces so many lip feature parameters that much computation time is required for recognition. To reduce the amount of data to be computed, we propose a simple method that folds the lip image at its vertical center, exploiting the symmetry of the lips. In addition, principal component analysis (PCA) is used to speed up the algorithm, and HMM word recognition results are reported. Compared with the normal method using a 16x16 lip image, the proposed method using the folded lip image reduces the number of feature parameters by 22~47% and improves hidden Markov model (HMM) word recognition rates by 2~3%.
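
The folding idea can be sketched in a few lines: mirror the right half of the lip image onto the left and average, halving the pixel count before PCA. Averaging the two halves is one plausible reading of the folding operation, assumed here for illustration.

```python
import numpy as np

def fold_lip_image(img):
    """Halve the pixel data by folding an image about its vertical
    centerline and averaging the two halves."""
    h, w = img.shape
    assert w % 2 == 0, "sketch assumes an even image width"
    left = img[:, : w // 2]
    right = img[:, w // 2 :][:, ::-1]         # mirror the right half
    return (left + right) / 2.0               # folded image of shape (h, w//2)

# A 16x16 image folds to 16x8, halving the feature count fed to PCA.
img = np.arange(256, dtype=float).reshape(16, 16)
folded = fold_lip_image(img)
```

For a perfectly symmetric lip image the fold is lossless; for real images it discards only the asymmetric residual, which is the trade-off behind the reported parameter reduction.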


Design & Implementation of Speechreading System using the Face Feature on the Korean 8 Vowels (얼굴 특징점을 이용한 한국어 8모음 독화 시스템 구축)

  • Kim, Sun-Ok;Lee, Kyong-Ho
    • Proceedings of the Korean Society of Computer Information Conference / 2009.01a / pp.135-140 / 2009
  • This paper builds a neural-network-based automatic speechreading system that recognizes the eight Korean monophthong vowels. Facial features take various values in various color spaces owing to their luminance and chrominance components. Using this, facial features were extracted by amplifying, reducing, and contrasting those values. The eyes, the nose, the contour of the inner lips, and the contour of the teeth were located, and the values that change distinctively when the eight Korean vowels are uttered were set as parameters. 2400 utterance samples of the eight Korean vowels were collected and analyzed, and based on this analysis a neural network system was built and tested. Five normal speakers participated in the experiment, and the observational error between speakers was corrected through normalization. Data from the five speakers were analyzed, recognition experiments were performed on the same five speakers, and good results were obtained.


A Study on Speechreading about the Korean 8 Vowels (한국어 8모음 자동 독화에 관한 연구)

  • Lee, Kyong-Ho;Yang, Ryong;Kim, Sun-Ok
    • Journal of the Korea Society of Computer and Information / v.14 no.3 / pp.173-182 / 2009
  • In this paper, we study the extraction of parameters and the implementation of a speechreading system to recognize the eight Korean vowels. Facial features are detected by amplifying, reducing, and comparing image values, which vary across different color spaces. The positions of the eyes and nose, the inner boundary of the lips, the outer boundary of the upper lip, and the outer line of the teeth are located as features. From this analysis, the area of the inner lips, the height and width of the inner lips, the ratio of the outer line length of the teeth to the inner mouth area, and the distance between the nose and the outer boundary of the upper lip are used as parameters. 2400 data samples were gathered and analyzed. Based on this analysis, a neural network was constructed and recognition experiments were performed. In the experiments, five normal persons were sampled, and the observational error between samples was corrected using a normalization method. The experiments show very encouraging results regarding the usefulness of the parameters.

A Study on Analysis of Variant Factors of Recognition Performance for Lip-reading at Dynamic Environment (동적 환경에서의 립리딩 인식성능저하 요인분석에 대한 연구)

  • 신도성;김진영;이주헌
    • The Journal of the Acoustical Society of Korea / v.21 no.5 / pp.471-477 / 2002
  • Recently, lip-reading has been studied actively as an auxiliary method for automatic speech recognition (ASR) in noisy environments. However, almost all research results have been obtained on databases constructed under indoor conditions, so it is not known how robust the developed lip-reading algorithms are to dynamic variation of the image. We have developed a lip-reading system based on an image-transform-based algorithm; this system recognizes 22 words and achieves a word recognition rate of up to 53.54%. In this paper, we examine how stable the lip-reading system is under environmental variation and identify the main factors behind the drop in word recognition performance. To study lip-reading robustness, we consider spatial variance (translation, rotation, scaling) and illumination variance. Two kinds of test data are used: a simulated lip-image database and a real dynamic database captured in a car environment. Our experiments show that spatial variance is one of the factors degrading lip-reading performance, but it is not the most important one; illumination variance reduces recognition rates severely, by as much as 70%. In conclusion, lip-reading algorithms robust to illumination variance should be developed if lip reading is to be used as a complementary method for ASR.
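
The spatial-variance tests the abstract describes can be simulated by perturbing clean test images before recognition. A minimal sketch of the translation case, with zero padding at the borders (the paper's exact perturbation protocol is not reproduced here):

```python
import numpy as np

def translate(img, dx, dy):
    """Shift an image by (dx, dy) pixels with zero padding, simulating the
    spatial (translation) variance of a moving or misaligned lip region."""
    out = np.zeros_like(img)
    h, w = img.shape
    # destination window receives the overlapping part of the source window
    out[max(dy, 0): h + min(dy, 0), max(dx, 0): w + min(dx, 0)] = \
        img[max(-dy, 0): h + min(-dy, 0), max(-dx, 0): w + min(-dx, 0)]
    return out

# A single bright pixel at (0, 0) moves to (2, 1) after a (dx=1, dy=2) shift.
probe = np.zeros((4, 4))
probe[0, 0] = 1.0
moved = translate(probe, dx=1, dy=2)
```

Rotation and scaling can be simulated analogously with an affine warp, and illumination variance with per-pixel gain and offset changes; sweeping such perturbations over a clean database isolates each factor's contribution to the recognition-rate drop.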