• Title/Summary/Keyword: Blind segmentation

Search Result 12, Processing Time 0.022 seconds

Performance improvement of text-dependent speaker verification system using blind speech segmentation and energy weight (Blind speech segmentation과 에너지 가중치를 이용한 문장 종속형 화자인식기의 성능 향상)

  • Kim Jung-Gon;Kim Hyung Soon
    • MALSORI
    • /
    • no.47
    • /
    • pp.131-140
    • /
    • 2003
  • We propose a new method of generating client models for HMM based text-dependent speaker verification system with only a small amount of training data. To make a client model, statistical methods such as segmental K-means algorithm are widely used, but they do not guarantee the quality or reliability of a model when only limited data are avaliable. In this paper, we propose a blind speech segmentation based on level building DTW algorithm as an alternative method to make a client model with limited data. In addition, considering the fact that voiced sounds have much more speaker-specific information than unvoiced sounds and energy of the former is higher than that of the latter, we also propose a new score evaluation method using the observation probability raised to the power of weighting factor estimated from the normalized log energy. Our experiment shows that the proposed methods are superior to conventional HMM based speaker verification system.

  • PDF

Utilization of Syllabic Nuclei Location in Korean Speech Segmentation into Phonemic Units (음절핵의 위치정보를 이용한 우리말의 음소경계 추출)

  • 신옥근
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.5
    • /
    • pp.13-19
    • /
    • 2000
  • The blind segmentation method, which segments input speech data into recognition unit without any prior knowledge, plays an important role in continuous speech recognition system and corpus generation. As no prior knowledge is required, this method is rather simple to implement, but in general, it suffers from bad performance when compared to the knowledge-based segmentation method. In this paper, we introduce a method to improve the performance of a blind segmentation of Korean continuous speech by postprocessing the segment boundaries obtained from the blind segmentation. In the preprocessing stage, the candidate boundaries are extracted by a clustering technique based on the GLR(generalized likelihood ratio) distance measure. In the postprocessing stage, the final phoneme boundaries are selected from the candidates by utilizing a simple a priori knowledge on the syllabic structure of Korean, i.e., the maximum number of phonemes between any consecutive nuclei is limited. The experimental result was rather promising : the proposed method yields 25% reduction of insertion error rate compared that of the blind segmentation alone.

  • PDF

Segmentation of continuous Korean Speech Based on Boundaries of Voiced and Unvoiced Sounds (유성음과 무성음의 경계를 이용한 연속 음성의 세그먼테이션)

  • Yu, Gang-Ju;Sin, Uk-Geun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.7
    • /
    • pp.2246-2253
    • /
    • 2000
  • In this paper, we show that one can enhance the performance of blind segmentation of phoneme boundaries by adopting the knowledge of Korean syllabic structure and the regions of voiced/unvoiced sounds. eh proposed method consists of three processes : the process to extract candidate phoneme boundaries, the process to detect boundaries of voiced/unvoiced sounds, and the process to select final phoneme boundaries. The candidate phoneme boudaries are extracted by clustering method based on similarity between two adjacent clusters. The employed similarity measure in this a process is the ratio of the probability density of adjacent clusters. To detect he boundaries of voiced/unvoiced sounds, we first compute the power density spectrum of speech signal in 0∼400 Hz frequency band. Then the points where this paper density spectrum variation is greater than the threshold are chosen as the boundaries of voiced/unvoiced sounds. The final phoneme boundaries consist of all the candidate phoneme boundaries in voiced region and limited number of candidate phoneme boundaries in unvoiced region. The experimental result showed about 40% decrease of insertion rate compared to the blind segmentation method we adopted.

  • PDF

Restoration of Bi-level Images via Iterative Semi-blind Wiener Filtering (반복 semi-blind 위너 필터링을 이용한 이진영상의 복원)

  • Kim, Jeong-Tae
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.57 no.7
    • /
    • pp.1290-1294
    • /
    • 2008
  • We present a novel deblurring algorithm for bi-level images blurred by some parameterizable point spread function. The proposed method iteratively searches unknown parameters in the point spread function and noise-to-signal ratio by minimizing an objective function that is based on the binariness and the difference between two intensity values of restoring image. In simulations and experiments, the proposed method showed improved performance compared with the Wiener filtering based method in terms of bit error rate after segmentation.

A Side-and Rear-View Image Registration System for Eliminating Blind Spots (차량의 사각 지대 제거를 위한 측/후방 카메라 영상 정합 시스템)

  • Park, Min-Woo;Jang, Kyung-Ho;Jung, Soon Ki;Yoon, Pal-Joo
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.8
    • /
    • pp.653-663
    • /
    • 2009
  • In this paper, we propose a blind spots elimination system using three cameras. A wide-angle camera is attached on trunk for eliminating blind spots of a rear-view mirror and two cameras are attached on each side-view mirror for eliminating blind spots of vehicle's sides. In order to eliminate blind spots efficiently, we suggest a method to build a panoramic mosaic view with two side images and one wide-angle rear image. First, we obtain an undistorted image from a wide-angle camera of rear-view and calculate the focus-of-contraction (FOC) in undistorted images of rear-view while the car is moving straight forward. Second, we compute a homography among side-view images and an undistorted image of rear-view in flat road scenes. Next, we perform an image registration process after road and background region segmentation. Finally, we generate various views such as a cylinder panorama view, a top view and an information panoramic mosaic view.

A Blind Segmentation Algorithm for Speaker Verification System (화자확인 시스템을 위한 분절 알고리즘)

  • 김지운;김유진;민홍기;정재호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.3
    • /
    • pp.45-50
    • /
    • 2000
  • This paper proposes a delta energy method based on Parameter Filtering(PF), which is a speech segmentation algorithm for text dependent speaker verification system over telephone line. Our parametric filter bank adopts a variable bandwidth along with a fixed center frequency. Comparing with other methods, the proposed method turns out very robust to channel noise and background noise. Using this method, we segment an utterance into consecutive subword units, and make models using each subword nit. In terms of EER, the speaker verification system based on whole word model represents 6.1%, whereas the speaker verification system based on subword model represents 4.0%, improving about 2% in EER.

  • PDF

Text-dependent Speaker Verification System Over Telephone Lines (전화망을 위한 어구 종속 화자 확인 시스템)

  • 김유진;정재호
    • Proceedings of the IEEK Conference
    • /
    • 1999.11a
    • /
    • pp.663-667
    • /
    • 1999
  • In this paper, we review the conventional speaker verification algorithm and present the text-dependent speaker verification system for application over telephone lines and its result of experiments. We apply blind-segmentation algorithm which segments speech into sub-word unit without linguistic information to the speaker verification system for training speaker model effectively with limited enrollment data. And the World-mode] that is created from PBW DB for score normalization is used. The experiments are presented in implemented system using database, which were constructed to simulate field test, and are shown 3.3% EER.

  • PDF

Segmentation of Continuous Speech based on PCA of Feature Vectors (주요고유성분분석을 이용한 연속음성의 세그멘테이션)

  • 신옥근
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.2
    • /
    • pp.40-45
    • /
    • 2000
  • In speech corpus generation and speech recognition, it is sometimes needed to segment the input speech data without any prior knowledge. A method to accomplish this kind of segmentation, often called as blind segmentation, or acoustic segmentation, is to find boundaries which minimize the Euclidean distances among the feature vectors of each segments. However, the use of this metric alone is prone to errors because of the fluctuations or variations of the feature vectors within a segment. In this paper, we introduce the principal component analysis method to take the trend of feature vectors into consideration, so that the proposed distance measure be the distance between feature vectors and their projected points on the principal components. The proposed distance measure is applied in the LBDP(level building dynamic programming) algorithm for an experimentation of continuous speech segmentation. The result was rather promising, resulting in 3-6% reduction in deletion rate compared to the pure Euclidean measure.

  • PDF

Blind Rhythmic Source Separation (블라인드 방식의 리듬 음원 분리)

  • Kim, Min-Je;Yoo, Ji-Ho;Kang, Kyeong-Ok;Choi, Seung-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.8
    • /
    • pp.697-705
    • /
    • 2009
  • An unsupervised (blind) method is proposed aiming at extracting rhythmic sources from commercial polyphonic music whose number of channels is limited to one. Commercial music signals are not usually provided with more than two channels while they often contain multiple instruments including singing voice. Therefore, instead of using conventional modeling of mixing environments or statistical characteristics, we should introduce other source-specific characteristics for separating or extracting sources in the under determined environments. In this paper, we concentrate on extracting rhythmic sources from the mixture with the other harmonic sources. An extension of nonnegative matrix factorization (NMF), which is called nonnegative matrix partial co-factorization (NMPCF), is used to analyze multiple relationships between spectral and temporal properties in the given input matrices. Moreover, temporal repeatability of the rhythmic sound sources is implicated as a common rhythmic property among segments of an input mixture signal. The proposed method shows acceptable, but not superior separation quality to referred prior knowledge-based drum source separation systems, but it has better applicability due to its blind manner in separation, for example, when there is no prior information or the target rhythmic source is irregular.

An Image Segmentation Method for Richardson-Lucy Deconvolution Algorithm Improvement (영상 분할을 통한 Richardson-Lucy 디컨벌루션 개선 알고리듬)

  • Kim, Jeonghwan;Park, Daejun;Jeon, Jechang
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2015.11a
    • /
    • pp.114-117
    • /
    • 2015
  • 본 논문에서는 Non-blind 디컨벌루션 알고리듬 중 하나인 Richardson-Lucy(RL) 디컨벌루션을 영상 분할을 통해 성능을 향상시킨 알고리듬을 제안한다. RL 디컨벌루션은 영상의 크기가 커질수록 연산 양이 크게 증가한다. 따라서 크기가 큰 영상의 RL 디컨벌루션은 계산에 많은 시간을 필요로 한다. 이를 개선하기 위하여 영상을 적절한 크기로 분할하여 각각 RL 디컨벌루션을 계산한다. 또한 분할 시 생기는 왜곡을 줄이기 위해 리플 제거를 위한 알고리듬을 추가한다. 이를 통해 기존의 알고리듬보다 연산 양을 줄여 빠른 RL 디컨벌루션이 가능하도록 개선한다.

  • PDF