• Title/Summary/Keyword: speech signal


Performance Enhancement of Speech Intelligibility in Communication System Using Combined Beamforming (directional microphone) and Speech Filtering Method (방향성 마이크로폰과 음성 필터링을 이용한 통신 시스템의 음성 인지도 향상)

  • Shin, Min-Cheol;Wang, Se-Myung
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2005.05a / pp.334-337 / 2005
  • Speech intelligibility is one of the most important factors in a communication system, and it depends on the speech-to-noise ratio. Background noise reduction techniques are being developed to raise that ratio. As one such technique, this paper introduces a directional microphone based on beamforming, combined with a speech filtering method. The directional microphone narrows the spatial range of the processed signal to the direction of the target speech. Noise arriving from the same direction as the speech still remains in the processed signal, so a speech filtering step follows to extract only the speech from it. The speech filtering method is based on the characteristics of the speech signal itself. Combining the directional microphone with speech filtering improves speech intelligibility in a communication system.

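The directional-microphone step can be illustrated with a delay-and-sum beamformer. The abstract does not say which beamformer the authors used, so delay-and-sum, the integer sample delays, the three-microphone array, and the helper names `delay_and_sum` and `plane_wave` are all assumptions made for this sketch.

```python
import math


def delay_and_sum(channels, steering_delays):
    """Advance each channel by its steering delay (in samples) and average."""
    usable = min(len(ch) - d for ch, d in zip(channels, steering_delays))
    count = len(channels)
    return [
        sum(ch[i + d] for ch, d in zip(channels, steering_delays)) / count
        for i in range(usable)
    ]


def plane_wave(arrival_delays, length=60, period=6):
    """Hypothetical test source: one tone reaching each microphone with a delay."""
    return [
        [math.sin(2 * math.pi * (i - d) / period) for i in range(length)]
        for d in arrival_delays
    ]


# Steer a three-microphone array toward a source whose wavefront reaches
# the microphones 0, 2 and 4 samples late (assumed geometry).
steer = [0, 2, 4]
on_axis = delay_and_sum(plane_wave([0, 2, 4]), steer)   # target direction
off_axis = delay_and_sum(plane_wave([4, 2, 0]), steer)  # mirrored direction

on_axis_peak = max(abs(v) for v in on_axis)
off_axis_peak = max(abs(v) for v in off_axis)
print(on_axis_peak, off_axis_peak)
```

In this particular geometry the tone from the steered direction adds coherently while the tone from the mirrored direction cancels. Noise that arrives from the steered direction is not reduced at all, which is the gap the abstract's speech filtering step is meant to cover.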

Decomposition of Speech Signal into AM-FM Components Using Variable Bandwidth Filter (가변 대역폭 필터를 이용한 음성신호의 AM-FM 성분 분리에 관한 연구)

  • Song, Min;Lee, He-Young
    • Speech Sciences / v.8 no.4 / pp.45-58 / 2001
  • Modulated components of a speech signal are frequently used for speech coding, speech recognition, and speech synthesis. A time-frequency representation (TFR) reveals some information about the instantaneous frequency, instantaneous bandwidth, and boundary of each component of the speech signal under consideration. In many cases, extracting the AM-FM components that correspond to the instantaneous frequencies is difficult, because the Fourier spectra of components with time-varying instantaneous frequencies overlap one another in the Fourier frequency domain. This paper proposes an efficient method for decomposing a speech signal into AM-FM components. A variable bandwidth filter is developed for decomposing speech signals with time-varying instantaneous frequencies. The filter can extract the AM-FM components of a speech signal provided their TFRs do not overlap in the time-frequency domain. The amplitude and instantaneous frequency of each decomposed component are then estimated using the Hilbert transform.

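The final estimation step, amplitude and instantaneous frequency from the Hilbert transform, can be sketched for a single, already separated component. This is not the paper's variable bandwidth filter; the naive DFT, the function names, and the synthetic test component (16 Hz carrier, 2 Hz amplitude modulation, 128 Hz sampling rate) are assumptions chosen so the example runs with the standard library alone.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short frames)."""
    n = len(values)
    sign = 1 if inverse else -1
    scale = 1 / n if inverse else 1
    return [
        scale * sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                    for t, v in enumerate(values))
        for k in range(n)
    ]


def analytic_signal(x):
    """x + j*Hilbert(x): double positive frequencies, zero negative ones."""
    n = len(x)
    spectrum = dft(x)
    half = (n + 1) // 2
    for k in range(1, n):
        if k < half:
            spectrum[k] *= 2
        elif not (n % 2 == 0 and k == n // 2):  # keep the Nyquist bin as is
            spectrum[k] = 0
    return dft(spectrum, inverse=True)


def envelope_and_frequency(x, fs):
    """Amplitude envelope and instantaneous frequency (Hz) of one component."""
    z = analytic_signal(x)
    envelope = [abs(v) for v in z]
    frequency = [
        cmath.phase(z[i + 1] * z[i].conjugate()) * fs / (2 * math.pi)
        for i in range(len(z) - 1)
    ]
    return envelope, frequency


fs = 128
am = [1 + 0.5 * math.cos(2 * math.pi * 2 * t / fs) for t in range(fs)]
component = [a * math.cos(2 * math.pi * 16 * t / fs) for t, a in enumerate(am)]
envelope, frequency = envelope_and_frequency(component, fs)
```

For real speech the estimate is only meaningful per component, which is why the abstract applies it after the decomposition rather than to the raw signal.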

Speech Enhancement Using Receding Horizon FIR Filtering

  • Kim, Pyung-Soo;Kwon, Wook-Hyu;Kwon, Oh-Kyu
    • Transactions on Control, Automation and Systems Engineering / v.2 no.1 / pp.7-12 / 2000
  • A new speech enhancement algorithm for speech corrupted by slowly varying additive colored noise is proposed, based on a state-space signal model. Because of its FIR structure and the unimportance of long-term past information, the receding horizon (RH) FIR filter, which is known to be a best linear unbiased estimation (BLUE) filter, is used to obtain the noise-suppressed speech signal. As a special case of the colored noise problem, the suggested approach is generalized to perform single blind signal separation of two speech signals. It is shown that the exact speech signal is obtained when the incoming speech signal is noise-free.


A Study on the Pitch Detection of Speech Harmonics by the Peak-Fitting (음성 하모닉스 스펙트럼의 피크-피팅을 이용한 피치검출에 관한 연구)

  • Kim, Jong-Kuk;Jo, Wang-Rae;Bae, Myung-Jin
    • Speech Sciences / v.10 no.2 / pp.85-95 / 2003
  • In speech signal processing, accurate pitch detection is very important for speech recognition, synthesis, and analysis. If the pitch of a speech signal is detected accurately, it can be used in analysis to obtain the vocal tract parameters properly, in synthesis to change or maintain the naturalness and intelligibility of the speech easily, and in recognition to remove speaker-specific characteristics for speaker independence. This paper proposes a new pitch detection algorithm. First, positive center clipping is performed using the slope of the speech signal, in order to emphasize the pitch period with a glottal component from which the vocal tract characteristic has been removed in the time domain. Next, a rough formant envelope is computed by peak-fitting the spectrum of the original speech signal in the frequency domain. A smoothed formant envelope is then obtained from the rough envelope by linear interpolation. The flattened harmonics waveform is obtained as the algebraic difference between the spectrum of the original speech signal and the smoothed formant envelope. An inverse fast Fourier transform (IFFT) of the flattened harmonics yields a residual signal from which the vocal tract component has been removed. The performance was compared with the LPC, cepstrum, and ACF methods. The proposed algorithm improves the accuracy of pitch detection and reduces the gross error rate in voiced speech regions and in transition regions where the phoneme changes.

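For orientation, the ACF baseline that the abstract compares against can be sketched as center clipping followed by an autocorrelation peak search. This is not the proposed peak-fitting algorithm: the symmetric clipper, the 30% clipping ratio, the 60 to 400 Hz search range, and the synthetic three-harmonic frame are all assumptions of this example.

```python
import math


def center_clip(frame, ratio=0.3):
    """Zero samples below the threshold and shift the rest toward zero."""
    threshold = ratio * max(abs(v) for v in frame)
    clipped = []
    for v in frame:
        if v > threshold:
            clipped.append(v - threshold)
        elif v < -threshold:
            clipped.append(v + threshold)
        else:
            clipped.append(0.0)
    return clipped


def acf_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate pitch (Hz) as fs divided by the lag of the largest autocorrelation."""
    clipped = center_clip(frame)
    min_lag = int(fs / fmax)
    max_lag = min(int(fs / fmin), len(clipped) - 1)

    def acf(lag):
        return sum(clipped[i] * clipped[i + lag]
                   for i in range(len(clipped) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=acf)
    return fs / best_lag


# Synthetic voiced frame: 100 Hz fundamental plus two weaker harmonics.
fs = 8000
frame = [
    math.sin(2 * math.pi * 100 * n / fs)
    + 0.5 * math.sin(2 * math.pi * 200 * n / fs)
    + 0.25 * math.sin(2 * math.pi * 300 * n / fs)
    for n in range(400)
]
estimated_pitch = acf_pitch(frame, fs)
print(estimated_pitch)
```

On clean synthetic input the estimate lands near the 100 Hz fundamental. The gross errors the abstract mentions arise on real speech, where formant structure and phoneme transitions distort the autocorrelation peaks.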

Iterative Computation of Periodic and Aperiodic Part from Speech Signal (음성 신호로부터 주기, 비주기 성분의 반복적 계산법에 의한 분리 실험)

  • Jo Cheol-Woo;Lee Tao
    • MALSORI / no.48 / pp.117-126 / 2003
  • The source of a speech signal is actually a combination of periodic and aperiodic components, although it is often modeled as only one of the two. This paper presents an experiment that separates the periodic and aperiodic components of the speech source. The linear predictive residual signal was used as an approximation of the vocal source of the original speech to obtain the estimated aperiodic part. An iterative extrapolation method was used to compute the aperiodic part.

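The linear predictive residual used as the approximate vocal source can be sketched with the autocorrelation method and the Levinson-Durbin recursion. The abstract does not state the analysis settings, so the predictor order, the function names, and the synthetic second-order "vocal tract" driven by an impulse train are assumptions; the iterative extrapolation step is not reproduced here.

```python
def autocorrelation(x, order):
    """Autocorrelation values r[0..order] of a frame."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for prediction-error filter coefficients a[0..order], a[0] = 1."""
    a = [1.0] + [0.0] * order
    error = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / error
        updated = a[:]
        for j in range(1, m):
            updated[j] = a[j] + k * a[m - j]
        updated[m] = k
        a = updated
        error *= 1 - k * k
    return a


def lpc_residual(x, order):
    """Inverse-filter the frame with its own LPC coefficients."""
    a = levinson_durbin(autocorrelation(x, order), order)
    residual = [
        sum(a[j] * x[n - j] for j in range(order + 1) if n - j >= 0)
        for n in range(len(x))
    ]
    return residual, a


# Synthetic "speech": impulse train through a stable two-pole resonator.
excitation = [1.0 if n % 50 == 0 else 0.0 for n in range(400)]
speech = []
for n, e in enumerate(excitation):
    prev1 = speech[n - 1] if n >= 1 else 0.0
    prev2 = speech[n - 2] if n >= 2 else 0.0
    speech.append(1.6 * prev1 - 0.8 * prev2 + e)

residual, coefficients = lpc_residual(speech, order=2)
signal_energy = sum(v * v for v in speech)
residual_energy = sum(v * v for v in residual)
```

The residual carries far less energy than the signal because the resonance has been removed, leaving an approximation of the excitation that a periodic/aperiodic split could then operate on.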

Raw Speech Based Digital Watermarking Using Zerotrees of DWT

  • Schwindt, Sataporn;Amornraksa, Thumrongrat
    • Proceedings of the IEEK Conference / 2002.07a / pp.478-481 / 2002
  • In this paper, zerotrees of the DWT are proposed for use in speech-based digital watermarking of digital images. Because the raw speech and its content are used as the watermark signal, the PCM-coded speech signal is embedded into a sequence of images. The performance of the scheme is evaluated by the PSNR of the watermarked images and by the strength of the attacks that the embedded speech signal can survive. Moreover, because the content of the speech is used to identify the specific information hidden in the embedded signal, the speech signal extracted from the watermarked images is played back to listeners to determine whether its content is intelligible. The experimental results show impressive performance of the scheme implementing the proposed technique, judged by the higher robustness of the embedded signal against various types of attack, including brightness/contrast enhancement, twirling, highpass filtering, and JPEG compression.

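The PSNR figure used to judge the watermarked images follows the standard definition, 10·log10(peak² / MSE). A minimal sketch, assuming 8-bit pixels (peak value 255) passed as flat lists; the function name and the toy pixel values are illustrative only and do not come from the paper.

```python
import math


def psnr(original, watermarked, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(original) != len(watermarked):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Toy example: every pixel changed by one grey level, so MSE = 1.
cover = [10, 20, 30, 40]
marked = [11, 19, 31, 39]
quality_db = psnr(cover, marked)
print(round(quality_db, 2))
```

A higher PSNR means the embedded speech disturbs the cover image less; it says nothing about robustness, which the abstract evaluates separately through attacks and listening tests.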

A New Speech Enhancement Method Using Adaptive Digital Filter (적응디지털필터를 사용한 음질향상 방법)

  • 임용훈;김완구;차일환;윤대희
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.10 / pp.35-41 / 1993
  • In this paper, a new speech enhancement method for speech signals corrupted by environmental noise is proposed. Two signals are obtained, one from a microphone and one from an accelerometer attached to the neck. Because the two signals are generated from the same source signal, they are closely correlated, and environmental noise has no effect on the accelerometer signal. The speech enhancement system identifies the optimum linear system between the two signals on the basis of the dependence between them. The enhanced speech is obtained by filtering the noise-free accelerometer signal. Because the characteristics of the speech signal and the environmental noise change with time, an adaptive filtering system has to be used to characterize the time-varying system. Simulation results show a 7 dB enhancement at a speech signal level of 0 dB relative to white noise.

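The identification of a linear system between the accelerometer and microphone signals can be sketched with a normalized LMS (NLMS) filter. The abstract does not name the adaptation algorithm, so NLMS, the four-tap filter, the step size, the white-noise stand-in for the accelerometer signal, and the three-tap "true" path are all assumptions of this example.

```python
import random


def nlms_enhance(accel, mic, taps=4, mu=0.05, eps=1e-8):
    """Adapt an FIR filter so that the filtered accelerometer tracks the microphone.

    Returns the filter output (the enhanced speech estimate) and final weights.
    """
    weights = [0.0] * taps
    enhanced = []
    for n in range(len(accel)):
        window = [accel[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        output = sum(w * u for w, u in zip(weights, window))
        error = mic[n] - output
        power = eps + sum(u * u for u in window)
        weights = [w + mu * error * u / power for w, u in zip(weights, window)]
        enhanced.append(output)
    return enhanced, weights


random.seed(0)
true_path = [0.8, -0.3, 0.1]  # hypothetical accelerometer-to-microphone path
accel = [random.gauss(0, 1) for _ in range(4000)]  # noise-free sensor signal
clean = [
    sum(h * accel[n - k] for k, h in enumerate(true_path) if n - k >= 0)
    for n in range(len(accel))
]
mic = [c + random.gauss(0, 0.5) for c in clean]  # speech plus ambient noise

enhanced, weights = nlms_enhance(accel, mic)
tail = range(2000, 4000)  # evaluate after convergence
noise_power = sum((mic[n] - clean[n]) ** 2 for n in tail) / len(tail)
residual_power = sum((enhanced[n] - clean[n]) ** 2 for n in tail) / len(tail)
```

Because the ambient noise is uncorrelated with the accelerometer signal, the filter can only learn the speech path, so its output follows the speech component of the microphone signal. The small step size trades convergence speed for a lower steady-state error.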

A Study on Speech Separation using Sinusoidal Model and Psychoacoustics Model (정현파 모델과 사이코어쿠스틱스 모델을 이용한 음성 분리에 관한 연구)

  • Hwang, Sun-Il;Han, Doo-Jin;Kwon, Chul-Hyun;Shin, Dae-Kyu;Park, Sang-Hui
    • Proceedings of the KIEE Conference / 2001.07d / pp.2622-2624 / 2001
  • Speaker separation is employed when speech from two talkers has been summed into one signal and it is desirable to recover one or both of the speech signals from the composite signal. This paper proposes a method that separates the summed speech and verifies the similarity between the original signal and the separated signal by their cross-correlation. The method uses frequency sampling based on a sinusoidal model to separate a composite signal of vocalic speech and vocalic speech, and noise masking based on a psychoacoustics model to separate a composite signal of vocalic speech and nonvocalic speech.

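The similarity check between an original and a separated signal can be sketched as a normalized cross-correlation. The abstract does not specify the exact measure, so the zero-lag, mean-removed form below, the function name, and the sinusoidal test signals are assumptions.

```python
import math


def normalized_cross_correlation(x, y):
    """Zero-lag correlation coefficient in [-1, 1] between two signals."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))
    if den == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return num / den


original = [math.sin(2 * math.pi * n / 50) for n in range(200)]
recovered = [0.5 * v + 3.0 for v in original]   # scaled and offset copy
unrelated = [math.cos(2 * math.pi * n / 50) for n in range(200)]

match_score = normalized_cross_correlation(original, recovered)
mismatch_score = normalized_cross_correlation(original, unrelated)
```

A value near 1 indicates that the separated signal matches the original up to gain and offset, while a value near 0 indicates no linear relationship at zero lag. A lag search would be needed if the separation introduced a delay.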

Detection of Glottal Closure Instant for Voiced Speech Using Wavelet Transform (웨이브렛 변환을 이용한 음성신호의 성문폐쇄시점 검출)

  • Bae, Keun-Sung
    • Speech Sciences / v.7 no.3 / pp.153-165 / 2000
  • During the phonation of voiced sounds, there are instants at which the glottis opens or closes because of the periodic vibration of the vocal cords. The instant at which it closes is called the glottal closure instant (GCI) or epoch. Correct detection of the GCI is one of the important problems in speech processing, for pitch detection, pitch-synchronous analysis, and so on. Recently, it has been shown that the local maxima of the wavelet-transformed speech signal correspond to the GCIs of the speech signal. In this paper, we investigate the accuracy of the GCIs estimated from this wavelet-transformed speech signal. For this purpose, we compare them with the negative peaks of the differentiated EGG signal, which represent the actual GCIs of the speech signal.
