• Title/Summary/Keyword: Perceptual signal analysis

Perturbation and Perceptual Analysis of Pathological Sustained Vowels according to Signal Typing

  • Lee, Ji-Yeoun;Choi, Seong-Hee;Jiang, Jack J.;Hahn, Min-Soo;Choi, Hong-Shik
    • Phonetics and Speech Sciences / v.2 no.2 / pp.109-115 / 2010
  • In this paper, we investigate signal typing based on the visual impression of distinctive spectrogram patterns. Pathological voices are classified as signal type 1, 2, 3, or 4, after which perturbation parameters are estimated and perceptual ratings are assigned using the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V). The results suggest that perturbation analysis can be applied only to type 1 and 2 signals and that the perceptual ratings of overall grade generally increase with signal type. Good inter-rater reliability was shown among the three raters. We recommend that pathological voices be labeled with both a signal type and CAPE-V ratings so that their characteristics are described clearly.

Signal Quality Enhancement using Perceptual Convolutional Noise Suppression (지각형 컨벌루션 잡음 제어를 통한 음질 개선 방법)

  • 김헌중;한헌수;홍민철;차형태
    • Journal of Broadcast Engineering / v.8 no.1 / pp.11-18 / 2003
  • In this paper, we introduce a novel signal quality enhancement algorithm based on perceptual interference analysis and perceptual convolutional noise suppression. Perceptual convolutional noise is reflected in the audible disturbance that remains perceptible after additive noise suppression and in the tonality change caused by noise energy excitation. The enhancement system consists of a perceptual additive noise suppression stage and a perceptual convolutional noise suppression stage. Experimental results show that the two stages provide equivalent quality enhancement performance.

The Utility of Perturbation, Non-linear dynamic, and Cepstrum measures of dysphonia according to Signal Typing (음성 신호 분류에 따른 장애 음성의 변동률 분석, 비선형 동적 분석, 캡스트럼 분석의 유용성)

  • Choi, Seong Hee;Choi, Chul-Hee
    • Phonetics and Speech Sciences / v.6 no.3 / pp.63-72 / 2014
  • The current study assessed the utility of the acoustic analyses most commonly used in routine clinical voice assessment, including perturbation, nonlinear dynamic, and spectral/cepstral analysis, based on signal typing of dysphonic voices, and investigated the applicability of these clinical acoustic analysis methods. A total of 70 dysphonic voice samples were classified by signal type using narrowband spectrograms. The traditional parameters %jitter, %shimmer, and signal-to-noise ratio (SNR) were calculated using TF32. The correlation dimension (D2), a nonlinear dynamic parameter, was also calculated, as were spectral/cepstral measures including mean CPP, CPP_sd, CPPf0, CPPf0_sd, L/H ratio, and L/H ratio_sd, obtained with ADSV (Analysis of Dysphonia in Speech and Voice™). Auditory-perceptual analysis was performed by two blinded speech-language pathologists using GRBAS. The results showed that all nearly periodic Type 1 signals came from functional dysphonia, whereas Type 4 signals comprised neurogenic and organic voice disorders. Only Type 1 voice signals were reliable for perturbation analysis in this study. Significant differences related to signal type were found in all acoustic and auditory-perceptual measures. SNR, CPP, and L/H ratio values for Type 4 were significantly lower than those of the other voice signals, and significantly higher %jitter and %shimmer were observed in Type 4 voice signals (p<.001). Additionally, D2 values increased significantly with signal type, reflecting more complex and nonlinear patterns. However, D2 could not be obtained for voice signals with a high noise component associated with breathiness. In particular, CPP was more sensitive to the voice qualities 'G', 'R', and 'B' than any other acoustic measure. Thus, spectral and cepstral analyses may be applied to more severely dysphonic voices such as Type 4 signals, and CPP may be a more accurate and predictive acoustic marker for measuring voice quality and severity in dysphonia.
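
The %jitter and %shimmer parameters mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common "local" definition (mean absolute difference between consecutive cycles, as a percentage of the mean); the abstract does not state which definition TF32 uses, and the function name and sample values below are hypothetical.

```python
from statistics import mean


def percent_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Applied to glottal cycle periods this gives a local %jitter;
    applied to cycle peak amplitudes it gives a local %shimmer.
    (Assumed definition, not necessarily the one used by TF32.)
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical cycle measurements from a sustained vowel.
periods_ms = [5.0, 5.1, 4.9, 5.0, 5.2]
amplitudes = [1.00, 0.95, 1.05, 1.00, 0.98]

jitter_pct = percent_perturbation(periods_ms)
shimmer_pct = percent_perturbation(amplitudes)
print(f"%jitter = {jitter_pct:.2f}, %shimmer = {shimmer_pct:.2f}")
```

Both measures rely on identifying individual cycles, which is consistent with the abstract's finding that perturbation analysis was reliable only for nearly periodic Type 1 signals.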

Performance Improvement of Speech Enhancement Using Independent Component Analysis and Perceptual Filtering (독립 성분 분석과 지각 필터를 이용한 음질 개선)

  • Koo, Kyo-Sik;Cha, Hyung-Tai
    • The Journal of the Acoustical Society of Korea / v.29 no.4 / pp.270-277 / 2010
  • In this paper, we propose an algorithm that improves the sound quality of noisy audio signals using independent component analysis (ICA) and perceptual filters. Many algorithms have been proposed to remove noise from audio signals, such as spectral subtraction and perceptual filtering. The perceptual filter uses a noise estimate acquired from silent segments of the input signal. In this case, the improvement in sound quality decreases if the noise energy changes within a signal frame because of environmental variation. The proposed method instead uses ICA to estimate the noise as it changes from frame to frame, and the estimated noise is applied to the perceptual filter. To demonstrate the performance of the proposed algorithm, several tests were performed on various input signals. The proposed algorithm improved sound quality in terms of segmental SNR (SSNR), noise-to-mask ratio (NMR), and the Degradation Category Rating (DCR) test.
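
Of the evaluation measures named in this abstract, segmental SNR is the simplest to illustrate. The sketch below assumes non-overlapping frames and the commonly used clamping of per-frame values to a fixed dB range; the frame length, clamp limits, and function name are assumptions, not values taken from the paper.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Average per-frame SNR in dB between a clean and an enhanced signal.

    Frames are non-overlapping; each frame's SNR is clamped to [lo, hi] dB
    so that near-silent or near-perfect frames do not dominate the mean.
    """
    n = min(len(clean), len(enhanced))
    frame_snrs = []
    for start in range(0, n - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        error_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        if signal_energy == 0.0:
            continue  # skip frames with no signal energy
        if error_energy == 0.0:
            snr_db = hi  # perfect reconstruction: use the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        frame_snrs.append(max(lo, min(hi, snr_db)))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)
```

Unlike a global SNR, this average weights quiet and loud frames equally, which is why it is often reported alongside perceptual measures such as NMR.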

A Novel Multi-Channel Hearing Aid Algorithm with SMR(signal-to-masking ratio) Improvement (신호 대 마스킹 비 개선을 통한 다채널 보청 알고리즘)

  • 김헌중;홍민철;차형태
    • The Journal of the Acoustical Society of Korea / v.19 no.8 / pp.12-21 / 2000
  • In this paper, we propose a novel hearing aid algorithm that restores sensorineural hearing loss using multi-channel (multi-band) dynamic range compression and psychoacoustics, so that a normal perception condition can be presented to the impaired listener. The proposed algorithm applies a loudness scaling function to achieve a proper loudness level, analyzes the masking properties of the signal as it will be perceived by the impaired listener, and then restores normal spectral contrast using the signal-to-masking ratio (SMR), defined as the distance between the level at each frequency and the masking threshold.

Robust Speech Enhancement Based on Soft Decision Employing Spectral Deviation (스펙트럼 변이를 이용한 Soft Decision 기반의 음성향상 기법)

  • Choi, Jae-Hun;Chang, Joon-Hyuk;Kim, Nam-Soo
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.222-228 / 2010
  • In this paper, we propose a new approach to noise estimation that incorporates spectral deviation into a soft decision scheme to enhance the intelligibility of degraded speech in non-stationary noise environments. The conventional soft-decision noise estimation technique estimates and updates the noise power spectrum using a fixed smoothing parameter chosen under the assumption of stationary noise, so it is difficult to obtain robust estimates of the noise power spectrum in non-stationary environments, such as a restaurant, where the spectral characteristics of the noise change constantly. We first classify the environment as stationary or non-stationary noise based on an analysis of the spectral deviation of the noise signal, and then adaptively estimate and update the noise power spectrum according to the classified noise type. The performance of the proposed algorithm is evaluated with the ITU-T P.862 perceptual evaluation of speech quality (PESQ) under various ambient noise environments, and it outperforms the conventional method.
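
The idea of switching the smoothing parameter according to a stationarity decision can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it omits the soft-decision (speech absence probability) weighting entirely, and the deviation measure, threshold, and smoothing values are all hypothetical.

```python
def update_noise_psd(noise_psd, frame_psd, alpha):
    """Recursive (first-order) smoothing of the noise power spectrum per bin."""
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(noise_psd, frame_psd)]


def spectral_deviation(noise_psd, frame_psd, eps=1e-12):
    """Hypothetical deviation measure: mean relative change across bins."""
    total = sum(abs(p - n) / (n + eps) for n, p in zip(noise_psd, frame_psd))
    return total / len(noise_psd)


def choose_alpha(deviation, threshold=0.5,
                 alpha_stationary=0.98, alpha_nonstationary=0.8):
    """Pick a smaller smoothing parameter (faster tracking) when the noise
    looks non-stationary. All numeric values are assumed, not from the paper."""
    return alpha_nonstationary if deviation > threshold else alpha_stationary


def track_noise(frames_psd):
    """Track the noise power spectrum over noise-only frames."""
    noise = list(frames_psd[0])
    for psd in frames_psd[1:]:
        alpha = choose_alpha(spectral_deviation(noise, psd))
        noise = update_noise_psd(noise, psd, alpha)
    return noise
```

With a fixed `alpha` close to 1, the estimate lags behind rapidly changing noise; lowering `alpha` only when the deviation is large keeps the estimate smooth in stationary conditions while still following abrupt changes.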

Time-Scale Modification of Polyphonic Audio Signals Using Sinusoidal Modeling (정현파 모델링을 이용한 폴리포닉 오디오 신호의 시간축 변화)

  • 장호근;박주성
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.77-85 / 2001
  • This paper proposes a method for time-scale modification of polyphonic audio signals based on a sinusoidal model. The signals are modeled as a sinusoidal component plus a noise component. A multiresolution filter bank is designed that splits the input signal into six octave-spaced subbands without aliasing, and sinusoidal modeling is applied to each subband signal. To alleviate smearing of transients during time-scale modification, a dynamic segmentation method is applied to the subbands; it adaptively determines the analysis-synthesis frame size to fit the time-frequency characteristics of each subband signal. To extract the sinusoidal components and calculate their parameters, the matching pursuit algorithm is applied to each analysis frame of the subband signal. In keeping with the spectrum analysis, a psychoacoustic model implementing the effect of frequency masking is incorporated into matching pursuit to provide a reasonable stopping condition for the iteration and to reduce the number of sinusoids. The noise component, obtained by subtracting the signal synthesized from the sinusoidal components from the original signal, is modeled by a line-segment model of the short-time spectral envelope. Simulation results for various polyphonic audio signals show that the proposed sinusoidal modeling can synthesize the original signal without loss of perceptual quality and, because it represents transients without perceptual loss, can perform more robust, higher-quality time-scale modification for large scale factors.

Prediction of Efficient Adaptive Perceptual Filter Iterate Coefficient through Analysis of Noisy Signal (잡음에 열화된 오디오 신호의 분석을 통한 효율적인 적응지각필터 반복 수행 계수의 예측)

  • Ryu, Il-Hyun;Cha, Hyung-Tai;Koo, Kyo-Sik;Seo, Bo-Kook
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2005.11a / pp.238-241 / 2005
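  • Digital media technology is advancing in many areas, including coding. In audio signal processing in particular, the generation, compression, and restoration stages of digital audio signals are being developed in various forms. Psychoacoustic techniques that model the human auditory system are used effectively in audio signal processing, not only for compression but also for enhancing noisy signals. An adaptive perceptual filter built on such a psychoacoustic model uses a perceptual filter to adaptively enhance a signal degraded by noise. In this setting, effectively determining the iteration coefficient of the adaptive perceptual filter reduces the perceptual loss of the audio signal while achieving accurate noise removal. SNR and NMR comparisons were carried out to verify the performance.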

On the Perceptually Important Phase Information in Acoustic Signal (인지에 중요한 음향신호의 위상에 대해)

    • The Journal of the Acoustical Society of Korea / v.19 no.7 / pp.28-33 / 2000
  • For efficient quantization of speech representations, it is common to incorporate the perceptual characteristics of human hearing. However, the focus has been confined to the magnitude information of speech, and little attention has been paid to phase information. This paper presents a novel approach, termed perceptually irrelevant phase elimination (PIPE), for identifying the phase information of acoustic signals that is irrelevant to perception. The proposed method, which is based on the observation that the relative phase relationship within a critical band is perceptually important, is derived not only for stationary Fourier signals but also for harmonic signals. The proposed method is incorporated into an analysis/synthesis system based on a harmonic representation of speech, and subjective test results demonstrate its effectiveness.

Speech Quality Measure for VoIP Using Wavelet Based Bark Coherence Function (웨이블렛 기반 바크 코히어런스 함수를 이용한 VoIP 음질평가)

  • 박상욱;박영철;윤대희
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.4A / pp.310-315 / 2002
  • The Bark Coherence Function (BCF) defines a coherence function within the perceptual domain as a new cognition module that is robust to linear distortions caused by the analog interface of digital mobile systems. Our previous experiments have shown the superiority of BCF over current measures. In this paper, a new BCF suitable for VoIP is developed. The improved BCF is based on the wavelet series expansion, which provides good frequency resolution while keeping good time locality. The proposed Wavelet-based Bark Coherence Function (WBCF) is robust to the variable delay often observed in packet-based telephony such as Voice over Internet Protocol (VoIP). We also show that refining the time synchronization after signal decomposition can improve the performance of the WBCF. Regression analysis was performed with VoIP speech data. The correlation coefficients and the standard error of estimates computed using the WBCF showed a noticeable improvement over the Perceptual Speech Quality Measure (PSQM) recommended by ITU-T.