The Journal of the Acoustical Society of Korea
The Acoustical Society of Korea
Speech Recognition by Neural Net Pattern Recognition Equations with Self-organization
김성일 ; 정현열 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 49~49
This study applies modified neural net pattern recognition equations to speech recognition. The proposed method has a dynamic process of self-organization that has proved successful in recognizing depth perception in stereoscopic vision. This study shows that the process is also useful in recognizing human speech. In the processing, input vocal signals are first compared with standard models to measure similarities, which are then passed to a process of self-organization in the neural net equations. Competitive and cooperative processes are conducted among neighboring input similarities, so that only one winner neuron is finally detected. In a comparative study, the proposed neural networks outperformed the conventional HMM speech recognizer under the same conditions.
Standardized Noise Annoyance Modifiers in Korean According to the ICBEN Method
전진용 ; 김경호 ; T. Yano ;
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 56~56
Recently, a number of social surveys on community response to environmental noise have been conducted to summarize the response relationships obtained from different areas. Some problems have been pointed out in comparing the results of surveys that use verbal scales with different numbers of categories. ICBEN (International Commission on Biological Effects of Noise) Team 6 planned an international joint study and constructed comparable standardized noise annoyance scales using the same method. In Korea, the survey was conducted in four areas: Seoul, Taejon, Taegu, and Kwangju. About 100 subjects participated in each area. Finally, five verbal annoyance modifiers were constructed as follows: 1 (Jeonhyu), 2 (Jokm), 3 (Bikyojerk), 4 (Ajoo), 5 (Umcheongnage).
Spectral Subtraction Using Spectral Harmonics for Robust Speech Recognition in Car Environments
백정훈 ; 고한석 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 62~62
This paper presents a novel noise-compensation scheme to solve the mismatch problem between training and testing conditions in automatic speech recognition (ASR) systems, specifically in car environments. Conventional spectral subtraction schemes rely on the signal-to-noise ratio (SNR): attenuation is imposed on the parts of the spectrum that appear to have low SNR, and accentuation is applied to the parts with high SNR. However, these schemes are based on the postulate that the power spectrum of noise is in general lower in magnitude than that of speech. While this postulate is adequate for high-SNR environments, it is grossly inadequate for low-SNR scenarios such as the car environment. This paper proposes an efficient spectral subtraction scheme aimed specifically at low-SNR noisy environments, which extracts the harmonics distinctively present in the speech spectrum. Representative experiments confirm the superior performance of the proposed method over conventional methods. The experiments are conducted using car-noise-corrupted utterances of the Aurora2 corpus.
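For orientation, the conventional baseline mentioned in the abstract can be sketched as per-bin power spectral subtraction with a spectral floor. This is a minimal illustration of the textbook technique, not the harmonics-based method the paper proposes; the function name, the over-subtraction factor `alpha`, and the floor fraction `beta` are assumptions introduced here.

```python
def spectral_subtract(noisy_power, noise_power, alpha=1.0, beta=0.01):
    """Conventional power spectral subtraction for a single frame.

    noisy_power: per-bin power spectrum of the noisy speech frame
    noise_power: per-bin estimate of the noise power
    alpha: over-subtraction factor (assumed, illustrative parameter)
    beta: spectral floor as a fraction of the noisy power (assumed)
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_power, noise_power):
        # Subtract the scaled noise estimate, but never fall below the floor,
        # which keeps the result positive when the noise estimate exceeds y.
        cleaned.append(max(y - alpha * n, beta * y))
    return cleaned


# Toy frame: the first two bins are speech-dominated, the last two are not.
frame = [10.0, 4.0, 1.0, 0.5]
noise = [1.0, 1.0, 1.0, 1.0]
enhanced = spectral_subtract(frame, noise)
```

In the two low-SNR bins the output collapses to the floor, which illustrates why a scheme of this kind loses speech information when noise power is comparable to or larger than speech power.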
Correlations Among Speed of Sound, Broadband Ultrasonic Attenuation, Broadband Ultrasonic Reflection, and Bone Density in Bovine Cancellous Bone
이강일 ; 최복경 ; 윤석왕 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 69~69
Correlations between acoustic properties and bone density have been investigated in bovine cancellous bone. Speed of sound (SOS), broadband ultrasonic attenuation (BUA), and broadband ultrasonic reflection (BUR) were measured in 10 defatted bovine cancellous bone specimens in vitro. SOS showed a significant correlation with the apparent density of the bone. A comparable correlation was observed between BUA and the apparent density. BUR was rather highly correlated with the apparent density. It was shown that BUR had a weak correlation with BUA and a significant correlation with SOS. This indicates that the parameter BUR can provide important information that may not be contained in BUA and SOS and, therefore, can be useful as an alternative diagnostic parameter of osteoporosis. As expected, a linear combination of all three ultrasonic parameters in a multiple regression model resulted in a significant improvement in predicting the apparent bone density.
Measurement of Horizontal Coherence Using a Line Array In Shallow Water
박정수 ; 김성일 ; 나영남 ; 김영규 ; 오택환 ; 나정열 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 78~78
We analyzed the measured acoustic field to explore the characteristics of horizontal coherence in shallow water. Signal spatial coherence data were obtained on the continental shelf off the east coast of Korea using a horizontal line array. The array was deployed on the bottom at a water depth of 130 m, and a sound source was towed at 26 m depth at source-receiver ranges of 1-13 km. The source transmitted a 200 Hz pure tone. Topography and temperature profiles along the source track were measured to investigate the relationship between the horizontal coherence and environmental variations. The beam bearing disturbance and array signal gain degradation are examined as parameters of horizontal coherence. The results show that the bearing disturbance is about ±8° and seems to be affected by temporal variations of temperature caused by internal waves. The array signal gains show a degradation of more than 5 dB caused by the temporal and spatial variations of temperature and by the down-sloped topography.
Speech Enhancement Using Level Adapted Wavelet Packet with Adaptive Noise Estimation
장성욱 ; 권영헌 ; 정성일 ; 양성일 ; 이건상 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 87~87
In this paper, a new speech enhancement method using a level adapted wavelet packet is presented. First, we propose a level adapted wavelet packet to alleviate a drawback of the conventional node adapted one in noisy environments. Next, we suggest an adaptive noise estimation method at each node of the level adapted wavelet packet tree. Then, for more accurate subtraction of the noise component, we propose a new method for estimating the spectral subtraction weight. Finally, we present a modified spectral subtraction method. The proposed method is evaluated under various noise conditions: speech babble noise, F-16 cockpit noise, factory noise, pink noise, and Volvo car interior noise. For an objective evaluation, an SNR test was performed. In addition, a spectrogram test and, as a subjective evaluation, a very simple listening test were performed.
Rating Floor Impact Noise in Apartment Buildings Through Subjective Evaluation Tests
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 88~95
Auditory experiments based on subjective responses were undertaken for standard heavy-weight and light-weight impact noise, rubber ball impact noise, and jumping noise, to investigate the relation between floor impact noise levels and subjective responses and to establish the upper and lower limits of floor impact noise. The results showed that the relation between floor impact noise levels and subjective responses was linear, that the lower limit of heavy-weight impact noise was $L_{i,Fmax,AW}$ = 46 dB, and that the lower limit of light-weight impact noise was $L'_{n,AW}$ = 56 dB. Finally, three subjective classes of floor impact noise were established.
Sinusoidal Modeling of Audio Signals Using Perceptually Weighted Matching Pursuit
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 96~103
This paper describes a method for sinusoidal modeling of audio signals using perceptually weighted matching pursuit. Matching pursuit iteratively extracts the highest-energy components from the input signal until the residual between the original and the reconstructed signal is zero. In this paper, perceptual matching pursuit, which applies a psychoacoustic model to matching pursuit, iteratively extracts the components with the greatest perceived energy. To evaluate its performance, perceptual matching pursuit is compared with sinusoidal matching pursuit, which does not include perceptual weighting. Simulation results for various audio signals show that perceptual matching pursuit is superior to sinusoidal matching pursuit, especially for signals with a high rate of change in the time domain, for which it can synthesize the original signal.
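The iterative extraction described in this abstract can be sketched with a plain matching pursuit over a small dictionary of unit-norm sinusoids. This is an illustrative sketch only: the dictionary, the stopping tolerance, and the optional `weights` argument (a hypothetical stand-in for per-atom perceptual weighting, not the paper's psychoacoustic model) are all assumptions.

```python
import math


def sinusoid_atoms(n, freqs):
    """Build unit-norm cosine and sine atoms of length n for each frequency
    (in cycles per sample). Atom order: cos(f0), sin(f0), cos(f1), ..."""
    atoms = []
    for f in freqs:
        for fn in (math.cos, math.sin):
            atom = [fn(2 * math.pi * f * t) for t in range(n)]
            norm = math.sqrt(sum(a * a for a in atom))
            atoms.append([a / norm for a in atom])
    return atoms


def matching_pursuit(signal, atoms, weights=None, max_iter=50, tol=1e-9):
    """Greedy matching pursuit.

    weights: optional per-atom selection weights (hypothetical placeholder
    for perceptual weighting); defaults to 1.0, i.e. plain energy selection.
    Returns the list of (atom_index, coefficient) picks and the residual.
    """
    residual = list(signal)
    weights = weights or [1.0] * len(atoms)
    picks = []
    for _ in range(max_iter):
        if sum(r * r for r in residual) <= tol:
            break  # residual energy is negligible
        # Project the residual onto every atom.
        corrs = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        # Choose the atom that captures the most (weighted) energy.
        best = max(range(len(atoms)), key=lambda k: weights[k] * corrs[k] ** 2)
        coeff = corrs[best]
        if coeff == 0.0:
            break  # nothing left that the dictionary can explain
        residual = [r - coeff * a for r, a in zip(residual, atoms[best])]
        picks.append((best, coeff))
    return picks, residual


n = 64
freqs = [k / n for k in range(1, 9)]  # 1 to 8 cycles per frame
atoms = sinusoid_atoms(n, freqs)
signal = [3.0 * math.cos(2 * math.pi * 4 / n * t)
          + 1.0 * math.sin(2 * math.pi * 7 / n * t) for t in range(n)]
picks, residual = matching_pursuit(signal, atoms)
```

With the default weights the stronger cosine component is extracted first and the weaker sine second; passing non-uniform `weights` changes the selection order without changing the projection step.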
An Analysis of the Acoustical Source Characteristics in the Time-varying Fluid Machines
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 104~112
The in-duct acoustical sources of fluid machines are often characterized by the source impedance and strength using the linear time-invariant model. However, negative resistances, which are physically unreasonable, have been found throughout various measurements of the source properties in IC engines and compressors. In this paper, the effects of the time-varying nature of fluid machines on the source characteristics are studied analytically. For this purpose, a simple fluid machine consisting of a reciprocating piston and an exhaust is considered as representing a typical periodic, time-varying system, and the equivalent circuits are analyzed. Simulated measurements using the analytic solutions show that the time-varying nature of the actual sources is one of the main causes of the negative source resistances. It is also found that, for a small magnitude of the time-varying component, the source radiates large acoustic power if the piston operates at twice the natural frequency of the static system, or at integral submultiples of that rate.
Accomplishments of Rayleigh's Experimental Research: Improvement of Instruments and Enhancement of Precision
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 113~120
Rayleigh was an excellent experimenter as well as a theorist. Rayleigh improved Rijke's heat-driven sounding device and the singing flame into sources of pure tones. Above all, his making of the artificial bird whistle was a critical achievement in the improvement of experimental sound sources. This source made supersonic waves available in the laboratory and thus paved the way to confirmable observations of reflection, refraction, diffraction, and interference of sound in the laboratory. Furthermore, Rayleigh augmented the sensitivity of sensitive flames as detectors for sound waves. In addition, he devised a phonic wheel that could precisely control the angular velocity of some acoustical instruments and made the Rayleigh disk, which enabled experimenters to measure the absolute value of the sound intensity. These devices enhanced the exactness of acoustical experiments.
A Study on the Robust Double Talk Detector for Acoustic Echo Cancellation System
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 121~128
Acoustic Echo Cancellation (AEC) is a very active research topic with many applications, such as teleconferencing and hands-free communication, and it employs a Double Talk Detector (DTD) to indicate whether the near-end speaker is active or not. However, the DTD is very sensitive to variations in the acoustical environment, and it sometimes provides wrong information about the near-end speaker. In this paper, we focus on the development of a robust DTD algorithm, which is a basic building block for a reliable AEC system. The proposed AEC system consists of a delayless subband AEC and a narrowband DTD. The delayless subband AEC has proven to have excellent echo cancellation performance with low complexity and high convergence speed. In addition, it solves the signal delay problem of the existing subband AEC. The proposed narrowband DTD, on the other hand, operates on a low-frequency subband. It takes advantage of the narrow subband in two ways: low computational complexity due to down-sampling, and a reliable DTD decision-making procedure owing to the low-frequency nature of the subband signal. From simulation results for the proposed narrowband DTD and a wideband DTD, we confirm that the proposed DTD outperforms the wideband DTD in the sense of removing possible false decisions about near-end speaker activity.
An Efficient Algebraic Codebook Search Method for AMR Speech Coder
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 129~134
In this paper, we efficiently implement the AMR speech coder by reducing the complexity of the algebraic codebook search. To reduce the computational complexity of the algebraic codebook search, we propose a fast search method that improves on the conventional depth-first tree search method used in the AMR speech coder algorithm. The proposed method reduces the search complexity by pruning the trees that are less likely to be selected as the optimum excitation. This method needs no additional computation for selecting the trees to be pruned and reduces the computational complexity considerably compared to the original depth-first tree search method, with only slight degradation of speech quality. Applying our method to an implementation of the AMR speech coder in 12.2 kbps mode on the TeakLite DSP, we reduce the search complexity by about 40% compared to the conventional method.
Context-adaptive Phoneme Segmentation for a TTS Database
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 135~144
A method for the automatic segmentation of speech signals is described. The method is dedicated to the construction of a large database for a Text-To-Speech (TTS) synthesis system. The main issue of the work is the refinement of an initial estimate of phone boundaries, which is provided by an alignment based on a Hidden Markov Model (HMM). A multi-layer perceptron (MLP) was used as a phone boundary detector. To increase segmentation performance, a technique that trains an individual MLP for each type of phonetic transition is proposed. The optimum partitioning of the entire phonetic transition space is constructed from the standpoint of minimizing the overall deviation from hand-labelled positions. With single-speaker stimuli, the experimental results showed that more than 95% of all phone boundaries have a boundary deviation from the reference position smaller than 20 ms, and that the refinement of the boundaries reduces the root mean square error by about 25%.
Performance Improvement in Speech Recognition by Weighting HMM Likelihood
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 145~152
In this paper, assuming that the score of a speech utterance is the product of the HMM log likelihood and an HMM weight, we propose a new method in which HMM weights are adapted iteratively, as in general MCE training. The proposed method adjusts HMM weights for better performance using a delta coefficient defined in terms of a misclassification measure. The parameter estimation and Viterbi algorithms of the conventional HMM can therefore be easily applied to the proposed model by constraining the sum of HMM weights to the number of HMMs in an HMM set. Compared with the general segmental MCE training approach, computing time decreases because the number of parameters to estimate is reduced and gradient calculation through the optimal state sequence is avoided. To evaluate the performance of the HMM-based speech recognizer with weighted HMM likelihood, we perform Korean isolated digit recognition experiments. The experimental results show better performance than the MCE algorithm with state weighting.
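The scoring rule and the weight constraint stated in this abstract can be sketched as follows. This shows only the score computation (weight times log likelihood) and the normalization of weights to sum to the number of HMMs; the iterative MCE-style weight update is not reproduced, and the function names and the numeric values are illustrative assumptions.

```python
def normalize_weights(weights):
    """Rescale weights so that they sum to the number of HMMs."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    n = len(weights)
    return [w * n / total for w in weights]


def recognize(log_likelihoods, weights):
    """Score each HMM as weight * log likelihood and return the index of
    the best-scoring model together with all scores."""
    if len(log_likelihoods) != len(weights):
        raise ValueError("one weight per HMM is required")
    scores = [w * ll for w, ll in zip(weights, log_likelihoods)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Made-up log likelihoods of one utterance under three word models.
log_liks = [-120.0, -100.0, -110.0]

# Uniform weights reproduce the conventional maximum-likelihood decision.
uniform_best, _ = recognize(log_liks, normalize_weights([1.0, 1.0, 1.0]))

# Non-uniform weights (hypothetical values) can change the decision.
adapted = normalize_weights([0.8, 1.3, 0.9])
adapted_best, adapted_scores = recognize(log_liks, adapted)
```

Because log likelihoods are negative, a weight below 1 raises a model's score and a weight above 1 lowers it, which is why the adapted weights move the decision from the second model to the first in this toy case.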
Wavelet-based Time Delay Estimation in Tomographic Signals
The Journal of the Acoustical Society of Korea, volume 22, issue 2, 2003, Pages 153~161
In this paper, we propose a wavelet-based detection method to efficiently identify the time delay of the multipath channel of ocean acoustic signals, which arises from the complex ocean medium and boundary layers. The proposed method employs the wavelet packet transform to analyze the received broadband acoustic signals and applies the matched filter to determine the time region of interest. Numerical test results on both simulated and real data reveal the efficiency of this method in time-delay estimation and, moreover, its capability to estimate the time delay of individual paths in a multipath channel in which the arrival patterns are too close to be separated by the matched filter method.
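The matched filter step mentioned in this abstract can be sketched as a sliding cross-correlation between the received signal and the known transmitted waveform, with the delay taken at the correlation peak. This is a minimal single-path illustration of the standard matched filter, not the wavelet packet method of the paper; the function name and the toy signals are assumptions.

```python
def matched_filter_delay(received, template):
    """Slide the template over the received signal and return the lag
    (in samples) with the highest correlation, plus the peak value."""
    n, m = len(received), len(template)
    if m == 0 or m > n:
        raise ValueError("template must be non-empty and no longer than the signal")
    best_lag, best_val = 0, float("-inf")
    for lag in range(n - m + 1):
        val = sum(received[lag + k] * template[k] for k in range(m))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag, best_val


# Toy example: one attenuated arrival of a short code at a 12-sample delay.
template = [1.0, -1.0, 1.0, 1.0, -1.0]
received = [0.0] * 30
for k, s in enumerate(template):
    received[12 + k] += 0.8 * s

delay, peak = matched_filter_delay(received, template)
```

When two arrivals are separated by less than the width of the correlation peak, their peaks merge in this kind of output, which is the limitation of the matched filter that the abstract refers to.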