REFERENCE LINKING PLATFORM OF KOREA S&T JOURNALS
The Journal of the Acoustical Society of Korea
Journal Basic Information
The Acoustical Society of Korea
Adaptive System Identification Using an Efficient Recursive Total Least Squares Algorithm
최낙진 ; 임준석 ; 송준일 ; 성굉모 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 93~93
We present a recursive total least squares (RTLS) algorithm for adaptive system identification. Recursive least squares (RLS) has so far been applied successfully to adaptive system identification problems. However, when the input data contain additive noise, the RLS results can be biased. Such bias can be avoided by using the RTLS algorithm. The RTLS algorithm described in this paper performs better than the RLS algorithm over a wide range of SNRs and has approximately the same computational complexity, O(N²).
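For orientation, the RLS baseline that the abstract compares against can be sketched as follows. This is a minimal, textbook exponentially weighted RLS in pure Python, not the paper's RTLS algorithm; the function names, the forgetting factor `lam`, and the initialization constant `delta` are illustrative assumptions. Each update costs O(N²) operations for N taps, the complexity order quoted above.

```python
import random


def white_noise(n, seed=0):
    """Hypothetical helper: n samples of unit-variance Gaussian noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def simulate_fir(h, x):
    """Output of an FIR system with taps h driven by x (zero initial state)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def rls_identify(x, d, n_taps, lam=0.99, delta=100.0):
    """Estimate FIR taps from input x and desired output d with standard RLS."""
    w = [0.0] * n_taps
    # P approximates the inverse input correlation matrix; start at delta * I.
    P = [[delta if i == j else 0.0 for j in range(n_taps)] for i in range(n_taps)]
    u = [0.0] * n_taps  # most recent input samples, newest first
    for xn, dn in zip(x, d):
        u = [xn] + u[:-1]
        Pu = [sum(P[i][j] * u[j] for j in range(n_taps)) for i in range(n_taps)]
        denom = lam + sum(u[i] * Pu[i] for i in range(n_taps))
        k = [v / denom for v in Pu]                         # gain vector
        e = dn - sum(w[i] * u[i] for i in range(n_taps))    # a priori error
        w = [w[i] + k[i] * e for i in range(n_taps)]
        # P <- (P - k u^T P) / lam, using the symmetry of P (u^T P = (P u)^T).
        P = [[(P[i][j] - k[i] * Pu[j]) / lam for j in range(n_taps)]
             for i in range(n_taps)]
    return w
```

With noise-free input the estimate converges to the true taps. Adding noise to `x` before calling `rls_identify` is one way to observe the bias that motivates a total least squares formulation.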
Multi Mode Harmonic Transform Coding for Speech and Music
김종학 ; 신재현 ; 이인성 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 101~101
A multi-mode harmonic transform coding (MMHTC) scheme for speech and music signals is proposed. Its structure is a linear prediction model driven by harmonic and transform-based excitation. The proposed coder also uses harmonic prediction and an improved quantizer for the excitation signal. To quantize the excitation of music signals efficiently, the modulated lapped transform (MLT) is introduced. In other words, the coder combines time-domain (linear prediction) and frequency-domain techniques to achieve the best perceptual quality. At a bit rate of 4 kbps, the proposed coder showed better speech quality than the 8 kbps QCELP coder.
Filtering of a Dissonant Frequency for Speech Enhancement
강상기 ; 백성준 ; 이기용 ; 성굉모 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 110~110
There have been numerous studies on the enhancement of noisy speech signals. In this paper, we propose a completely new speech enhancement scheme: filtering of a dissonant frequency (especially F# in each octave of the tempered scale) based on the fundamental frequency, developed in the frequency domain. To evaluate the performance of the proposed scheme, subjective tests (MOS tests) were conducted. The results indicate that the proposed method provides a significant audible improvement, especially for speech contaminated by colored noise and for speech spoken in a husky voice. Therefore, when the filter is employed as a pre-filter for speech enhancement, the output speech quality and intelligibility are greatly enhanced.
Robust Speech Detection Based on Useful Bands for Continuous Digit Speech over Telephone Networks
지미경 ; 서영주 ; 김회린 ; 김상훈 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 113~113
One of the most important problems in speech recognition is detecting the presence of speech in adverse environments. In other words, accurate detection of speech boundaries is critical to speech recognition performance. The speech detection problem becomes more severe when recognition systems are used over telephone networks, especially wireless networks and noisy environments. This paper therefore describes various speech detection algorithms for a continuous digit recognition system used over wire and wireless telephone networks, and proposes an algorithm that improves the robustness of speech detection by selecting useful bands under noisy telephone network conditions. We compare several speech detection algorithms with the proposed one and present experimental results at various SNRs. The results show that the new algorithm outperforms the other speech detection methods.
A Simple Speech/Non-speech Classifier Using Adaptive Boosting
권오욱 ; 이태원 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 124~124
We propose a new method for speech/non-speech classification based on the adaptive boosting (AdaBoost) algorithm, in order to detect speech for robust speech recognition. The method combines simple base classifiers through the AdaBoost algorithm and uses a set of optimized speech features combined with spectral subtraction. The key benefits of this method are simple implementation, low computational complexity, and avoidance of the over-fitting problem. We checked the validity of the method by comparing its performance with the speech/non-speech classifier used in a standard voice activity detector. For speech recognition purposes, additional performance improvements were achieved by adopting new features, including speech band energies and MFCC-based spectral distortion. For the same false alarm rate, the method reduced miss errors by 20-50%.
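The idea of combining simple base classifiers through AdaBoost can be illustrated with threshold "stumps" on a single feature. The sketch below is a generic discrete AdaBoost in pure Python, not the authors' classifier; the feature values, the label convention (+1 for speech, -1 for non-speech), and the function names are assumptions made for illustration.

```python
import math


def train_adaboost(X, y, rounds=3):
    """Train AdaBoost with one-feature threshold stumps.

    X is a list of feature vectors, y a list of labels in {+1, -1}.
    Returns a list of (alpha, feature_index, threshold, sign) tuples.
    """
    n = len(X)
    w = [1.0 / n] * n  # uniform sample weights to start
    model = []
    for _ in range(rounds):
        best = None
        for f in range(len(X[0])):
            for thr in sorted(set(row[f] for row in X)):
                for sign in (1, -1):
                    pred = [sign if row[f] >= thr else -sign for row in X]
                    err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                    if best is None or err < best[0]:
                        best = (err, f, thr, sign, pred)
        err, f, thr, sign, pred = best
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, f, thr, sign))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * t * p) for wi, t, p in zip(w, y, pred)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (s if row[f] >= thr else -s) for a, f, thr, s in model)
    return 1 if score >= 0 else -1
```

No single stump can separate a pattern such as labels `[1, 1, -1, -1, 1, 1]` along one axis, but a weighted vote of three stumps can, which is the sense in which boosting builds a stronger classifier from weak ones.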
Time Series Simulation of Explosive Charges In Shallow Water Using Ray Approach
한주영 ; 이성욱 ; 나정열 ;
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 133~133
A ray-based time series simulation is presented for the received waveform of broadband acoustic signals interacting with the ocean boundaries. The environment is assumed to be horizontally stratified, and the seafloor is described as a homogeneous fluid half-space. The ray approach includes the effects of reflection from the air-water and water-sediment interfaces and the phase shifts due to boundary interaction. To generate the time series, we assume that the acoustic energy propagates from source to receiver along eigenrays, and we represent the action of the bottom on the incident wave by a linear filter characterized in the frequency domain by its transfer function. As an example application, the time series for an explosive source in a shallow water environment is calculated and analyzed in terms of the acoustical processes. Good agreement with measured time series is demonstrated.
Measurements of Scattering Coefficients Using the ISO Method in a Model Reverberation Chamber
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 162~168
The degree of diffusion, characterized by the "scattering coefficient" of surface materials, has been known to be one of the most important factors in determining the acoustical qualities of concert halls. Based on the suggested ISO method, which measures the random-incidence scattering coefficient of surfaces in a diffuse field, the scattering coefficients of different sizes and densities of wooden hemispheres and cubes were measured in model-scale reverberation rooms. As a result, wooden hemispheres with a structural depth of more than 15 cm have the highest average (500 Hz to 4 kHz) scattering coefficient. It was also found that the scattering coefficient becomes higher when the diffuser density reaches about 50% for hemispheres and 30% for cubes.
Subspace Speech Enhancement Using Subband Whitening Filter
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 169~174
A novel subspace speech enhancement method using a subband whitening filter is proposed. Previous subspace speech enhancement methods either assume additive white noise or use a whitening filter as pre-processing for colored noise. The proposed method tries to minimize signal distortion while reducing residual noise by processing the signal with a subband whitening filter. By incorporating the notion of a subband whitening filter, spectral resolution in the Karhunen-Loeve (KL) domain is improved with negligible additional computational load. The proposed method outperforms both the subspace method suggested by Ephraim and the spectral subtraction suggested by Boll in terms of segmental signal-to-noise ratio (SNRseg) and perceptual evaluation of speech quality (PESQ).
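As a reminder of what the SNRseg figure measures, the following sketch computes a segmental SNR as the mean of per-frame SNRs in dB. The frame length and the clamping range of -10 to 35 dB are common conventions assumed here, not values taken from the paper, and the function name is hypothetical.

```python
import math


def segmental_snr(clean, enhanced, frame_len=160, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB between a clean and an enhanced signal.

    Frames are non-overlapping; each frame SNR is clamped to [lo, hi] so
    that silent or perfectly reconstructed frames do not dominate the mean.
    """
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(v * v for v in s)
        error_energy = sum((a - b) ** 2 for a, b in zip(s, e))
        if signal_energy == 0.0:
            continue  # skip frames with no signal energy
        if error_energy == 0.0:
            snr = hi
        else:
            snr = 10.0 * math.log10(signal_energy / error_energy)
        values.append(min(max(snr, lo), hi))
    return sum(values) / len(values) if values else float("nan")
```

For example, an output that is a copy of the clean signal scaled by 0.9 has an error of one tenth of the signal amplitude in every frame, so each frame (and the average) sits at 20 dB.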
A Study on Korean 4-connected Digit Recognition Using Demi-syllable Context-dependent Models
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 175~181
Because each Korean digit word is a single syllable and is deeply coarticulated in connected digits, researchers have proposed several recognition models based on demisyllables. However, these models have not yet shown excellent recognition results. This paper proposes a recognition model based on extended, context-dependent demisyllables, such as a tri-demisyllable analogous to a triphone, for Korean 4-connected digit recognition. For the experiments, we use the HTK 3.0 toolkit to build continuous HMMs of this model from Korean connected digit training data in the SiTEC database and to recognize unknown digits. The results show a recognition rate of 92% and indicate that this model can improve the recognition performance of Korean connected digits.
A Study on Speaker Recognition Algorithm Through Wire/Wireless Telephone
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 182~187
In this thesis, we propose an algorithm that improves speaker verification performance by mapping feature parameters with an RBF neural network. Feature vectors from the same speaker fall in very different regions for wire and wireless channels. To build wire and wireless speaker models, the speaker verification system, which is based on a speech recognition system, must distinguish between the wire and wireless channels. The feature vector of the untrained channel model is then mapped to the feature vector (LPC cepstrum) of the trained channel model by the RBF neural network. In simulation, the proposed algorithm improves performance by 0.6% to 10.5% compared with a conventional method such as cepstral mean subtraction.
Improvements in Speaker Adaptation Using Weighted Training
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 188~193
Model-based adaptation methods reported in the literature so far incorporate the adaptation data indiscriminately when reducing the mismatch between the training and testing environments, regardless of the distribution of the adaptation data in the testing environment. When the amount of data is small and the parameter tying is extensive, adaptation based on outlier data can be detrimental to recognizer performance. The distribution of the adaptation data plays a critical role in adaptation performance. To maximize the improvement in recognition rate in the testing environment using only a small amount of adaptation data, supervised weighted training is applied to the structural maximum a posteriori (SMAP) algorithm. We evaluate the proposed weighted SMAP (WSMAP) and SMAP on the TIDIGITS corpus. The proposed WSMAP performs better for a small amount of data. The general idea of incorporating the distribution of the adaptation data is applicable to other adaptation algorithms.
A Study on Utterance Verification Using Accumulation of Negative Log-likelihood Ratio
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 194~201
In speech recognition, confidence measurement decides whether a recognition result can be accepted. Confidence is measured by integrating frame-level scores to the phone and word levels. In word recognition, confidence measurement verifies recognition results and out-of-vocabulary (OOV) input. This post-processing can therefore improve recognizer performance by not accepting recognition errors. In this paper, we measure confidence with a modified version of the log-likelihood ratio (LLR), which was the previous confidence measure. When integrating confidence from the frame level to the phone level, the modified measure accumulates only the frames whose log-likelihood ratio is negative. Compared with the previous method on the results of a word recognizer, the false acceptance ratio (FAR) decreases by about 3.49% for OOV input and by 15.25% for recognition errors when the correct acceptance ratio (CAR) is about 90%.
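The accumulation rule described above can be sketched in a few lines. The abstract states only that negative frame-level LLRs are accumulated at the phone level; the normalization by frame count, the averaging of phone scores into a word score, and the threshold value below are assumptions added for illustration, and the function names are hypothetical.

```python
def phone_confidence(frame_llrs):
    """Phone-level confidence from frame-level log-likelihood ratios.

    Only negative LLRs contribute; positive frames count as zero.
    Dividing by the frame count is an assumed normalization.
    """
    if not frame_llrs:
        raise ValueError("a phone needs at least one frame")
    return sum(min(v, 0.0) for v in frame_llrs) / len(frame_llrs)


def accept_word(phone_frame_llrs, threshold=-0.5):
    """Accept a word if the mean phone confidence reaches the threshold.

    phone_frame_llrs is a list of per-phone lists of frame LLRs.
    The threshold is a hypothetical operating point.
    """
    scores = [phone_confidence(frames) for frames in phone_frame_llrs]
    return sum(scores) / len(scores) >= threshold
```

Because positive frames cannot offset negative ones, a phone with a few strongly mismatched frames keeps a low score even when the remaining frames match well.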
Focal Length Control of Line-focus Ultrasonic Transducer Using Bimorph-type Bending Actuator
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 202~207
For medical ultrasonic transducers, the phase-weighting method has been used to control the focal length with an electric circuit at each vibrating element. However, the electric circuit becomes complex as the number of vibrating elements increases. In this paper, we fabricated a line-focus transducer with a bimorph-type piezoelectric actuator. A polyvinylidene fluoride (PVDF) piezoelectric polymer film is used for transmitting and receiving the ultrasonic signal. With this transducer, the focal length can be controlled mechanically by changing the actuator voltage. It is confirmed that the focal length of the transducer can be controlled in range of 1095 to radius of curvature.
A Study for Reducing the Acoustic Cross Talk Level in an Array Type Piezoelectric Ultrasonic Transducer Using Acoustic Wells
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 208~216
In one-dimensional linear array type piezoelectric ultrasonic transducers, which are widely used for medical diagnosis, the acoustic cross talk caused by structural acoustic coupling between adjacent piezoelectric elements significantly reduces performance. In this study, we propose an acoustic wall to reduce the acoustic cross talk caused by wave propagation through the surface of the transducer, which cannot be prevented by a conventional kerf. Using a finite element method, we analyze the acoustic cross talk level with respect to the shape, size, and material of the acoustic wall mounted on a convex one-dimensional piezoelectric ultrasonic transducer. We expect the simulated results to provide valuable information for an optimized design of array type ultrasonic transducers that minimizes the acoustic cross talk level.
A Novel Cooling Method by Acoustic Streaming Induced by Ultrasonic Resonator
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 217~223
A novel cooling method based on acoustic streaming generated by ultrasonic vibration at 30 kHz is presented. The ultrasonic vibration is produced by piezoelectric devices, and a maximum vibration amplitude of 50 μm is achieved by including a horn (a mechanical vibration amplifier) in the system and making the complete system resonate. To investigate how acoustic streaming enhances heat transfer, the temperature variations of the heat source and of the air in its vicinity are measured in real time. Acoustic streaming is observed to be induced instantly by the ultrasonic vibration, and the resulting bulk air flow causes a significant temperature drop. In addition, the cooling effect on the heat source is maximized when the gap between the ultrasonic vibrator and the heat source coincides with a multiple of the half-wavelength of the ultrasonic wave. This results from the resonance of the sound wave. A theoretical analysis of the dependence on the gap is also carried out and verified by experiment. The proposed cooling method is noise-free because the vibration is ultrasonic and maintenance-free because it has no moving parts. Moreover, this cooling method can be applied to nano- and micro-electromechanical systems, where conventional fan-based cooling cannot be employed.
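To give a sense of scale for the half-wavelength condition, the gaps at which resonance would be expected can be computed from the driving frequency. The sketch assumes a sound speed of 343 m/s (air at roughly 20 °C), which the abstract does not state; the function name is hypothetical.

```python
def resonant_gaps(freq_hz, sound_speed=343.0, count=3):
    """First `count` multiples of the half-wavelength, in metres.

    sound_speed defaults to an assumed 343 m/s for air near 20 degrees C.
    """
    half_wavelength = sound_speed / (2.0 * freq_hz)
    return [n * half_wavelength for n in range(1, count + 1)]


# At 30 kHz the half-wavelength is about 5.7 mm under the stated assumption.
gaps_mm = [1000.0 * g for g in resonant_gaps(30000.0)]
```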
Comparison of Active Sonar Target Positioning Performance and Optimal Sensor Arrangement
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 224~232
This paper deals with an efficient sensor deployment method and with target positioning performance with respect to measurement error. Active sonar can be categorized into monostatic, bistatic, and multistatic sonar, each with different characteristics. Assuming that each sensor can receive range and angular information, we compare the performance of monostatic, bistatic, and multistatic systems. We also propose a weighted least squares (WLS) method, which adds weights to the ordinary least squares (LS) method. Using the proposed method, we investigate target positioning performance as a function of the number of sensors and the distance from transmitter to receiver, and propose an efficient arrangement rule for multistatic sonar configurations. According to the experimental results, the RMSE of multistatic sonar is superior to that of monostatic and bistatic sonar by 35.98% and 37.45%, respectively, and WLS is superior to LS by approximately 7.4% on average. Furthermore, the improvement in target positioning performance increases as the difference between the sensors' variances grows.
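The difference between LS and WLS can be seen in a deliberately simplified setting: fusing Cartesian position estimates from several sensors. The paper's estimator works on range and angular measurements, so the sketch below is only an analogy under the assumption that each sensor reports an (x, y) estimate with a known error variance; inverse-variance weighting is the standard WLS choice in that case. The function name is hypothetical.

```python
def fuse_positions(estimates, variances=None):
    """Combine per-sensor (x, y) estimates into one position.

    With variances=None all sensors are weighted equally (plain LS, the mean).
    Otherwise each sensor is weighted by 1 / variance (WLS).
    """
    if variances is None:
        weights = [1.0] * len(estimates)
    else:
        weights = [1.0 / v for v in variances]
    total = sum(weights)
    x = sum(w * e[0] for w, e in zip(weights, estimates)) / total
    y = sum(w * e[1] for w, e in zip(weights, estimates)) / total
    return (x, y)
```

When all variances are equal the two estimators coincide, and the gap between them widens as the variances diverge, which is consistent with the trend reported in the last sentence of the abstract.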
Analysis of Dependence on Wind Speed and Ship Traffic of Underwater Ambient Noise at Shallow Sea Surrounding the Korean Peninsula
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 233~241
We statistically analyze underwater ambient noise measured at 13 sites less than 200 m deep in the shallow water surrounding the Korean Peninsula over 9 years, from 1990 to 1998, under various environmental conditions. Frequency spectra were obtained at 1/3-octave band center frequencies from 25 Hz to 20 kHz. The analyzed shallow water noise spectra differed somewhat from the deep water spectra known as the Wenz spectra. The ambient noise level was higher than the deep water level under the same conditions because of various ship activity, coastal noise, surface waves, and so on. As a result, we produced coastal ambient noise spectrum curves for the shores of the Korean Peninsula based on these results.
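The set of 1/3-octave bands between 25 Hz and 20 kHz can be enumerated with the base-10 convention, in which exact center frequencies are 1000 × 10^(n/10) Hz for integer n. That the measurements used this convention is an assumption; the familiar labels (25, 31.5, 40, ...) are rounded nominal values of these exact frequencies. The function name and the 3% tolerance on the band edges are illustrative.

```python
def third_octave_centers(f_min=25.0, f_max=20000.0):
    """Exact base-10 1/3-octave center frequencies within [f_min, f_max].

    A 3 % tolerance lets the nominal end points (25 Hz, 20 kHz) match their
    exact counterparts (about 25.12 Hz and 19952.6 Hz).
    """
    centers = []
    for n in range(-30, 31):
        f = 1000.0 * 10.0 ** (n / 10.0)
        if f_min * 0.97 <= f <= f_max * 1.03:
            centers.append(f)
    return centers


bands = third_octave_centers()
```

Under this convention the stated range covers 30 bands.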
A Prioritized call Admission for supporting voice Activated/Controlled Services in Cellular CDMA Systems
The Journal of the Acoustical Society of Korea, volume 22, issue 3, 2003, Pages 242~249
When special voice control application services (VCS), such as voice-controlled web browsing or voice-controlled stock transactions, are introduced in cellular systems, a channel quality better than that for ordinary voice communications service (OVS) is necessary to keep a suitable grade of VCS. To avoid ai. congestion, calls are normally admitted if a channel-processing resource not occupied by other calls exists in the base and the interference level at the receiver is not higher than a predefined threshold. The threshold is usually a 10 dB noise rise over the background noise level for voice communications service. When the base admits VCS attempts in exactly the same manner as it handles OVS calls, the same fraction of them will fail to obtain a channel and will be blocked. If the same 10 dB noise-rise threshold is used, however, the admitted VCS calls might suffer from bad channel quality and finally be dropped. From the user's point of view, the forced termination of ongoing calls is significantly more undesirable than the blocking of new call attempts. When a lower noise-rise threshold is used for VCS, on the other hand, the blocking probability of VCS becomes higher than that of OVS. In this paper, a call admission policy that gives priority to VCS is considered in order to reduce the blocking probability and keep an adequate channel quality.
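The baseline admission rule described in the abstract (a free channel-processing resource and a noise rise no higher than the threshold) can be written down directly. This sketch covers only that baseline with per-service thresholds, not the prioritized policy the paper proposes; the 10 dB OVS value comes from the abstract, while the 6 dB VCS value and the function name are hypothetical.

```python
# Noise-rise thresholds in dB. The VCS value is a placeholder chosen only to
# be lower than the 10 dB used for ordinary voice service.
DEFAULT_THRESHOLDS_DB = {"OVS": 10.0, "VCS": 6.0}


def admit_call(free_channels, noise_rise_db, service, thresholds=None):
    """Return True if a new call of the given service type may be admitted.

    A call needs an unoccupied channel-processing resource, and the current
    noise rise must not exceed the threshold for its service type.
    """
    limits = DEFAULT_THRESHOLDS_DB if thresholds is None else thresholds
    return free_channels > 0 and noise_rise_db <= limits[service]
```

At a noise rise of 8 dB, for instance, this rule admits an OVS call but blocks a VCS call, which illustrates why a lower VCS threshold alone raises the VCS blocking probability.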