REFERENCE LINKING PLATFORM OF KOREA S&T JOURNALS
The Journal of the Acoustical Society of Korea
Journal Basic Information
The Acoustical Society of Korea
Volume & Issues
Volume 18, Issue 8 - Nov 1999
Volume 18, Issue 7 - Oct 1999
Volume 18, Issue 6 - Aug 1999
Volume 18, Issue 5 - Jul 1999
Volume 18, Issue 4 - May 1999
Volume 18, Issue 3 - Apr 1999
Volume 18, Issue 2 - Feb 1999
Volume 18, Issue 1 - Jan 1999
Volume 18, Issue 4E - 00 1999
Volume 18, Issue 3E - 00 1999
Volume 18, Issue 2E - 00 1999
Volume 18, Issue 1E - 00 1999
A Study on the Rejection Capability Based on Anti-phone Modeling
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 3~9
This paper studies rejection capability based on anti-phone modeling for a vocabulary-independent speech recognition system. The rejection system detects and rejects out-of-vocabulary words, that is, words not included in the candidate vocabulary defined when the speech recognizer is built. Rejection systems fall into two categories by implementation method: keyword spotting and utterance verification. The keyword spotting method uses an extra filler model as a candidate word alongside the keyword models. The utterance verification method constructs anti-models for all phonemes and then uses the anti-model of each phoneme to calculate a confidence score. We implemented an utterance verification algorithm that can be used in a vocabulary-independent speech recognizer. We also compared three kinds of means for calculating the confidence score and found that the geometric mean gave the best result. The confidence score is usually normalized with a sigmoid function; we compared the effect of the weight constant of the sigmoid function and determined its optimal value. We also compared the effect of the cohort set size; the results showed that a larger set gave better results. Finally, we determined the optimal confidence score threshold. With this threshold, the overall recognition rate, including rejection errors, was about 76%. These results will be applied to a speech-recognition-based stock information system that Korea Telecom currently provides as an experimental service.
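The confidence scoring described in this abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so everything here is an assumption: each phoneme is scored by a log-likelihood ratio between its phone model and its anti-phone model, the ratio is squashed by a sigmoid with weight constant `alpha`, and the per-phoneme values are combined by an arithmetic, geometric or harmonic mean (the abstract names only the geometric mean). All function names are hypothetical.

```python
import math

def sigmoid(x, alpha=1.0):
    """Map a log-likelihood ratio to (0, 1); alpha is the weight constant."""
    z = max(-30.0, min(30.0, alpha * x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def phone_confidences(llrs, alpha=1.0):
    """llrs: per-phoneme log p(x | phone) - log p(x | anti-phone) (assumed form)."""
    return [sigmoid(llr, alpha) for llr in llrs]

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    # Computed in the log domain; values are strictly positive after the sigmoid.
    return math.exp(sum(math.log(v) for v in values) / len(values))

def harmonic_mean(values):
    return len(values) / sum(1.0 / v for v in values)

def accept(llrs, threshold, alpha=1.0, mean=geometric_mean):
    """Accept the hypothesis if the word-level confidence reaches the threshold."""
    return mean(phone_confidences(llrs, alpha)) >= threshold
```

The geometric mean is pulled down more strongly by a single poorly matching phoneme than the arithmetic mean is, which is one plausible reason it would separate out-of-vocabulary words well; the abstract itself does not state the reason.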
A Study on Realization of Continuous Speech Recognition System of Speaker Adaptation
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 10~16
In this paper, we study a speaker-adaptive continuous speech recognition system using MAPE (Maximum A Posteriori Probability Estimation), which can adapt to a speaker with only a small amount of adaptation speech data. Speaker adaptation is performed by MAPE after concatenation training, which builds sentence-unit HMMs by linking syllable-unit HMMs, and Viterbi segmentation, which automatically divides the adaptation speech data into syllable-unit segments without hand labelling. For car control speech, the recognition rate of the adapted HMM was 77.18%, approximately 6% higher than that of the unadapted HMM (in the case of O(n)DP).
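As a rough illustration of why MAP estimation works with little adaptation data, the sketch below shows the textbook MAP update of a single Gaussian mean. It is not taken from the paper: the prior weight `tau`, the function name and the hard assignment of frames to one state (as Viterbi segmentation would provide) are all assumptions.

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP update of one Gaussian mean vector.

    prior_mean: speaker-independent mean (list of floats).
    frames:     adaptation feature vectors assigned to this state.
    tau:        prior weight (hypothetical value); larger tau trusts the prior more.
    """
    n = len(frames)
    dim = len(prior_mean)
    sums = [sum(frame[d] for frame in frames) for d in range(dim)]
    # Interpolates between the prior mean (n = 0) and the sample mean (n large).
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]
```

With no adaptation frames the function returns the prior mean unchanged, and as more frames arrive the estimate moves toward the speaker's own sample mean.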
Speech Coarticulation Database of Korean and English
;Stephen A. Dyer;Dwight D. Day;
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 17~26
We present the first speech coarticulation database of Korean, English and Konglish, named "SORIDA", which is designed to cover the maximum number of representations of coarticulation in these languages. SORIDA features a compact database which is designed to contain a maximum number of triphones in a minimum number of prompts. SORIDA contains all consonantal triphones and vowel allophones in 682 Korean prompts of word length and in 717 English prompt words, spoken five times by speakers of balanced genders, dialects and ages. Korean prompts are synthesized lexicons which maximize their coarticulation variation disregarding any stress phenomena, while English prompts are natural words that fully reflect their stress effects with respect to the coarticulation variation. The prompts are designed differently because English phonology has stress while Korean does not. An intermediate language, Konglish, has also been modeled by two Korean speakers reading the 717 English prompt words. Recording was done in a controlled laboratory environment with an AKG Model C-100 microphone and a Fostex D-5 digital-audio-tape (DAT) recorder. The total recording time was four hours. The SORIDA CD-ROM is available as one disk with a 22.05 kHz sampling rate and a 16-bit sample size. SORIDA digital audio tapes are available as four 124-minute tapes with a 48 kHz sampling rate. SORIDA's list of phonetically rich words is also available in English and Korean.
Directivity Characteristics of Non-Linear Array for Wide-Band One-Shot Beamforming
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 27~34
This paper proposes an algorithm for designing a non-linear array that efficiently forms a one-shot beam with relatively few sensors for acoustic measurement. In this algorithm, following spatial sampling theory, the part for the high-frequency (HF) band is an equispaced sensor array, and the sensor spacings below the HF band are determined as a function of the number of HF sensors. In simulations, the mean and variance of the directivity index (DI) of the non-linear array, which has fewer sensors, are similar to those of a linear array, and the DI variation with beam steering angle is very small. The beam width at the -2dB point is 6.8°. These results confirm that the proposed design algorithm for a non-linear array with fewer sensors can be used efficiently in acoustic measurement.
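The spatial sampling constraint mentioned in this abstract can be sketched as follows. This is not the paper's design algorithm: it only shows the standard half-wavelength spacing limit for the equispaced HF part and a delay-and-sum response for arbitrary sensor positions on a line. The sound speed of 343 m/s (air) and all function names are assumptions.

```python
import cmath
import math

def half_wavelength_spacing(f_max_hz, c=343.0):
    """Largest sensor spacing that avoids spatial aliasing up to f_max_hz."""
    return c / (2.0 * f_max_hz)

def array_response(positions_m, freq_hz, look_deg, steer_deg=0.0, c=343.0):
    """Normalized delay-and-sum response of a line array.

    positions_m: sensor coordinates along the array axis (need not be equispaced).
    look_deg:    arrival angle of the plane wave, measured from broadside.
    steer_deg:   angle the beam is steered to.
    """
    k = 2.0 * math.pi * freq_hz / c  # wavenumber
    u = math.sin(math.radians(look_deg)) - math.sin(math.radians(steer_deg))
    total = sum(cmath.exp(1j * k * x * u) for x in positions_m)
    return abs(total) / len(positions_m)
```

Because `array_response` accepts arbitrary positions, the same function can be used to compare an equispaced layout with a non-uniform one that uses fewer sensors.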
A Study on VQ/HMM using Nonlinear Clustering and Smoothing Method
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 35~42
In this paper, a modified clustering algorithm is proposed to improve the discrimination of the discrete HMM (Hidden Markov Model); it increases the recognition rate by 2.16% in comparison with the original HMM using the K-means or LBG algorithm. In addition, to prevent the recognition rate from dropping because of insufficient training data in HMM training, a modified probabilistic smoothing method is proposed, which increases the recognition rate by 3.07% for the speaker-independent case. In an experiment applying both proposed algorithms, the average recognition rate increased by 4.66% for the speaker-independent case in comparison with that of the original VQ/HMM.
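For readers unfamiliar with the baseline VQ/HMM pipeline that this abstract improves on, a minimal sketch follows. It does not reproduce the paper's modified clustering or its modified smoothing method, neither of which is specified in the abstract. It shows only plain nearest-codeword quantization and simple floor smoothing of a discrete emission distribution; the function names and the floor value are assumptions.

```python
def quantize(vector, codebook):
    """Return the index of the nearest codeword (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: dist2(vector, codebook[k]))

def floor_smooth(probs, floor=1e-4):
    """Raise probabilities below the floor and renormalize.

    This keeps symbols unseen in training from receiving zero probability,
    which would otherwise zero out the whole HMM path likelihood.
    """
    floored = [max(p, floor) for p in probs]
    total = sum(floored)
    return [p / total for p in floored]
```

With sparse training data many codebook symbols are never observed in a given state, which is the situation the smoothing step is meant to handle.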
A Design of BICS Circuit for IDDQ Testing of Memories
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 43~48
IDDQ testing is a current testing methodology that increases circuit reliability by finding defects in CMOS circuits that cannot be detected by functional testing. In this paper, we design a Built-In Current Sensor (BICS) circuit that can be embedded in the chip under test and performs IDDQ testing. Furthermore, it is designed for IDDQ testing of memories and implemented with a small number of transistors so that testing can be carried out at high speed.
DOA Estimation of New Appearing Source in Wideband Multisource Beamforming with Array Sensor Position Calibration Algorithm
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 49~54
In this paper, we propose a new method to estimate the initial DOA of a newly appearing source in wideband multisource beamforming and tracking with an array sensor position calibration algorithm. By using a beampattern formula for initial DOA detection, the proposed method keeps the estimation error within the possible tracking range and can be applied to several beamformers with different mainlobe widths by adjusting the DOA resolution. The simulation results show the source detection and tracking performance.
The Analysis of Amplitude and Phase Image for Acoustic Microscope Using Quadrature Technique
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 55~61
In this study, we constructed an acoustic microscope using the quadrature technique and analyzed the relative variation of image intensity and the image quality by reconstructing amplitude and phase images of surface defects with tiny height variation. For the experiment, we constructed a scanning acoustic microscope using a focused transducer with a 3㎒ center frequency and a quadrature detector. We fabricated aluminum samples with round defects of different depths and reconstructed the amplitude and phase images of the samples. One sample has round defects 2㎜ in diameter and 100㎛ deep, and the other has round defects 4㎜ in diameter and 5㎜ deep. A line scan of the sample with 100㎛ round defects showed that the variation rate of the amplitude image intensity is 7% and that of the phase image intensity is 89%, so the phase image has better contrast than the amplitude image for this sample. In contrast, the amplitude image has better contrast than the phase image for the sample with 5㎜ deep defects. Accordingly, amplitude and phase images differ greatly depending on the depth of the defects, with the boundary at 1 wavelength. Consequently, the acoustic microscope using a quadrature detector can be more effective than one using an envelope detector for detecting defects with height variation of less than 1 wavelength. In addition, the phase image and the amplitude image can be used in a complementary manner to detect defects with tiny height variation.
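A quadrature detector yields an in-phase and a quadrature component per pixel, from which the amplitude and phase images are formed. The sketch below shows that conversion, plus the round-trip phase shift produced by a height step under pulse-echo operation at normal incidence. The abstract does not state the coupling medium; the sound speed of 1480 m/s (water) and the function names are assumptions.

```python
import math

def amplitude_phase(i, q):
    """Convert in-phase (i) and quadrature (q) samples to amplitude and phase."""
    return math.hypot(i, q), math.atan2(q, i)

def round_trip_phase_shift(height_m, freq_hz, speed_m_s=1480.0):
    """Phase change in radians caused by a surface height step.

    The echo travels the step twice, so the path changes by 2 * height.
    """
    wavelength = speed_m_s / freq_hz
    return 4.0 * math.pi * height_m / wavelength
```

Since the measured phase wraps every 2π, a step of half a wavelength or more becomes ambiguous in the phase image, while an envelope (amplitude-only) detector discards the phase altogether.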
A Study on Speech Recognition using Recurrent Neural Networks
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 62~67
In this paper, we investigate a reliable model of the Predictive Recurrent Neural Network for speech recognition. Predictive Neural Networks are modeled in syllable units. For a given input syllable, the model that gives the minimum prediction error is taken as the recognition result. The Predictive Neural Network was given a recurrent structure so that the dynamic features of the speech pattern are captured by the network. We compared the recognition ability of the recurrent networks proposed by Elman and Jordan. ETRI's SAMDORI was used as the speech DB. To find a reliable neural network model, the recognition rates of the two networks were compared under two conditions: (1) changing the prediction order and the number of hidden units; and (2) accumulating previous values with a self-loop coefficient in the context. The results show that the optimum prediction order, number of hidden units, and self-loop coefficient differ according to the structure of the neural network used. In general, however, Jordan's recurrent network shows a relatively higher recognition rate than Elman's. The effect of the self-loop coefficient on the recognition rate varied according to the structure of the neural network and the coefficient value.
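The decision rule in this abstract (one predictive model per syllable, with the model of minimum prediction error chosen as the result) can be sketched as below. The predictors here are plain Python callables standing in for trained recurrent networks, and `order` plays the role of the prediction order. The names and the squared-error measure are assumptions.

```python
def prediction_error(predict, frames, order=2):
    """Accumulated squared error when predicting each frame from the
    preceding `order` frames. `predict` maps a list of frames to one frame."""
    error = 0.0
    for t in range(order, len(frames)):
        predicted = predict(frames[t - order:t])
        error += sum((p - x) ** 2 for p, x in zip(predicted, frames[t]))
    return error

def recognize(models, frames, order=2):
    """models: dict mapping a syllable label to its predictor.
    Returns the label whose predictor has the smallest accumulated error."""
    return min(models, key=lambda label: prediction_error(models[label], frames, order))
```

For example, with two toy predictors, one repeating the last frame and one extrapolating linearly, a steadily rising feature sequence is assigned to the extrapolating model.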
Performance Comparison and Verification of Lip Parameter Selection Methods in the Bimodal Speech Recognition System
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 68~72
The choice of parameters from the various kinds of lip information and the robustness of lip parameter extraction play important roles in the performance of a bimodal speech recognition system. In this paper, lip parameters are extracted using an automatic extraction algorithm, and inner lip parameters are found to affect the recognition rate more than outer lip parameters. The robustness of the automatic extraction method is evaluated by comparison with a manual extraction algorithm.
Flexural Vibration of a Bar with Periodically Nonuniform Material Properties
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 73~78
The paper describes a theoretical study on the flexural vibration of an elastic flat bar with periodically nonuniform material properties. The approximate solution of the natural frequency and mode shape has been obtained using the perturbation technique for sinusoidal modulation of the flexural rigidity and mass density. The numerical solution obtained by using the finite element method verifies the trend of the approximate solution. It appears that distributed vibrations exist in the low modes, and this approach can be extended to the vibration analysis of the plate in the flat panel speaker.
Implementation of Real-Time Sound Image Control System
The Journal of the Acoustical Society of Korea, volume 18, issue 3, 1999, Pages 79~87
In this paper, we restructure a sound image control algorithm for real-time processing and implement a real-time system using a digital signal processing board based on the TMS320C40. The performance of the real-time sound image control system was evaluated by a listening test. The test results showed that a localized sound image can be perceived through both headphones and speakers, and that the headphone results were better than the speaker results. In the elevation perception test, left-right discrimination was better than front-back discrimination, and the moving sound image was perceived better than the fixed sound image.