The Journal of the Acoustical Society of Korea
The Acoustical Society of Korea
Volume 30, Issue 6 - Aug 2011
Audio Watermarking Using Quantization Index Modulation on Significant Peaks in Frequency Domain
Kang, Jung-Sun ; Cho, Sang-Jin ;
The Journal of the Acoustical Society of Korea, volume 30, issue 6, 2011, Pages 303~307
DOI : 10.7776/ASK.2011.30.6.303
This paper describes an audio watermarking method that applies Quantization Index Modulation (QIM) to significant peaks in the frequency domain. The audio signal is divided into non-overlapping frames of L samples using a rectangular window. The zero-crossing rate of each frame is calculated to decide whether the frame is suitable for watermarking. If it is, the frequency magnitude response is computed with the discrete Fourier transform. For QIM, the quantization step size is set from the maximum value of the frequency magnitude response, and n significant peaks are selected in the frequency domain together with the w samples around them. Finally, the watermark is embedded. The decoder extracts the watermark based on Euclidean distance, which makes this a blind detection scheme. The proposed method is robust against many of the attacks in a watermark benchmark.
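The embedding and blind extraction steps can be illustrated with a minimal sketch of scalar binary QIM. It assumes the common two-lattice form (one lattice per bit value, offset by half a step), which may differ from the paper's exact scheme. The function names are hypothetical, and the step size is passed in explicitly rather than derived from the maximum magnitude.

```python
def qim_embed(value, bit, step):
    """Move value to the nearest point of the lattice assigned to bit (0 or 1)."""
    offset = bit * step / 2.0  # the lattice for bit 1 is shifted by half a step
    return round((value - offset) / step) * step + offset


def qim_extract(value, step):
    """Blind extraction: pick the bit whose lattice point is closest to value."""
    dist0 = abs(value - qim_embed(value, 0, step))
    dist1 = abs(value - qim_embed(value, 1, step))
    return 0 if dist0 <= dist1 else 1


def embed_bits(magnitudes, bits, step):
    """Embed one bit into each selected magnitude sample."""
    return [qim_embed(m, b, step) for m, b in zip(magnitudes, bits)]


def extract_bits(magnitudes, step):
    """Recover one bit from each selected magnitude sample."""
    return [qim_extract(m, step) for m in magnitudes]
```

Because the two lattices are half a step apart, a perturbation smaller than a quarter of the step leaves the extracted bit unchanged, which is where the robustness of QIM comes from. A larger step tolerates stronger attacks at the cost of more audible distortion.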
Direction Estimation of Multiple Sound Sources Using Circular Probability Distributions
Nam, Seung-Hyon ; Kim, Yong-Hoh ;
The Journal of the Acoustical Society of Korea, volume 30, issue 6, 2011, Pages 308~314
DOI : 10.7776/ASK.2011.30.6.308
This paper presents techniques for estimating the directions of multiple sound sources using circular probability distributions, which have a periodic property. Phase differences containing the direction information of the sources can be modeled as mixtures of multiple probability distributions, and the source directions can be estimated by maximizing log-likelihood functions. Although the von Mises distribution is widely used for analyzing this kind of periodic data, we define a new class of circular probability distributions from the Gaussian and Laplacian distributions by adopting a modulo operation to obtain 2π-periodicity. Direction estimation with these circular probability distributions is done by implementing the corresponding EM (Expectation-Maximization) algorithms. Simulation results in various reverberant environments confirm that the Laplacian distribution provides better performance than the von Mises and Gaussian distributions.
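As a rough illustration of a Laplacian made periodic by a modulo operation, the sketch below wraps the angular error into [-π, π) and normalizes the density over one period. This is an assumed construction, not necessarily the paper's definition. It also handles only a single source and replaces the EM algorithm with a plain grid search over the location parameter; the scale `b` and the grid size are arbitrary illustrative values.

```python
import math


def wrap(angle):
    """Map an angle in radians to [-pi, pi) with a modulo operation."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def circular_laplacian_logpdf(theta, mu, b):
    """Log density of a Laplacian restricted to one period around mu."""
    d = abs(wrap(theta - mu))
    # Integral of exp(-|x|/b) over [-pi, pi] is 2b(1 - exp(-pi/b)).
    norm = 2.0 * b * (1.0 - math.exp(-math.pi / b))
    return -d / b - math.log(norm)


def estimate_direction(samples, b=0.3, grid_size=720):
    """Single-source maximum-likelihood location by grid search (not EM)."""
    best_mu, best_ll = None, -math.inf
    for k in range(grid_size):
        mu = -math.pi + 2.0 * math.pi * k / grid_size
        ll = sum(circular_laplacian_logpdf(t, mu, b) for t in samples)
        if ll > best_ll:
            best_mu, best_ll = mu, ll
    return best_mu
```

The wrapping matters for observations near ±π: samples such as 3.1 and -3.1 rad are close on the circle, and the estimate lands near π instead of near 0, which is what a non-periodic model would give.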
Rapid Speaker Adaptation Based on MAPLR with Adaptive Hybrid Priors Estimated from Reference Speakers
Song, Young-Rok ; Kim, Hyung-Soon ;
The Journal of the Acoustical Society of Korea, volume 30, issue 6, 2011, Pages 315~323
DOI : 10.7776/ASK.2011.30.6.315
This paper proposes two methods of estimating the prior distribution to improve the performance of rapid speaker adaptation based on maximum a posteriori linear regression (MAPLR). In general, the prior distribution of the transformation matrix used in MAPLR adaptation is estimated from all of the training speakers used to construct the speaker-independent model, and it is applied identically to all new speakers. In this paper, we propose a method in which the prior distribution is estimated from a group of reference speakers, selected using the adaptation data, so that the acoustic characteristics of the selected reference speakers are similar to those of the new speaker. Additionally, for MAPLR adaptation with a block-diagonal transformation matrix, we propose a method in which the mean matrix and the covariance matrix of the prior distribution are estimated, respectively, from two groups of transformation matrices obtained from the same training speakers. To evaluate the performance of the proposed methods, we examine word accuracy as a function of the number of adaptation words in an isolated word recognition task. Experimental results show that, for very limited adaptation data, a statistically significant performance improvement is obtained in comparison with conventional MAPLR adaptation.
A Statistical Model-Based Voice Activity Detection Employing the Conditional MAP Criterion with Spectral Deviation
Kim, Sang-Kyun ; Chang, Joon-Hyuk ;
The Journal of the Acoustical Society of Korea, volume 30, issue 6, 2011, Pages 324~329
DOI : 10.7776/ASK.2011.30.6.324
In this paper, we propose a novel approach to improve the performance of statistical model-based voice activity detection (VAD), based on the conditional maximum a posteriori (CMAP) criterion with spectral deviation. In our approach, the VAD decision rule is expressed as the geometric mean of likelihood ratios (LRs) compared against a threshold that is adapted according to the speech presence probability conditioned on both the speech activity decision and the spectral deviation in the previous frame. Experimental results show that the proposed approach yields better results than the CMAP-based VAD using the LR test.
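The decision rule can be sketched as follows, assuming the usual complex-Gaussian model of statistical model-based VAD, in which the log LR of one frequency bin is γξ/(1+ξ) − ln(1+ξ), with γ the a posteriori SNR and ξ the a priori SNR. The abstract does not give the paper's exact expressions, so the two threshold values are hypothetical placeholders, and the conditioning on spectral deviation is omitted; only the dependence on the previous frame's decision is shown.

```python
import math


def log_likelihood_ratio(gamma, xi):
    """Log LR of one frequency bin (gamma: a posteriori SNR, xi: a priori SNR)."""
    return gamma * xi / (1.0 + xi) - math.log(1.0 + xi)


def vad_decision(gammas, xis, prev_speech, thresholds=(0.3, 0.1)):
    """Compare the log of the geometric mean of the LRs with a threshold.

    thresholds holds hypothetical log-domain values:
    (threshold after a non-speech frame, threshold after a speech frame).
    """
    # The log of a geometric mean is the arithmetic mean of the logs.
    log_gm = sum(log_likelihood_ratio(g, x) for g, x in zip(gammas, xis)) / len(gammas)
    eta = thresholds[1] if prev_speech else thresholds[0]
    return log_gm > eta
```

Lowering the threshold after a speech frame makes the detector less likely to drop out in the middle of an utterance, which is the intuition behind conditioning the decision on the previous frame.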
Improvement in Supervector Linear Kernel SVM for Speaker Identification Using Feature Enhancement and Training Length Adjustment
So, Byung-Min ; Kim, Kyung-Wha ; Kim, Min-Seok ; Yang, Il-Ho ; Kim, Myung-Jae ; Yu, Ha-Jin ;
The Journal of the Acoustical Society of Korea, volume 30, issue 6, 2011, Pages 330~336
DOI : 10.7776/ASK.2011.30.6.330
In this paper, we propose a new method to improve the performance of the supervector linear kernel SVM (Support Vector Machine) for speaker identification. The method is based on splitting one training datum into several shorter utterances. We evaluate performance on four different databases and use PCA (Principal Component Analysis), GKPCA (Greedy Kernel PCA) and KMDA (Kernel Multimodal Discriminant Analysis) for feature enhancement. The results show that the proposed method improves speaker identification performance with the supervector linear kernel SVM.
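The splitting step might look like the sketch below, which cuts a sequence of feature frames into consecutive fixed-length pieces. The function name, the fixed piece length, and the handling of a short trailing piece are all assumptions; the abstract does not say how the pieces are chosen.

```python
def split_utterance(frames, piece_len, min_len=None):
    """Split a sequence of feature frames into consecutive pieces.

    A trailing piece shorter than piece_len is kept only if it has at
    least min_len frames (min_len defaults to piece_len, i.e. drop it).
    """
    if piece_len <= 0:
        raise ValueError("piece_len must be positive")
    if min_len is None:
        min_len = piece_len
    pieces = [frames[i:i + piece_len] for i in range(0, len(frames), piece_len)]
    return [p for p in pieces if len(p) >= min_len]
```

Each piece would then be turned into its own supervector, so one long training utterance yields several SVM training examples whose length is closer to that of the test utterances.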
Audio Format Comparative Study and Suggestion for Next Generation DTV
Lee, Jae-Hong ;
The Journal of the Acoustical Society of Korea, volume 30, issue 6, 2011, Pages 337~343
DOI : 10.7776/ASK.2011.30.6.337
With the start of trial 3D digital broadcasting, studies on next-generation digital broadcasting technology for the coming UHDTV era are actively progressing. In this paper, I propose surround audio formats for next-generation digital TV broadcasting, along with a comparative study of major surround audio formats in use or under development. The comparison covers the major competing surround formats, such as Dolby TrueHD and DTS-HD MA, along with the 22.2-channel surround format proposed by NHK for the UHDTV system. Based on this comparison and on consideration of our housing situation, I propose a lossy-compressed 3D 7.1-channel surround format, along with lossless 2.0 and 4.0 hi-fi formats, as the next-generation digital TV broadcasting standard. In line with this, I also propose transmitting binaural 2-channel audio data as sub-audio. It will give a holographic sound experience over headphones when properly processed with an individual HRTF (Head-Related Transfer Function). A table of the data rate of each proposed audio format is also presented.
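For orientation, the uncompressed PCM bit rate of a channel layout is simply channels × sampling rate × bits per sample. The sketch below applies that formula to the layouts mentioned in the abstract. The 48 kHz sampling rate and 24-bit depth are assumed values, and the results are raw PCM rates, not the figures from the paper's table, which depend on the codecs it proposes.

```python
# Total channel counts: 7.1 is 7 main + 1 LFE, 22.2 is 22 main + 2 LFE.
LAYOUTS = {"2.0": 2, "4.0": 4, "7.1": 8, "22.2": 24}


def pcm_bit_rate(channels, sample_rate_hz=48_000, bits_per_sample=24):
    """Uncompressed PCM bit rate in bits per second."""
    return channels * sample_rate_hz * bits_per_sample


def rate_table(sample_rate_hz=48_000, bits_per_sample=24):
    """Uncompressed bit rate in Mbit/s for each layout in LAYOUTS."""
    return {
        name: pcm_bit_rate(ch, sample_rate_hz, bits_per_sample) / 1e6
        for name, ch in LAYOUTS.items()
    }
```

Under these assumptions, stereo needs about 2.3 Mbit/s uncompressed while 22.2 channels need about 27.6 Mbit/s, which gives a sense of why the channel count and the choice between lossy and lossless coding dominate the broadcast data budget.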
Analysis of the Resonant Characteristics of a Tonpilz Transducer with a Fixed Tail Mass by the Equivalent Circuit Approach
Kim, Jin-Wook ; Kim, Won-Ho ; Joh, Chee-Young ; Roh, Yong-Rae ;
The Journal of the Acoustical Society of Korea, volume 30, issue 6, 2011, Pages 344~352
DOI : 10.7776/ASK.2011.30.6.344
In this paper, the resonant characteristics of a Tonpilz transducer with a fixed tail mass are studied by means of an equivalent circuit approach. An equivalent circuit has been designed to describe the characteristics of a Tonpilz transducer that has an additional resonance because of its fixed tail mass. The transmitting voltage response (TVR) of the transducer calculated with the designed circuit has been compared with that obtained by FEA (finite element analysis) to confirm the validity of the circuit. The equivalent circuit approach produces results identical to those of the FEA, and the variation of the resonant frequencies and the TVR has been clearly identified in relation to the stiffness of the mounting fixture and the mass of the tail mass. The suggested equivalent circuit can be used to determine the characteristics of the Tonpilz transducer more efficiently than FEA, which requires much calculation time and revision of the models whenever the design variables change.
Vocal Enhancement for Improving the Performance of Vocal Pitch Detection
Lee, Se-Won ; Song, Chai-Jong ; Lee, Seok-Pil ; Park, Ho-Chong ;
The Journal of the Acoustical Society of Korea, volume 30, issue 6, 2011, Pages 353~359
DOI : 10.7776/ASK.2011.30.6.353
This paper proposes a vocal enhancement technique for improving the performance of vocal pitch detection in polyphonic music signals. The proposed technique predicts an accompaniment signal from the input signal and generates an accompaniment replica signal according to the vocal power. It then removes the accompaniment replica signal from the input signal, resulting in a vocal-enhanced signal. The performance of the proposed method was measured by applying the same vocal pitch extraction method to the original and the vocal-enhanced signals; the vocal pitch detection accuracy increased by 7.1 percentage points on average.