REFERENCE LINKING PLATFORM OF KOREA S&T JOURNALS
The Journal of the Acoustical Society of Korea
Journal Basic Information
Publisher : The Acoustical Society of Korea
Volume & Issues
Volume 28, Issue 8 - Nov 2009
Volume 28, Issue 7 - Oct 2009
Volume 28, Issue 6 - Aug 2009
Volume 28, Issue 5 - Jul 2009
Volume 28, Issue 4 - May 2009
Volume 28, Issue 3 - Apr 2009
Volume 28, Issue 2 - Feb 2009
Volume 28, Issue 1 - Jan 2009
Audio Signal Format and Coding Method for Ultra High Definition Television (UHDTV)
Seo, Jeong-Il; Kang, Kyeong-Ok
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 580~588
In this paper, we describe technical trends, standardization activities, and upcoming issues related to UHDTV audio, which requires high-quality realistic sound. We also propose a suitable solution for domestic broadcasting and telecommunication environments.
MPEG-D USAC: Unified Speech and Audio Coding Technology
Lee, Tae-Jin; Kang, Kyeong-Ok; Kim, Whan-Woo
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 589~598
As mobile devices become multi-functional and converge into a single platform, there is a strong need for a codec that can provide consistent quality for both speech and music content. MPEG-D USAC standardization activities started at the 82nd MPEG meeting with a CfP, and WD3 was approved at the 88th MPEG meeting. MPEG-D USAC is a convergence of AMR-WB+ and HE-AAC V2 technologies. Specifically, USAC utilizes three core codecs (AAC, ACELP, and TCX) for the low-frequency region, SBR for the high-frequency region, and the MPEG Surround tool for stereo information. USAC can provide consistent sound quality for both speech and music content and can be applied to various applications such as multimedia download to mobile devices, digital radio, mobile TV, and audiobooks.
MPEG Surround for Multi-Channel Audio Coding-Part 1: Basic Structure
Pang, Hee-Suk
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 599~609
An overview of the recently finalized multi-channel audio coding standard MPEG Surround is provided. In its encoding process, this standard downmixes multi-channel signals to mono or stereo signals and simultaneously extracts spatial parameters. In its decoding process, it reconstructs the multi-channel signals from the downmix signals and spatial parameters. Since the downmix signals are coded in a conventional audio coding format such as AAC or MP3, and the spatial parameters require only a small amount of side information, MPEG Surround delivers high-quality multi-channel audio at low bit rates. Moreover, it is backward-compatible with conventional audio coding techniques, because the downmix signals can be played on portable audio devices that ignore the spatial parameter information. Part 1 of this paper presents an overview of the basic structure of MPEG Surround, while Part 2 describes its various modes and tools, including the binaural mode, which supports virtual 5.1-channel playback via headphones or earphones. Listening test results from various companies and organizations are also presented.
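As an illustration of this downmix-plus-parameters structure, the following toy Python sketch (not the standard's actual algorithm, and for a stereo-to-mono case only) downmixes two channels, extracts a per-frame channel level difference (CLD) as the spatial parameter, and reconstructs the stereo image at the decoder; all parameter values are assumptions for illustration:

```python
import numpy as np

def ms_encode(left, right, frame=256, eps=1e-12):
    """Toy MPEG-Surround-style encoder: mono downmix plus a per-frame
    channel level difference (CLD, in dB) as the spatial parameter."""
    downmix, clds = [], []
    for i in range(0, len(left) - frame + 1, frame):
        l, r = left[i:i+frame], right[i:i+frame]
        downmix.append((l + r) / 2.0)                       # mono downmix
        clds.append(10 * np.log10((np.sum(l**2) + eps) /
                                  (np.sum(r**2) + eps)))    # L/R power ratio
    return np.concatenate(downmix), clds

def ms_decode(downmix, clds, frame=256):
    """Reconstruct a stereo image from the downmix and the CLDs."""
    left, right = [], []
    for i, cld in enumerate(clds):
        d = downmix[i*frame:(i+1)*frame]
        g = 10 ** (cld / 20.0)          # left/right amplitude ratio
        gl = 2 * g / (1 + g)            # gains chosen so that
        gr = 2 / (1 + g)                # (gl + gr)/2 == 1 (preserves downmix)
        left.append(gl * d)
        right.append(gr * d)
    return np.concatenate(left), np.concatenate(right)
```

The gains are constrained so that re-downmixing the decoded channels reproduces the transmitted downmix, mirroring the backward-compatibility property described above.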
MPEG Surround for Multi-Channel Audio Coding-Part 2: Various Modes and Tools
Pang, Hee-Suk
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 610~617
An overview of the various modes and tools of MPEG Surround is provided. Because the binaural mode of MPEG Surround supports virtual 5.1-channel playback based on HRTFs, it enables playback via headphones and earphones on portable audio devices. MPEG Surround also supports the enhanced matrix mode, which converts stereo signals to 5.1-channel signals without side information; the 3D stereo mode, which deals with 3D-coded signals; and the low-power version, which greatly reduces the computational load of the decoding process. In addition, MPEG Surround provides the arbitrary downmix gains (ADG) tool, which is applied to artistic downmix signals; the matrix compatibility tool, which is applied to downmix signals produced by conventional matrix-based methods; the residual coding tool, which can be used at high bit rates; and the GES tool, which is applied to specific sounds such as applause. Listening test results from various companies and organizations are also presented for the important modes and tools.
MPEG-4 ALS - The Standard for Lossless Audio Coding
Liebchen, Tilman
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 618~629
The MPEG-4 Audio Lossless Coding (ALS) standard belongs to the family of MPEG-4 audio coding standards. In contrast to lossy codecs such as AAC, which merely strive to preserve the subjective audio quality, lossless coding preserves every single bit of the original audio data. The ALS core codec is based on forward-adaptive linear prediction, which combines remarkable compression with low complexity. Additional features include long-term prediction, multichannel coding, and compression of floating-point audio material. This paper describes the basic elements of the ALS codec with a focus on prediction, entropy coding, and related tools, and points out the most important applications of this standardized lossless audio format.
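The forward-adaptive prediction at the heart of ALS can be illustrated with a block-wise least-squares predictor; this is a toy sketch, far simpler than the standard's quantized coefficient machinery, and the block length and predictor order are assumptions:

```python
import numpy as np

def lpc_residual(x, order=4):
    """Forward-adaptive linear prediction in the ALS spirit: fit
    predictor coefficients to the block by least squares, then return
    the low-energy residual that would be entropy-coded."""
    N = len(x)
    # Row for sample n holds [x[n-1], x[n-2], ..., x[n-order]].
    A = np.column_stack([x[order-1-i:N-1-i] for i in range(order)])
    target = x[order:]
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    residual = target - A @ coef        # what actually gets coded
    return coef, residual
```

For predictable material the residual carries far less energy than the signal, which is exactly what makes the subsequent entropy coding effective.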
Audio Object Coding Standard Technology - MPEG SAOC
Jung, Yang-Won; Oh, Hyen-O
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 630~639
This paper introduces MPEG SAOC (Spatial Audio Object Coding), which has been standardized in the MPEG audio subgroup. MPEG SAOC is a recent parametric coding technology, conceptually similar to PS (Parametric Stereo) and MPEG Surround. SAOC parameterizes and codes the spatial information of the object signals comprising a downmixed audio scene, thus letting users render their preferred scene in an interactive manner.
A Constant Modulus Algorithm Based on an Orthogonal Projection
Lim, Jun-Seok
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 640~645
The CMA (Constant Modulus Algorithm) is one of the best-known algorithms for blind channel equalization. In general, CMA converges slowly, and its convergence speed depends on the step size used in the CMA procedure. Many studies have attempted to accelerate convergence by applying a variable step size to CMA. In this paper, we propose a new CMA algorithm with improved convergence performance. The improvement comes from an orthogonal projection of an averaged error gradient. We demonstrate the improvement through simulation results.
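For readers unfamiliar with the baseline, the following sketch implements the standard fixed-step stochastic-gradient CMA update (the paper's orthogonal-projection refinement of the averaged error gradient is not reproduced here; tap count, step size, and initialization are illustrative assumptions):

```python
import numpy as np

def cma_equalize(x, num_taps=11, mu=1e-3, R=1.0):
    """Baseline CMA equalizer minimizing E[(|y|^2 - R)^2].
    x: received (channel-distorted) signal, R: target squared modulus.
    Returns the equalized output and the final tap weights."""
    w = np.zeros(num_taps, dtype=complex)
    w[num_taps // 2] = 1.0                      # center-spike initialization
    y = np.zeros(len(x), dtype=complex)
    for n in range(num_taps, len(x)):
        u = x[n - num_taps:n][::-1]             # regressor vector
        y[n] = np.dot(w.conj(), u)              # equalizer output
        e = np.abs(y[n])**2 - R                 # constant-modulus error
        w -= mu * e * np.conj(y[n]) * u         # stochastic gradient step
    return y, w
```

Because the update depends only on the output modulus, no training sequence is needed, which is what makes the algorithm blind; the slow convergence the abstract mentions is visible in how gently `w` moves when `mu` is small.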
Fatigue Life Optimization of Spot Welding Nuggets Considering Vibration Mode of Vehicle Subframe
Lee, Sang-Beom; Lee, Hyuk-Jae
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 646~652
In this paper, a welding-pitch optimization technique for a vehicle subframe is presented that considers the fatigue life of the spot welding nuggets. The fatigue life of the spot welding nuggets is estimated using a frequency-domain fatigue analysis technique. The input data used in the fatigue analysis are obtained by performing a dynamic analysis of a vehicle model traversing a Belgian road profile, together with a modal frequency response analysis of the finite element model of the vehicle subframe. Based on the fatigue life results obtained from the frequency-domain fatigue analysis, the design points for optimizing the weld pitch distance are determined. To obtain the welding-pitch combination that maximizes the fatigue life of the spot welding nuggets, a 4-factor, 3-level orthogonal array experimental design is used. This study shows that the optimized subframe improves the fatigue life of the welding nugget with the minimum fatigue life by about 65.8% compared with the baseline design.
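A 4-factor, 3-level orthogonal array experimental design conventionally means the standard L9(3^4) array, which reduces the full 3^4 = 81 runs to 9 while keeping every pair of factor levels balanced; a minimal sketch of its textbook construction (illustrative, not taken from the paper):

```python
import numpy as np
from itertools import product

def l9_orthogonal_array():
    """Standard L9(3^4) orthogonal array: 9 runs, 4 factors, levels 0-2.
    Columns 3 and 4 are linear combinations of the first two mod 3,
    which guarantees every pair of columns is balanced."""
    rows = []
    for a, b in product(range(3), range(3)):
        rows.append([a, b, (a + b) % 3, (a + 2 * b) % 3])
    return np.array(rows)
```

Each row prescribes one weld-pitch combination to simulate; fitting main effects to the 9 resulting fatigue lives then points to the best level of each factor.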
Robust Audio Watermarking Algorithm with Less Deteriorated Sound
Kang, Myeong-Su; Cho, Sang-Jin; Chong, Ui-Pil
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 653~660
This paper proposes a robust audio watermarking algorithm that provides copyright protection with little degradation of sound quality after a watermark is embedded into the original sound. The proposed method computes the FFT (fast Fourier transform) of the original sound signal and divides the spectrum into n subbands. It then calculates the energy of each subband and sorts the n subbands in descending order of energy. After calculating the energies, it chooses the top k subbands, finds p peaks in each selected subband, and embeds a length-m watermark around the p peaks. Listeners do not perceive any distortion when they hear the watermarked sound. Furthermore, the proposed method is as robust as Cox's method against MP3 compression, cropping, and echo attacks. In addition, the experimental results show that the proposed method is generally 10 dB higher than Cox's method in terms of SNR (signal-to-noise ratio).
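The subband-selection and peak-embedding steps can be sketched as follows; the multiplicative embedding rule `X * (1 + alpha*w)` is borrowed from Cox's method for illustration, and the parameter values (`n_sub`, `k`, `p`, `alpha`) are assumptions, not the authors' settings:

```python
import numpy as np

def embed_watermark(signal, wm, n_sub=16, k=4, p=2, alpha=0.01):
    """Illustrative spectral watermark embedding: FFT the signal, split
    the magnitude spectrum into n_sub subbands, pick the k most
    energetic subbands, find p magnitude peaks in each, and scale the
    peak bins multiplicatively by the watermark samples wm (+/-1)."""
    X = np.fft.rfft(signal)
    mag = np.abs(X)
    band = len(X) // n_sub
    energies = [np.sum(mag[b*band:(b+1)*band]**2) for b in range(n_sub)]
    chosen = np.argsort(energies)[::-1][:k]      # k highest-energy subbands
    wi = 0
    for b in sorted(chosen):
        lo = b * band
        peaks = lo + np.argsort(mag[lo:lo+band])[::-1][:p]
        for pk in peaks:                         # embed around each peak
            X[pk] *= (1 + alpha * wm[wi % len(wm)])
            wi += 1
    return np.fft.irfft(X, n=len(signal))
```

Embedding at high-energy peaks is what keeps the distortion masked: a small relative change to a strong component is far less audible than the same absolute change elsewhere.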
Estimation of Structural Properties from the Measurements of Phase Velocity and Attenuation Coefficient in Trabecular Bone
Lee, Kang-Il
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 661~667
Trabecular-bone-mimicking phantoms consisting of parallel-nylon-wire arrays were used to investigate correlations of phase velocity and attenuation coefficient with structural properties in trabecular bone. Trabecular separation (Tb.Sp) of the 7 trabecular-bone-mimicking phantoms ranged from 300 to
and volume fraction (VF) ranged from 1.6% to 8.7%. The phase velocity and attenuation coefficient of the phantoms were measured using a through-transmission method in water, with a matched pair of broadband unfocused transducers with a diameter of 12.7 mm and a center frequency of 1 MHz. Phase velocity and attenuation coefficient at 1 MHz decreased almost linearly with increasing Tb.Sp and increased almost linearly with increasing VF. Simple and multiple linear regression models with phase velocity and attenuation coefficient as independent variables and Tb.Sp and VF as dependent variables demonstrated that the coefficients of determination for the prediction of VF were higher than those for the prediction of Tb.Sp. The results obtained in the trabecular-bone-mimicking phantoms consisting of parallel-nylon-wire arrays were consistent with those in human trabecular bone, suggesting that the structural properties can be estimated from measurements of phase velocity and attenuation coefficient in trabecular bone.
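The coefficient of determination used to compare the simple and multiple regression models can be computed as below; this is a generic least-squares sketch exercised on synthetic inputs, not the paper's measurements:

```python
import numpy as np

def r_squared(X, y):
    """Coefficient of determination R^2 for a least-squares linear fit
    of y on the columns of X (an intercept column is added), i.e. the
    fraction of the variance of y explained by the predictors."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    ss_res = np.sum((y - A @ coef) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)        # total sum of squares
    return 1 - ss_res / ss_tot
```

Passing one predictor column gives the simple-regression R^2 and two columns the multiple-regression R^2, which is the comparison the abstract reports for Tb.Sp versus VF.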
Minima Controlled Speech Presence Uncertainty Tracking Method for Speech Enhancement
Lee, Woo-Jung; Chang, Joon-Hyuk
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 668~673
In this paper, we propose a minima-controlled speech presence uncertainty tracking method to improve speech enhancement. In conventional speech presence uncertainty tracking, distinct values of the a priori speech absence probability are estimated for different frames and channels; this estimation is inherently based on the a posteriori SNR and is used to estimate the speech absence probability (SAP). We propose a novel estimation of these distinct a priori speech absence probability values for different frames and channels, based on a minima-controlled speech presence uncertainty tracking method. This estimate is then applied to the calculation of the speech absence probability for speech enhancement. The performance of the proposed enhancement algorithm is evaluated with the ITU-T P.862 perceptual evaluation of speech quality (PESQ) under various noise environments. We show that the proposed algorithm yields better results than conventional speech presence uncertainty tracking.
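A minimal sketch of the "minima controlled" ingredient, assuming the running minimum of the input power spectrum over a sliding window of frames serves as a rough noise floor from which speech presence can be judged (the paper's exact estimator may differ):

```python
import numpy as np

def track_minima(psd_frames, window=8):
    """Sliding-window spectral minimum tracking: for each frame and
    frequency bin, take the minimum of the input PSD over the last
    `window` frames. Speech is intermittent, so this minimum tends to
    follow the noise floor even while speech is active."""
    psd = np.asarray(psd_frames, dtype=float)
    mins = np.empty_like(psd)
    for t in range(len(psd)):
        lo = max(0, t - window + 1)
        mins[t] = psd[lo:t + 1].min(axis=0)     # per-bin running minimum
    return mins
```

Comparing the current PSD against this floor (e.g. via their ratio) yields a per-bin indication of speech presence that can drive the a priori speech absence probability.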
Retrieval of Player Event in Golf Videos Using Spoken Content Analysis
Kim, Hyoung-Gook
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 674~679
This paper proposes a method of player event retrieval that combines two functions: detection of player names in the speech information and detection of sound events in the audio information of golf videos. The system consists of an indexing module and a retrieval module. At indexing time, audio segmentation and noise reduction are applied to the audio stream demultiplexed from the golf videos. The noise-reduced speech is then fed into a speech recognizer, which outputs spoken descriptors. The player names and sound events are indexed by these spoken descriptors. At search time, the text query is converted into phoneme sequences. The lists for each query term are retrieved through a description matcher to identify full and partial phrase hits. For the retrieval of player names, this paper compares the results of word-based, phoneme-based, and hybrid approaches.
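Full and partial phrase hits over phoneme sequences can be sketched as follows; the matcher, its `min_partial` threshold, and the index layout are illustrative assumptions, not the paper's implementation:

```python
def phoneme_hits(query, index, min_partial=3):
    """Toy phoneme-sequence matcher. For each indexed item (a list of
    phoneme strings), report a 'full' hit when the whole query sequence
    occurs contiguously, or a 'partial' hit when a contiguous query
    prefix of at least min_partial phonemes occurs."""
    def contains(seq, sub):
        return any(seq[i:i + len(sub)] == sub
                   for i in range(len(seq) - len(sub) + 1))
    results = {}
    for name, seq in index.items():
        if contains(seq, query):
            results[name] = "full"
        else:                       # try progressively shorter prefixes
            for k in range(len(query) - 1, min_partial - 1, -1):
                if contains(seq, query[:k]):
                    results[name] = "partial"
                    break
    return results
```

Matching at the phoneme level rather than the word level is what lets the system recover names the recognizer transcribed imperfectly, which is why the paper compares phoneme-based and hybrid approaches against the word-based baseline.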
A New Unified System of Acoustic Echo and Noise Suppression Incorporating a Novel Noise Power Estimation
Park, Yun-Sik; Chang, Joon-Hyuk
The Journal of the Acoustical Society of Korea, volume 28, issue 7, 2009, Pages 680~685
In this paper, we propose an efficient noise power estimation technique for an integrated acoustic echo and noise suppression system in the frequency domain. The proposed method uses the speech absence probability (SAP) derived from the microphone input signal as the smoothing parameter for updating the noise power, in order to reduce the noise power estimation error resulting from the distortions in a unified structure where the noise suppression (NS) operation is placed after the acoustic echo suppression (AES) algorithm. In the proposed approach, the smoothing parameter, based on the SAP derived from the input signal instead of the echo-suppressed signal, stops the updating of noise power estimates during periods of distorted noise spectrum. The performance of the proposed algorithm is evaluated by objective tests under various environments and yields better results than the conventional scheme.
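The SAP-driven smoothing described above can be sketched as a per-bin recursive update; the specific mapping from SAP to the smoothing parameter (`a_min` and the linear interpolation) is an assumption for illustration, not the authors' formula:

```python
import numpy as np

def update_noise_psd(noise_psd, input_psd, sap, a_min=0.85):
    """One-frame recursive noise PSD update whose smoothing parameter
    is driven by the speech absence probability (SAP), per bin:
    SAP ~ 1 (speech absent)  -> alpha = a_min, track the input quickly;
    SAP ~ 0 (speech present) -> alpha = 1, hold the previous estimate,
    so frames with a distorted spectrum are effectively skipped."""
    alpha = a_min + (1.0 - a_min) * (1.0 - sap)     # per-bin smoothing
    return alpha * noise_psd + (1.0 - alpha) * input_psd
```

Driving `sap` from the raw microphone signal rather than the echo-suppressed signal is the key point of the abstract: the update then halts during AES-distorted periods instead of absorbing the distortion into the noise estimate.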