Phonetics and Speech Sciences
The Korean Society of Speech Sciences
Volume 3, Issue 2 - Jun 2011
A study on the Suprasegmental Parameters Exerting an Effect on the Judgment of Goodness or Badness on Korean-spoken English
Kang, Seok-Han ; Rhee, Seok-Chae ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 3~10
This study investigates the role of suprasegmental features with respect to the intelligibility of Korean-spoken English judged by Korean and English raters as being good or bad. It has been hypothesized that Korean raters would have different evaluations from English native raters and that the effect may vary depending on the types of suprasegmental factors. Four Korean and four English native raters, respectively, took part in the evaluation of 14 Korean subjects' English speaking. The subjects read a given paragraph. The results show that the evaluation for 'intelligibility' is different for the two groups and that the difference comes from their perception of L2 English suprasegmentals.
The Influence of Chinese Falling-rising Tone on the Pitch of Sino-Korean Words Pronounced by Chinese Learners: Focusing on Same-form-same-meaning Words
Kim, Young-Joo ; Liu, Si-Yang ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 11~22
The purpose of this study is to determine how the Chinese falling-rising tone influences the pitch patterns of corresponding Sino-Korean words produced by Chinese learners of Korean. The scope of this research is limited to Chinese learners of Korean pronouncing same-form-same-meaning Sino-Korean words. In this study, Chinese learners pronounced both Chinese words and the corresponding Sino-Korean words. The learners' pitch patterns were recorded, analyzed using software, and compared with the tones of the corresponding Chinese words. Experimental results showed that Sino-Korean words starting with lenis sounds were affected by the Chinese 'falling-rising tone - high and level tone' pattern. In contrast, Sino-Korean words starting with aspirated sounds were affected by the Chinese 'falling-rising tone - high and level tone', 'falling-rising tone - falling-rising tone', and 'falling-rising tone - falling tone' patterns. In conclusion, the Chinese learners' pitch patterns of Sino-Korean words are affected by the Chinese falling-rising tone, especially when the Sino-Korean words start with aspirated sounds.
Artificial Neural Network Prediction of Midsagittal Pharynx Shape from Ultrasound Images for English Speech
Nam, Ho-Sung ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 23~28
Electromagnetometers (EMA) have been widely used in articulatory studies as their temporal resolution can capture most speech activities and the fleshpoint information allows one to readily quantify and analyze tongue shape. However, the drawback is that the data lacks details of activity in the pharyngeal region. Several studies have attempted to estimate the unknown pharyngeal shape of the tongue, but few studies are based on unimodal data containing both front and back regions of the tongue at the same time. We use Stone's ball bearing method to obtain fleshpoint data as well as tongue shape. We further introduce a novel way of connecting balls and attaching them onto the tongue to ensure accurate tracking. An Artificial Neural Network is applied to build a map between observable flesh-points, unknown tongue shape, and pharyngeal region and is optimized to efficiently address nonlinearity.
Coordinations of Articulators in Korean Place Assimilation
Son, Min-Jung ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 29~35
This paper examines several articulatory properties of /k/, known as a trigger of place assimilation as well as the object of post-obstruent tensing (/tk/), in comparison to non-assimilating controls (/kk/ and /kt/). EMMA data show that tongue body articulation in the place assimilation context robustly exhibits greater spatio-temporal articulation and a lower jaw position. The results showed several characteristics. Firstly, the constriction duration of the tongue body gesture in C2 of the assimilation context (/tk/) was longer than in the non-assimilating controls (/kk/ and /kt/). Secondly, constriction maxima also demonstrated greater constriction in the /tk/ sequences than in the control /kk/, but values similar to those of the control /kt/. In particular, the results showed a significant relationship between the two variables: the longer the constriction duration, the greater the constriction degree. Lastly, jaw height was lowest for the assimilating context /tk/, intermediate for the control /kk/, and highest for the control /kt/. The results suggest that speakers have lexical knowledge of place assimilation, producing a greater tongue body gesture in the spatio-temporal domains with lower jaw height as an indication of anticipating reduction of C1 in /tk/ sequences.
Pitch Accent Realization in North Kyungsang Korean: Tonal Alignment as a Function of Nasal Position in Syllables
Sohn, Hyang-Sook ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 37~52
This study investigates patterns of the alignment of the accentual peaks in bisyllabic words of the CVNCV, CVNV, and CVNNV structures in North Kyungsang Korean. Based on the tonal alignment, patterns of the F0 pitch excursion are discussed relative to one another. Issues are addressed concerning how the tonal targets are aligned, and how the tonal specifications of nasals in postvocalic, intervocalic, and prevocalic environments are supplied in the LH, HL, and HH classes. Tonal specification of nasals in various environments is accounted for by extension of the L target, displacement of the pitch peak, and interpolation between two tonal targets, depending on the tonal class. The results in this study provide preliminary evidence that the categorical alignment of the tonal targets is implemented by simply checking the presence or absence of a nasal before or after the nucleus vowel on the segmental string, without reference to the constituency of the nasal in the syllable structure. However, the prosodic structure has a key role to play in explaining speaker-dependent variations in the tonal alignment. Sensitivity to tautosyllabicity has an effect on the shape of the F0 contour, and disparity in the patterns of the pitch excursion is represented as a function of syllable structure correlated with segmental composition of the nasal.
The interlanguage Speech Intelligibility Benefit for Korean Learners of English: Production of English Front Vowels
Han, Jeong-Im ; Choi, Tae-Hwan ; Lim, In-Jae ; Lee, Joo-Kyeong ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 53~61
The present work is a follow-up study to that of Han, Choi, Lim and Lee (2011), where an asymmetry in the source segments eliciting the interlanguage speech intelligibility benefit (ISIB) was found such that the vowels which did not match any vowel of the Korean language were likely to elicit more ISIB than matched vowels. In order to identify the source of the stronger ISIB in non-matched vowels, acoustic analyses of the stimuli were performed. Two pairs of English front vowels, [i] vs. [I] and [ɛ] vs. [æ], were recorded by English native talkers and two groups of Korean learners according to their English proficiency, and then their vowel duration and the frequencies of the first two formants (F1, F2) were measured. The results demonstrated that the non-matched vowels such as [I] and [æ] produced by Korean talkers seemed to show more deviated acoustic characteristics from those of the natives, with longer duration and with closer formant values to the matched vowels, [i] and [ɛ], than those of the English natives. Combining the results of acoustic measurements in the present study and those of word identification in Han et al. (2011), we suggest that relatively better performance in word identification by Korean talkers/listeners than the native English talkers/listeners is associated with the shared interlanguage of Korean talkers and listeners.
Improvement of Rejection Performance using the Lip Image and the PSO-NCM Optimization in Noisy Environment
Kim, Byoung-Don ; Choi, Seung-Ho ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 65~70
Recently, audio-visual speech recognition (AVSR) has been studied as a way to cope with noise problems in speech recognition. In this paper, we propose a novel method of deciding the weighting factors for audio-visual information fusion. We apply particle swarm optimization (PSO) to determine the weighting factors. The AVSR experiments show that PSO-based normalized confidence measures (NCM) improve the rejection performance for mis-recognized words by 33%.
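As a rough illustration of searching for a fusion weight with particle swarm optimization, the following minimal sketch tunes a single weight w in [0, 1] for a linear combination of an audio score and a visual score. The linear fusion rule, the squared-error objective, the toy data, and all names (`fuse`, `toy_loss`, `pso_weight`) are assumptions made for this example; the abstract does not specify the paper's confidence measure or objective.

```python
import random

def fuse(audio_score, visual_score, w):
    """Linear late fusion (assumed form): w weights audio, 1 - w weights visual."""
    return w * audio_score + (1.0 - w) * visual_score

# Hypothetical toy data: (audio score, visual score, 1 if the word was correct).
TOY_DATA = [(0.9, 0.6, 1), (0.3, 0.8, 1), (0.4, 0.2, 0), (0.7, 0.1, 0)]

def toy_loss(w):
    """Squared error between the fused score and the correctness label."""
    return sum((fuse(a, v, w) - y) ** 2 for a, v, y in TOY_DATA)

def pso_weight(loss, n_particles=20, n_iters=50, seed=0):
    """Basic global-best PSO over one weight, with positions clamped to [0, 1]."""
    rng = random.Random(seed)
    pos = [rng.random() for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                      # personal best positions
    best_loss = [loss(p) for p in pos]     # personal best losses
    g = min(range(n_particles), key=lambda i: best_loss[i])
    g_pos, g_loss = best_pos[g], best_loss[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Inertia term plus pulls toward the personal and global bests.
            vel[i] = (0.5 * vel[i]
                      + 1.5 * r1 * (best_pos[i] - pos[i])
                      + 1.5 * r2 * (g_pos - pos[i]))
            pos[i] = min(1.0, max(0.0, pos[i] + vel[i]))
            cur = loss(pos[i])
            if cur < best_loss[i]:
                best_pos[i], best_loss[i] = pos[i], cur
                if cur < g_loss:
                    g_pos, g_loss = pos[i], cur
    return g_pos

best_w = pso_weight(toy_loss)
```

The inertia and acceleration constants (0.5, 1.5, 1.5) are common textbook defaults, not values reported by the authors.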
Speaker Identification Using an Ensemble of Feature Enhancement Methods
Yang, IL-Ho ; Kim, Min-Seok ; So, Byung-Min ; Kim, Myung-Jae ; Yu, Ha-Jin ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 71~78
In this paper, we propose an approach that constructs classifier ensembles from various channel compensation and feature enhancement methods. CMN and CMVN are used as channel compensation methods. PCA, kernel PCA, greedy kernel PCA, and kernel multimodal discriminant analysis are used as feature enhancement methods. The proposed ensemble system combines 15 classifiers, one for each pairing of three channel compensation options (including 'without compensation') with five feature enhancement options (including 'without enhancement'). Experimental results show that the proposed ensemble system gives the highest average speaker identification rate across various environments (channels, noises, and sessions).
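To make the 3 x 5 = 15 count concrete, the sketch below enumerates the front-end combinations and shows the textbook per-utterance forms of CMN (cepstral mean normalization) and CMVN (cepstral mean and variance normalization). The function names, the list-of-frames data layout, and the `eps` guard are assumptions for illustration; the abstract does not describe the authors' implementation or how the classifier outputs are combined.

```python
import itertools
import statistics

def cmn(frames):
    """Subtract the per-coefficient mean over all frames of one utterance."""
    means = [statistics.fmean(col) for col in zip(*frames)]
    return [[x - m for x, m in zip(frame, means)] for frame in frames]

def cmvn(frames, eps=1e-8):
    """Mean-normalize, then divide by the per-coefficient standard deviation."""
    columns = list(zip(*frames))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    # eps (an assumed guard) avoids division by zero for constant coefficients.
    return [[(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
            for frame in frames]

# Option labels follow the abstract; "none" stands for the 'without' cases.
CHANNEL_OPTIONS = ["none", "CMN", "CMVN"]
ENHANCEMENT_OPTIONS = ["none", "PCA", "kernel PCA", "greedy kernel PCA",
                       "kernel multimodal discriminant analysis"]

# One classifier per (channel compensation, feature enhancement) pair.
COMBINATIONS = list(itertools.product(CHANNEL_OPTIONS, ENHANCEMENT_OPTIONS))
```

Here `len(COMBINATIONS)` is 15, matching the number of classifiers stated in the abstract.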
Design and Implementation of a Usability Testing Tool for User-oriented Design of Command-and-Control Voice User Interfaces
Lee, Myeong-Ji ; Hong, Ki-Hyung ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 79~87
Recently, usability has become very important in voice user interface systems. In this paper, we design and implement a Wizard-of-Oz (WOZ) usability testing tool for command-and-control voice user interfaces. We propose VUIDML (Voice User Interface Design Markup Language) for designing the usability test scenarios of command-and-control voice interfaces in the early design stages. Highly satisfactory voice user interfaces require selecting the voice commands and prompts that users prefer most. In VUIDML, we can specify possible prompt candidates. The WOZ usability testing tool can also be used to collect user-preferred voice commands and feedback from real users.
Implementation of Voice Source Simulator Using Simulink
Jo, Cheol-Woo ; Kim, Jae-Hee ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 89~96
In this paper, details of the design and implementation of a voice source simulator using Simulink and Matlab are discussed. The simulator is implemented following the model-based design concept. Voice sources can be analyzed and manipulated through various factors by choosing options from GUI input and selecting pre-defined or user-created blocks. This kind of simulation tool can simplify the procedure of analyzing speech signals for various purposes such as voice quality analysis, pathological voice analysis, and speech coding. Basic analysis functions are also supported for comparing the original signal with the manipulated ones.
The Therapeutic Effects of Vegetative Phonation in Patients with Mutational Dysphonia
Kim, Seong-Tae ; Nam, Soon-Yuhl ;
Phonetics and Speech Sciences, volume 3, issue 2, 2011, Pages 99~105
Vegetative phonation is typically regarded as useful in treating patients with mutational dysphonia, but its effect has not yet been studied. This study attempts to identify the effect of vegetative phonation using throat clearing and laughing in patients with mutational dysphonia. The study, which was designed by the author, included 26 patients aged from 14 to 32 years (mean: 18.7 years) who had been diagnosed with mutational dysphonia between January 2007 and June 2010. Voice therapy for these patients included vegetative phonation, ranging from two to seven sessions (mean: 3.8 sessions). Results were evaluated by videostroboscopy, perceptual evaluation on the GRBAS scale, aerodynamic tests, and acoustic analysis before and after therapy. Most patients could phonate with low pitch from the beginning and sustain a normal-pitch sound in the last session. We found that the glottic gap and the anterior-posterior compression of the superior laryngeal part observed at the first session were reduced after therapy, and these patients had complete closure of the glottis after treatment. The results of acoustic and aerodynamic measures after treatment indicated significant decreases in F0, Jitter, Shimmer, SFF, and SPI, and increases in MPT, Psub, and vocal efficiency (p<.05). Vegetative phonation may be a useful treatment method in managing mutational dysphonia. We suggest that this technique may also be useful in improving the voice quality of other functional dysphonias with a glottal chink, or of functional aphonia.