Phonetics and Speech Sciences
The Korean Society of Speech Sciences
Volume 4, Issue 1 - Mar 2012
The phonetic realization of English unstressed vowels produced by Korean advanced learners : A comparative study of English words and English loanwords
Kang, Sun-Mi ; Kang, Ji-Eun ; Kim, Kee-Ho ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 3~11
DOI : 10.13064/KSSS.2012.4.1.003
The aim of this paper is to examine the phonetic realizations of English unstressed vowels produced by advanced Korean learners (KLs) of English compared with English native speakers (NSs), focusing on a comparison of English words and English loanwords. The results show that KLs are usually not native-like in producing the English unstressed vowel /ə/ and that loanword orthography affects the way the KLs produce /ə/. The vowel quality of the unstressed vowels produced by the KLs differs from that of the NSs. In duration and pitch, KLs show significantly less difference between stressed and unstressed vowels than the NSs do. The KLs usually have a high pitch on the stressed and the last syllable, while the NSs usually produce peak F0 on the stressed syllable. When the KLs' vowel quality is similar to that of the NSs, they produce unstressed vowels with a shorter duration. However, there is no correlation between the realization of pitch and vowel quality in the KLs' speech.
Perception of Korean stops with a three-way laryngeal contrast
Kong, Eun-Jong ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 13~20
DOI : 10.13064/KSSS.2012.4.1.013
The lax stop in Korean, one of three laryngeally contrastive stops, has undergone a sound change in its acoustic properties. Prior production studies described this recent lax stop as being differentiated from tense and aspirated stops primarily by fundamental frequency (f0). In addition, voice onset time (VOT) further separates tense stops from lax and aspirated stops. The current research explores how these two major acoustic parameters, f0 and VOT, cue the three stop categories in Korean adult listeners' perception. Thirty-one native speakers of Korean participated in two experimental tasks: categorization judgment and within-category goodness rating. Two sets of audio stimuli were prepared by synthesizing English and Korean male speakers' CV productions. The findings showed that while f0 cues listeners to lax stops as production patterns would predict, VOT was also closely related to listeners' categorization and goodness ratings of lax stops. This suggests that accurate characterizations of the recent lax stop category need to be based on Korean speakers' perceptual behavior as well as production patterns.
Pitch and Formant Trajectories of English Vowels by American Males with Different Speaking Styles
Yang, Byung-Gon ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 21~28
DOI : 10.13064/KSSS.2012.4.1.021
Many previous studies have reported acoustic parameters of English vowels produced in a clear speaking style. In everyday usage, however, we produce speech sounds in various speaking styles, and different styles may yield different acoustic measurements. This study examines pitch and formant trajectories of eleven English vowels produced by nine American males in order to understand acoustic variation across clear and conversational speaking styles. The author used Praat to obtain trajectories systematically at seven equidistant time points over the vowel segment while checking measurement validity. Results showed that pitch trajectories had distinct patterns depending on four speaking styles. Generally, higher pitch values were observed in the higher vowels, and pitch was higher in the clear speaking styles than in the conversational styles. The same trend was observed in the three formant trajectories of front vowels and the first formant trajectories of back vowels. The second and third formant trajectories of back vowels revealed an opposite or inconsistent trend, which might be attributable to coarticulation with the following consonant or to lip rounding gestures. The author tentatively concluded that people tend to produce vowels so as to enhance pitch and formant differences and thereby transmit their information clearly. Further perceptual studies on synthesized vowels with varying pitch and formant values are desirable to test this conclusion.
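The sampling step described in the abstract above (seven equidistant time points over a vowel segment) can be illustrated with a minimal Python sketch. It is not the author's Praat script: the function name `equidistant_times` and the `include_edges` switch are hypothetical, and whether the original measurements included the segment boundaries is an assumption left open here.

```python
def equidistant_times(start, end, n_points=7, include_edges=True):
    """Return n_points equally spaced times (in seconds) over a vowel segment.

    include_edges=True places the first and last points on the segment
    boundaries; include_edges=False keeps all points strictly inside the
    segment. Which variant the study used is not stated in the abstract.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    if end <= start:
        raise ValueError("end must be later than start")
    if include_edges:
        step = (end - start) / (n_points - 1)
        return [start + i * step for i in range(n_points)]
    step = (end - start) / (n_points + 1)
    return [start + (i + 1) * step for i in range(n_points)]


# Example: a vowel labelled from 1.20 s to 1.38 s in a recording.
times = equidistant_times(1.20, 1.38)
```

Each returned time could then be passed to whatever pitch or formant query the analysis tool provides, yielding one trajectory of seven values per vowel token.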
Error Correction and Praat Script Tools for the Buckeye Corpus of Conversational Speech
Yoon, Kyu-Chul ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 29~47
DOI : 10.13064/KSSS.2012.4.1.029
The purpose of this paper is to show how to convert the label files of the Buckeye Corpus of Conversational Speech into Praat format and to introduce some Praat scripts that will enable linguists to study various aspects of the spoken American English present in the corpus. During the conversion process, several types of errors were identified and corrected either manually or automatically by means of scripts. The Praat script tools that have been developed can help extract from the corpus large amounts of phonetic measures such as the VOT of plosives, the formants of vowels, word frequency information, and speech rates that span several consecutive words. The script tools can also extract additional information about the phonetic environment of the target words or allophones.
Acoustic Characteristics of Korean Compounds and Phrases
Yi, So-Pae ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 49~54
DOI : 10.13064/KSSS.2012.4.1.049
Recent studies on acoustic correlates of stress in English compounds and English phrases have revealed differences in acoustic manifestation between the two under different intonation patterns. However, little effort has been made to compare Korean compounds and Korean phrases in different intonational environments. Therefore, this study analyzes the acoustic characteristics of Korean compounds and Korean phrases produced in different intonational sentence patterns (Subject, Question, Clause-Final, and Statement-Final). Measurements of vowel duration, intensity (dB) and pitch (in semitones) were compared. The results of the experiment, in which 30 native speakers of Korean pronounced Korean compounds and Korean phrases (obtained from sentences) in controlled prosodic and intonational environments, reveal clear patterns that distinguish Korean compounds from Korean phrases and support the evidence of acoustic salience for phrases. Duration differences turned out to be a significant cue for distinguishing Korean compounds from Korean phrases in all but the Clause-Final position. In terms of effect size, duration ratio is the most reliable cue for distinguishing Korean compounds from Korean phrases, followed by the pitch difference between the first and second syllables and by the intensity ratio. Implications for Korean and English intonation training are also discussed.
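The cues named in the abstract above (pitch in semitones, duration ratio, pitch difference between the first and second syllables) can be made concrete with a short Python sketch. It uses the standard conversion of 12 semitones per octave; the function names and the 100 Hz reference are illustrative assumptions, not details taken from the paper.

```python
import math


def hz_to_semitones(f_hz, ref_hz=100.0):
    """Convert a frequency in Hz to semitones relative to ref_hz.

    The 100 Hz reference is an arbitrary choice for this sketch; the
    abstract does not say which reference the study used.
    """
    if f_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_hz / ref_hz)


def semitone_difference(f1_hz, f2_hz):
    """Pitch difference (second minus first syllable) in semitones.

    The reference frequency cancels out, so none is needed here.
    """
    if f1_hz <= 0 or f2_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f2_hz / f1_hz)


def duration_ratio(first_s, second_s):
    """Ratio of the first syllable's vowel duration to the second's."""
    if first_s <= 0 or second_s <= 0:
        raise ValueError("durations must be positive")
    return first_s / second_s
```

Working in semitones rather than Hz makes pitch differences comparable across speakers with different baseline pitch, which is presumably why the study reports pitch on that scale.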
Prosodic Disambiguation of Low versus High Syntactic Attachment across Lexical Biases in English
Jeon, Yoon-Shil ; Yoon, Kyu-Chul ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 55~65
DOI : 10.13064/KSSS.2012.4.1.055
In this study, the prosodic disambiguation of syntactic attachment differences was investigated in relation to the effect of lexical bias. Speech materials were composed of N1-conj-N2-PP phrases such as "walkers and runners with dogs." The results show that durational patterns are dominant over pitch patterns in differentiating the attachments. The characteristic pitch contour in the high attachment was a rise and fall over N1 and N2. The pitch contour in the low attachment was a rise and fall over N2 and N3, although such patterns were less frequent in the low attachment case. For the durational pattern, lengthening in the N2 region plays a significant role in disambiguating the syntactic attachments. The interaction between lexical bias and syntactic attachment was not statistically significant in the duration data.
The acoustic realization of the Korean sibilant fricative contrast in Seoul and Daegu
Holliday, Jeffrey J. ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 67~74
DOI : 10.13064/KSSS.2012.4.1.067
The neutralization of /sʰ/ and /s*/ in Gyeongsang dialects is a culturally salient stereotype that has received relatively little attention in the phonetic literature. The current study is a more extensive acoustic comparison of the sibilant fricative productions of Seoul and Gyeongsang dialect speakers. The data presented here suggest that, at least for young Seoul and Daegu speakers, there are few inter-dialectal differences in sibilant fricative production. These conclusions are supported by the output of mixed effects logistic regression models that used aspiration duration, spectral mean of the frication noise, and H1-H2 of the following vowel to predict fricative type in each dialect. The clearest dialect difference was that Daegu speakers' /sʰ/ and /s*/ productions had overall shorter aspiration durations than those of Seoul speakers, suggesting the opposite of the traditional "/s*/ produced as [sʰ]" stereotype of Gyeongsang dialects. Further work is needed to investigate whether /sʰ/-/s*/ neutralization in Daegu is perceptual rather than acoustic in nature.
Music Recognition Using Audio Fingerprint: A Survey
Lee, Dong-Hyun ; Lim, Min-Kyu ; Kim, Ji-Hwan ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 77~87
DOI : 10.13064/KSSS.2012.4.1.077
Interest in music recognition has grown dramatically since NHN and Daum released their mobile applications for music recognition in 2010. Methods for music recognition based on audio analysis fall into two categories: music recognition using audio fingerprints and Query-by-Singing/Humming (QBSH). While music recognition using audio fingerprints takes recorded music as its input, QBSH takes a melody sung or hummed by the user. In this paper, research trends in music recognition using audio fingerprints are described, focusing on two methods: one based on fingerprint generation using the energy difference between consecutive bands, and the other based on hash key generation between peak points. Details presented in the representative papers of each method are introduced.
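The first method mentioned in the abstract above, fingerprint generation from the energy difference between consecutive bands, can be sketched as follows. This is one widely described formulation, in which each bit is the sign of the band-energy difference compared across two successive frames; whether it matches the surveyed papers in every detail is an assumption. The input is taken to be precomputed band energies per frame, and the function names are hypothetical.

```python
def subfingerprints(band_energies):
    """Derive binary sub-fingerprints from per-frame band energies.

    band_energies[n][m] is the energy of band m in frame n. For each
    frame n >= 1 and each adjacent band pair (m, m + 1), the bit is 1 if
        (E[n][m] - E[n][m+1]) - (E[n-1][m] - E[n-1][m+1]) > 0
    and 0 otherwise. Returns one list of bits per frame from n = 1 on.
    """
    if not band_energies:
        return []
    n_bands = len(band_energies[0])
    if any(len(frame) != n_bands for frame in band_energies):
        raise ValueError("all frames must have the same number of bands")
    result = []
    for n in range(1, len(band_energies)):
        cur, prev = band_energies[n], band_energies[n - 1]
        bits = []
        for m in range(n_bands - 1):
            diff = (cur[m] - cur[m + 1]) - (prev[m] - prev[m + 1])
            bits.append(1 if diff > 0 else 0)
        result.append(bits)
    return result


def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits between two equally shaped fingerprints."""
    flat_a = [b for frame in fp_a for b in frame]
    flat_b = [b for frame in fp_b for b in frame]
    if len(flat_a) != len(flat_b) or not flat_a:
        raise ValueError("fingerprints must be non-empty and equally sized")
    return sum(x != y for x, y in zip(flat_a, flat_b)) / len(flat_a)
```

Matching a query against a database then amounts to finding the stored fingerprint block with the lowest bit error rate, which tolerates moderate noise because only the signs of energy differences are kept.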
Vocal acoustic characteristics of speakers with depression
Baek, Yeon-Sook ; Kim, Se-Joo ; Kim, Eun-Yeon ; Choi, Yae-Lin ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 91~98
DOI : 10.13064/KSSS.2012.4.1.091
The purpose of this paper is to compare the voice characteristics of speakers with and without depression, and to propose an objective method for measuring therapeutic effects and for diagnosing depression based on those characteristics. Voice samples obtained from 11 female speakers with depression, aged 20 to 40 and diagnosed with major depressive disorder by a psychiatrist, were compared with those from 12 normal controls matched for sex, age, height, weight, education, smoking, and drinking. The voice samples were recorded with a portable digital recorder (TASCAM DR-07, Japan) and analyzed using the MDVP (Multi-Dimensional Voice Program) software module of the CSL (Computerized Speech Lab, Kay Elemetrics Co., Model 4100). The results are as follows. First, the average speaking fundamental frequency and loudness range of the group with depression were statistically significantly lower than those of the control group. The pitch range of the control group was somewhat higher than that of the group with depression, but without statistical significance. Overall speech rate did not differ statistically between the two groups. Second, the average speaking fundamental frequency and loudness range had a statistically significant negative correlation with the Beck Depression Inventory, i.e., more severe depression was associated with lower average speaking fundamental frequency and loudness range. Other vocal parameters, such as pitch range and overall speech rate, had no statistically meaningful correlation with the Beck Depression Inventory.
Gender Differences in Risk Factors of Self-reported Voice Problems
Byeon, Hae-Won ; Hwang, Young-Jin ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 99~108
DOI : 10.13064/KSSS.2012.4.1.099
Recent research has identified self-reported voice problems as a risk indicator for voice disorders. However, previous studies of the general population did not take into account the influence of gender on self-reported voice problems. The purpose of the present cross-sectional study was to determine gender differences in risk factors for self-reported voice problems in the Korean adult population using national survey data. This study used data from the Korea National Health and Nutrition Examination Survey 2008. Subjects included 3,622 people (1,508 male and 2,114 female) aged 19 years and older living in the community. Data were analyzed using t-tests, one-way ANOVA, and multiple logistic regression. The prevalence of self-reported voice problems was 5.9% in males and 8.1% in females; females had a higher prevalence of self-reported voice problems than males. After adjusting for covariates, in males, age (OR=2.47, 95% CI: 1.07-5.70) and pain and discomfort during the last two weeks (OR=3.64, 95% CI: 2.20-6.01) were independently associated with self-reported voice problems (p<0.05). In females, age (OR=1.96, 95% CI: 1.18-3.26), education (OR=2.09, 95% CI: 1.06-4.12), smoking (OR=2.70, 95% CI: 1.48-4.93), thyroid disorders (OR=2.58, 95% CI: 1.47-4.53), and pain and discomfort during the last two weeks (OR=1.75, 95% CI: 1.21-2.54) were independently associated with self-reported voice problems (p<0.05). Risk factors related to self-reported voice problems thus differed by gender. These findings suggest a need for program strategies that reflect gender differences in self-reported voice problems.
Variance characteristics of speaking fundamental frequency and vocal intensity depending on utterance conditions
Lee, Moo-Kyung ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 111~118
DOI : 10.13064/KSSS.2012.4.1.111
The purpose of this study was to characterize the variances of speaking fundamental frequency and vocal intensity depending on gender and three utterance conditions (spontaneous speech, reading, and counting). A total of 65 undergraduate students (32 male, 33 female) attending universities in Daegu, South Korea participated in this study. The subjects were all in their 20s. KayPENTAX's Visi-Pitch IV (Model 3950) was used to measure the variances of speaking fundamental frequency (SFF0) and vocal intensity (VI). The findings were as follows. First, neither males nor females showed a significant difference in SFF0 or vocal intensity among the three utterance conditions. Second, in a comparison of SFF0 variances between males and females, females showed significantly higher levels of the four measured SFF0 variances than males in spontaneous speech. However, there was no significant difference between males and females in SFF0 range in reading, or in SFF0 SD and SFF0 range in counting. There was also no significant difference between males and females in the measured variances of vocal intensity in any utterance condition. Finally, differences in the variances of SFF0 and vocal intensity were compared among utterance conditions. In males, all the measured variances of SFF0 decreased significantly in the order of spontaneous speech, reading, and counting (SFF0 SD: p<.001, SFF0 range: p<.05, Max SFF0: p<.05). Females, however, showed no significant difference in the measured variances of SFF0 across the three utterance conditions. In females, the measured variances of vocal intensity decreased significantly in the order of spontaneous speech, reading, and counting (VI SD: p<.001, VI range: p<.001, Min VI: p<.01, Max VI: p<.05), while males showed no significant difference in the measured variances of vocal intensity across the three utterance conditions. In sum, these findings suggest that variances of SFF0 in males, and variances of vocal intensity in females, are affected by the three utterance conditions.
A comparison of the voice difference of persons with Idiopathic Parkinson's disease and a normal group in five vowels
Lee, In-Ae ; Kim, Moon-Jeoung ; Hwang, Young-Jin ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 119~124
DOI : 10.13064/KSSS.2012.4.1.119
The purpose of this study is to compare the voices of persons with idiopathic Parkinson's disease and a normal group across five vowels. Eight persons with idiopathic Parkinson's disease and a healthy control group of 22 were selected, and every voice sample was analyzed with MDVP. First, jitter measurements differed statistically significantly between the two groups for all vowels. Second, shimmer measurements differed statistically significantly between the two groups for nearly all vowels. Third, jitter measurements across the five vowels were more closely correlated in persons with idiopathic Parkinson's disease than in the normal group. Fourth, shimmer measurements across the five vowels were more closely correlated in persons with idiopathic Parkinson's disease than in the normal group.
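For readers unfamiliar with the two measures compared in the abstract above, the following Python sketch computes jitter and shimmer using the common textbook definition of local perturbation: the mean absolute difference between consecutive cycles divided by the mean value, expressed as a percentage. MDVP's own implementation may differ in detail, and the function name and sample values are illustrative assumptions.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean, in %.

    Pass glottal cycle periods (seconds) to obtain local jitter, or
    per-cycle peak amplitudes to obtain local shimmer.
    """
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    mean_value = sum(values) / len(values)
    if mean_value <= 0:
        raise ValueError("mean of the values must be positive")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    return 100.0 * mean_diff / mean_value


# Hypothetical per-cycle measurements from a sustained vowel.
periods_s = [0.0100, 0.0101, 0.0099, 0.0100, 0.0102]
amplitudes = [0.80, 0.78, 0.81, 0.79, 0.80]

jitter_percent = local_perturbation_percent(periods_s)
shimmer_percent = local_perturbation_percent(amplitudes)
```

A perfectly periodic voice would give 0% on both measures; larger values indicate more cycle-to-cycle instability in frequency (jitter) or amplitude (shimmer).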
Comparison of Maximum Phonation Time Associated with the Changes in Vocal Intensity in Patients with Unilateral Vocal Fold Palsy and Sulcus Vocalis
Choi, Se-Jin ; Choi, Hong-Shik ; Kim, Jae-Ock ; Choi, Yae-Lin ;
Phonetics and Speech Sciences, volume 4, issue 1, 2012, Pages 125~131
DOI : 10.13064/KSSS.2012.4.1.125
A key feature of patients with incomplete glottic closure is reduced maximum phonation time (MPT), because their airflow rate or air leakage is greater than in people without voice disorders. They may also have problems regulating intensity. This study analyzed differences in MPT at comfortable and louder intensity, and the correlation between MPT and respiration volume, in unilateral vocal fold palsy (UVFP) and sulcus vocalis (SV) groups. Twenty patients with UVFP, 21 with SV, and 21 normal subjects performed an /a/ vowel prolongation task at comfortable and louder intensity to measure MPT, and respiratory measures including FVC were taken to analyze the correlation between MPT and respiration volume. First, the MPT of the normal group was statistically significantly longer than that of the patient groups at comfortable intensity, but there was no statistically significant difference in MPT between groups at louder intensity. Second, there was a statistically significant correlation between MPT at comfortable intensity and MPT at louder intensity, but no statistically significant correlation between intensity and respiration volume. These results support previous findings that the shorter MPT of the patient groups relative to the normal group originates in a problem of the laryngeal valving mechanism at the level of the vocal folds rather than in a problem of respiratory function. The results also suggest that, when phonating at varied intensity, the patient groups' MPT improved at louder intensity because of an increased glottal closure ratio. These results provide a theoretical basis for clinicians to vary intensity in voice evaluation and voice therapy for patients with glottal incompetence.