Phonetics and Speech Sciences
The Korean Society of Speech Sciences
Dysphagia Handicap Index and Swallowing Characteristics based on Laryngeal Functions in Korean Elderly
Kim, Geun-Hee ; Choi, Seong Hee ; Lee, Kyoung-Jae ; Choi, Chul-Hee ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 3~12
DOI : 10.13064/KSSS.2014.6.3.003
The larynx plays an important role in phonation and in protecting the respiratory tract during swallowing. Reduced anatomical and physiological function in laryngeal elevation and glottal closure can cause voice and swallowing problems. The present study investigated the Korean version of the Dysphagia Handicap Index (DHI) in elderly Koreans. Sixty normal elderly Koreans aged 65 to 95 and 20 normal young Korean adults aged 20 to 25 participated, and total (T), physical (P), functional (F), and emotional (E) index scores were compared between the two groups as well as among elderly subgroups (60s, 70s, 80s). For swallowing, total and subscale DHI scores, voice quality during /a/ phonation following swallowing (saliva and water), intensity of coughing, and L-DDK were measured. The results showed that functional (F), physical (P), and emotional (E) scores as well as the total (T) score differed significantly between young and elderly adults (p<.05). Additionally, the total DHI score correlated negatively with intensity of coughing (r=-.51) and with L-DDK (r=-.70). These findings suggest that a slow rate of vocal fold adduction and reduced coughing intensity in the elderly affect swallowing function. Thus, the recently translated Korean version of the DHI may be useful as a supplement in evaluating swallowing problems in elderly people.
Sentence interpretation strategies by typically developing and late-talking Korean toddlers
Jo, Sujung ; Hwang, Mina ; Choi, Kyung-Soon ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 13~21
DOI : 10.13064/KSSS.2014.6.3.013
Late talkers are young children whose expressive language skills are delayed despite normal nonverbal cognitive ability, adequate hearing, and typical personality development. The purpose of this study is to investigate the sentence interpretation strategies used by Korean-speaking late talkers and age-matched normal children. Nine late talkers and nine normal children matched by age at 30-35 months participated in this study. Twenty-seven simple noun-noun-verb (NNV) sentences were generated by factorial combination of case marker [nominative case marker on the first noun and accusative on the second (C1), accusative on the first noun and nominative on the second (C2), and no case markers on either noun (C0)] and animacy of the nouns [animate-inanimate (AI), inanimate-animate (IA), animate-animate (AA)]. All the children were asked to "act out" their interpretation of each sentence. For each sentence type, the percentage of choices of the first noun as the agent was calculated. The results of a group (2) × case-marker (3) mixed ANOVA showed significant main effects for 'animacy' and 'case marker' and a significant 'group (2) × case-marker (3)' interaction. The late talkers relied on semantic (animacy) cues in their interpretation of the sentences, while their normal peers used both animacy and grammatical morpheme (case-marker) cues. The results indicated that the late talkers' comprehension skills were also delayed.
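The stimulus design and the dependent measure described in this abstract can be laid out as a small sketch. It assumes the 27 sentences are spread evenly over the nine case-marker × animacy cells (three per cell), which the abstract does not state explicitly; the function names are hypothetical.

```python
from itertools import product

CASE_MARKING = ["C1", "C2", "C0"]  # condition labels taken from the abstract
ANIMACY = ["AI", "IA", "AA"]
ITEMS_PER_CELL = 3  # assumption: 27 sentences / 9 cells


def build_design():
    """Return one (case_marking, animacy, item_number) tuple per sentence."""
    return [
        (case, animacy, item)
        for case, animacy in product(CASE_MARKING, ANIMACY)
        for item in range(1, ITEMS_PER_CELL + 1)
    ]


def first_noun_agent_pct(responses):
    """Percentage of trials on which the first noun was acted out as agent.

    `responses` is a list of booleans, one per trial in a condition.
    """
    if not responses:
        raise ValueError("at least one response is required")
    return 100.0 * sum(responses) / len(responses)
```

Computing `first_noun_agent_pct` separately for each cell and each child would give the percentages that feed the mixed ANOVA.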
The Prosodic Characteristics of Pre-school Age Children-Related Adults
Kim, Jiwon ; Seong, Cheoljae ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 23~32
DOI : 10.13064/KSSS.2014.6.3.023
This study presents the prosodic characteristics of 'Motherese' and 'Teacherese' (the speech of child care teachers and kindergarten teachers). Twenty-one mothers and 24 teachers spoke to children in a child care center or kindergarten. The children were aged 4;00-6;11. Speech and articulation rate, number of accentual phrases (APs), number of intonational phrases (IPs), pitch-related factors (f0, pitch range, f0 standard deviation), and intonation slope (mean Absolute, f0, q-tone slope) were measured. The two groups produced two sentence types (interrogative: alternative question; declarative: coordinated sentence) in two situations (one with the children present, the other without children but speaking as if the children were in front of them). The results indicate that teachers show more noticeable prosodic characteristics than mothers do.
Alternating Motion Rate Characteristics in Children with Childhood Apraxia of Speech
Park, Junbeom ; Ha, Seunghee ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 33~40
DOI : 10.13064/KSSS.2014.6.3.033
The purpose of the study was to examine alternating motion rate (AMR) and its variability in children with childhood apraxia of speech (CAS) compared to typically developing children. Six children with CAS aged 9-12 years and 10 age-matched children participated in the study. This study measured tokens per second and the variability of the rates during the production of /[...]a/, and /[...]a/. For the variability measures, each participant was asked to repeat the speech tasks three times, and the average rate and its standard deviation were obtained. The results revealed that the CAS group showed a slower rate than the control group only for /[...]a/. The CAS group exhibited greater variability of AMR than the control group in all tasks. The results suggest that variability of AMR might be a more distinctive speech feature of children with CAS than the rate of the speech task.
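The rate and variability measures described in this abstract (tokens per second, averaged over three repetitions, with a standard deviation) can be sketched as follows. The function names are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the abstract does not say which form was used.

```python
from statistics import mean, stdev


def amr_rate(token_count, duration_s):
    """Alternating motion rate in tokens (syllable repetitions) per second."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return token_count / duration_s


def amr_summary(trials):
    """Mean rate and its variability across repeated trials.

    `trials` is a list of (token_count, duration_s) pairs, one per repetition.
    Uses the sample standard deviation, which is an assumption.
    """
    rates = [amr_rate(count, duration) for count, duration in trials]
    return mean(rates), stdev(rates)
```

For example, three 4-second trials with 20, 24, and 22 tokens give rates of 5.0, 6.0, and 5.5 tokens per second, so the summary is a mean of 5.5 with a standard deviation of 0.5.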
The Correlation of Voice Characteristics and Depression Index Analysis in Accordance with Menstrual Cycle
Kim, YuMi ; Jang, Seoung-Jin ; Kim, Eunyeon ; Choi, Yaelin ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 41~48
DOI : 10.13064/KSSS.2014.6.3.041
This study investigated differences in the emotional parameters BDI, VHI, STAI-X-I, and STAI-X-II across the menstrual cycle in females, and the relation between changes in the depression index and voice characteristics (jitter, shimmer, CPP, HNR, [...], sF0, sF4, sB1, [...]). Twenty-three females ([...] years old) living in Seoul and Gyeonggi Province participated in this study, answering the questionnaires and recording their voices. For the voice recording, the participants sustained the vowel /a/ for 5 seconds in a natural condition. Voice data were analyzed using Matlab and Praat. A t-test and a correlation analysis were conducted using SPSS. The results are as follows. First, the BDI was significantly higher in group I (luteal phase versus menstrual period) and group II (follicular phase versus menstrual period) than in group III (luteal phase versus follicular phase) (p<.05). Second, shimmer, CPP, and pF0 showed a statistically high correlation with the BDI in group I (luteal phase versus menstrual period). Voice parameters may be useful as a supplement in evaluating emotional change across the phases of the menstrual cycle.
Characteristics of respiration and phonation depending on smoking or non smoking by practical musicology students and general male students
Kim, Eunhye ; Choi, Hong-Shik ; Lim, Seong-Eun ; Choi, Yaelin ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 49~56
DOI : 10.13064/KSSS.2014.6.3.049
This research compared the features of respiration and phonation between practical musicology students and general male students according to their smoking status. The participants were 15 male practical musicology students attending [...] university and 16 general [...] university students. The participants, both non-smokers and smokers with a 5-year smoking history, had no history of voice disease and had normal cognitive function. The results indicated, first, that there was no notable difference in respiratory function (FVC, FEV1, FEV1/FVC) regardless of major or smoking status. MPT did not differ significantly by major, but it was significantly shorter in the smoker group than in the non-smoker group (p<.01). Second, major did not produce a significant difference in Fo, jitter, shimmer, or NHR in the vowel prolongation task. However, the smoker group showed significantly higher jitter and shimmer than the non-smoker group (p<.05), while Fo and NHR showed no difference. In the VRP, the maximum frequency and frequency range of the practical musicology group were significantly higher than those of the general group (p<.001). Moreover, although the difference in minimum frequency was not statistically significant, the practical musicology group tended toward a higher minimum frequency than the general group (p=.051). In conclusion, although respiratory function did not differ between the smoker and non-smoker groups, the MPT of the smoker group was shorter than that of the non-smoker group, and the smoker group showed higher jitter and shimmer. MPT is related to the valving action of the vocal folds on the airflow that passes through the glottis. Thus, the smoker group is interpreted as having lower voice quality and poorer vocal fold valving. In addition, the practical musicology group had a higher maximum frequency and a wider frequency range than the general group. This research can serve as basic data on the vocal characteristics of students majoring in voice-related fields.
Aerodynamic Characteristics of Whispered and Normal Speech during Reading Paragraph Tasks
Pyo, Hwayoung ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 57~62
DOI : 10.13064/KSSS.2014.6.3.057
The present study was performed to investigate and discuss the aerodynamic characteristics of whispered and normal speech during paragraph reading tasks. Thirty-nine normal females (18-23 yrs.) read the 'Autumn' paragraph with whispered and normal phonation. Their readings were recorded and analyzed with 'Running Speech' in the Phonatory Aerodynamic System (PAS) instrument. During whispered speech, the total duration was longer and inspirations were more frequent than in normal speech. The peak expiratory and inspiratory rates were higher in normal speech, but the expiratory and inspiratory volumes were higher in whispered speech. In the correlation analysis, both whispered and normal speech showed significantly high correlations between total duration and expiratory/inspiratory airflow duration, between number of inspirations and inspiratory airflow duration, and between expiratory and inspiratory volume. These results show that whispered speech needs more respiratory effort but shows poorer aerodynamic efficacy during phonation than normal speech.
The Utility of Perturbation, Non-linear dynamic, and Cepstrum measures of dysphonia according to Signal Typing
Choi, Seong Hee ; Choi, Chul-Hee ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 63~72
DOI : 10.13064/KSSS.2014.6.3.063
The current study assessed the utility of the acoustic analyses most commonly used in routine clinical voice assessment, including perturbation, nonlinear dynamic, and spectral/cepstral analysis, based on signal typing of dysphonic voices, and investigated the clinical applicability of these methods. A total of 70 dysphonic voice samples were classified by signal type using narrowband spectrograms. The traditional parameters %jitter, %shimmer, and signal-to-noise ratio (SNR) were calculated using TF32. The correlation dimension (D2), a nonlinear dynamic parameter, was also calculated, along with spectral/cepstral measures including mean CPP, CPP_sd, CPPf0, CPPf0_sd, L/H ratio, and L/H ratio_sd obtained with ADSV (Analysis of Dysphonia in Speech and Voice™). Auditory-perceptual analysis was performed by two blinded speech-language pathologists using GRBAS. The results showed that the nearly periodic Type 1 signals were all from functional dysphonia, while Type 4 signals comprised neurogenic and organic voice disorders. Only Type 1 voice signals were reliable for perturbation analysis in this study. Significant signal-type-related differences were found in all acoustic and auditory-perceptual measures. SNR, CPP, and L/H ratio values for Type 4 were significantly lower than those of the other signal types, and significantly higher %jitter and %shimmer were observed in Type 4 voice signals (p<.001). Additionally, as signal type increased, D2 values increased significantly, representing more complex and nonlinear patterns. Nevertheless, D2 could not be obtained for voice signals with a high noise component associated with breathiness. In particular, CPP was more sensitive to the voice qualities 'G', 'R', and 'B' than any other acoustic measure. Thus, spectral and cepstral analyses may be applied to more severely dysphonic voices such as Type 4 signals, and CPP can be a more accurate and predictive acoustic marker for measuring voice quality and severity in dysphonia.
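As an illustration of the perturbation measures named in this abstract, the sketch below computes a cycle-to-cycle perturbation percentage using the widely used "local" definition: the mean absolute difference between consecutive cycles divided by the mean cycle value. Applied to cycle periods it gives a jitter percentage, and applied to cycle peak amplitudes it gives a shimmer percentage. Whether TF32 uses exactly this definition is an assumption, and the function name is hypothetical.

```python
def perturbation_percent(cycle_values):
    """Local cycle-to-cycle perturbation as a percentage.

    Pass glottal cycle periods (seconds) for a jitter estimate, or cycle peak
    amplitudes for a shimmer estimate. Assumed definition: mean absolute
    difference of consecutive values divided by the mean value, times 100.
    """
    if len(cycle_values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(b - a) for a, b in zip(cycle_values, cycle_values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(cycle_values) / len(cycle_values)
    return 100.0 * mean_diff / mean_value
```

A measure of this kind presupposes that individual cycles can be identified, which is consistent with the abstract's finding that only the nearly periodic Type 1 signals were reliable for perturbation analysis.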
Preliminary study of the perceptual and acoustic analysis on the speech rate of normal adult: Focusing the differences of the speech rate according to the area
Lee, Hyun-Joung ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 73~77
DOI : 10.13064/KSSS.2014.6.3.073
The purpose of this study is to investigate regional differences in speech rate through perceptual and acoustic analysis. This study examines regional variation in overall speech rate and articulation rate across speaking situations (picture description, free conversation, and story retelling) with 14 normal adults (7 from the Gyeongnam area and 7 from the Honam area). The results show that, for picture description, the perceptual speech rate differed significantly between the two regional varieties of Korean. The group of Honam speakers spoke significantly faster than the group of Gyeongnam speakers. However, the acoustic analysis showed that the speech rate of the two groups did not differ. There were significant regional differences in overall speech rate and articulation rate in the other two speaking situations, free conversation and story retelling. These findings suggest that future research should include perceptual evaluation of free conversation and story retelling, and that further studies of speech rate are needed under varied conditions, including more regions and SLPs with broader backgrounds and experience. SLPs need more training and experience to assess patients properly and reliably.
A Link between Perceived and Produced Vowel Spaces of Korean Learners of English
Yang, Byunggon ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 81~89
DOI : 10.13064/KSSS.2014.6.3.081
Korean learners of English tend to have difficulty perceiving and producing English vowels. The purpose of this study is to examine a link between the perceived and produced vowel spaces of Korean learners of English. Sixteen Korean male and female participants perceived two sets of English synthetic vowels on a computer monitor and rated their naturalness. The same participants produced English vowels in a carrier sentence with high and low pitch variation in a clear speaking mode. The author compared the perceived and produced vowel spaces in terms of the pitch and gender variables. Results showed that the perceived vowel spaces were not significantly different for either variable. The Korean learners perceived the vowels similarly. They differentiated neither the tense-lax vowel pairs nor the low vowels. Secondly, the produced vowel spaces of the male and female groups showed a 25% difference, which may have come from physiological differences in vocal tract length. Thirdly, the comparison of the perceived and produced vowel spaces revealed that although the vowel space patterns of the Korean male and female learners appeared similar, which may lead to a relative link between perception and production, statistical differences existed in some vowels because of the acoustical properties of the synthetic vowels, which may lead to an independent link. The author concluded that any comparison between the perceived and produced vowel spaces of nonnative speakers should be made cautiously. Further studies would be desirable to examine how Koreans would perceive different sets of synthetic vowels.
Korean ESL Learners' Perception of English Segments: a Cochlear Implant Simulation Study
Yim, Ae-Ri ; Kim, Dahee ; Rhee, Seok-Chae ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 91~99
DOI : 10.13064/KSSS.2014.6.3.091
Although it is well documented that patients with cochlear implants experience hearing difficulties when processing their first language, very little is known about whether and to what extent cochlear implant patients recognize segments in a second language. This preliminary study examines how Korean learners of English identify English segments in normal hearing and cochlear implant simulation conditions. Participants heard English vowels and consonants in the following three conditions: normal hearing, 12-channel noise vocoding with 0 mm spectral shift, and 12-channel noise vocoding with 3 mm spectral shift. Results confirmed that nonnative listeners could also retrieve spectral information from the vocoded speech signal, as they recognized vowel features fairly accurately despite the vocoding. In contrast, the intelligibility of manner and place features of consonants was significantly decreased by vocoding. In addition, we found that spectral shift affected listeners' vowel recognition, probably because information regarding F1 is diminished by spectral shifting. Results suggest that patients with cochlear implants and normal hearing second language learners would experience different patterns of listening errors when processing their second language(s).
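A spectral shift expressed in millimetres is commonly converted to frequency with Greenwood's place-frequency function for the human cochlea, F = 165.4 (10^(0.06x) - 0.88), with x in mm from the apex. The sketch below shows that conversion. The abstract does not state which mapping or shift direction the study used, so the use of Greenwood's function, the basalward (upward in frequency) direction, and all function names are assumptions.

```python
import math

# Greenwood (1990) constants for the human cochlea, x in mm from the apex.
A_HZ = 165.4
SLOPE_PER_MM = 0.06
K = 0.88


def place_to_frequency(x_mm):
    """Characteristic frequency (Hz) at cochlear place x_mm from the apex."""
    return A_HZ * (10 ** (SLOPE_PER_MM * x_mm) - K)


def frequency_to_place(f_hz):
    """Cochlear place (mm from the apex) whose characteristic frequency is f_hz."""
    return math.log10(f_hz / A_HZ + K) / SLOPE_PER_MM


def shift_frequency(f_hz, shift_mm):
    """Frequency reached after moving shift_mm toward the base (assumed direction)."""
    return place_to_frequency(frequency_to_place(f_hz) + shift_mm)
```

In a noise vocoder of this kind, the analysis band edges would stay fixed while the carrier band edges are passed through `shift_frequency`, so a 0 mm shift leaves the bands matched and a 3 mm shift moves every carrier band upward.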
A Study on the Relation among English Speech Rate, Pitch and Stress by Korean Speakers
Kim, Ji-Eun ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 101~108
DOI : 10.13064/KSSS.2014.6.3.101
This study investigates the relation among pitch range differences, speech rate, and the realization of stress. To identify the realization of stress, vowel formants and durational differences between stressed and unstressed vowels were measured. The Korean learners were asked to read a textbook passage of nine sentences. The major results indicate that (1) the Korean speakers' pitch range is less than 50% of the native speakers'; (2) there is a significant negative relation between high-low pitch range and speech rate; and (3) the vowel qualities and durations of the stressed and unstressed vowels are related to speech rate but not to high-low pitch range.
Sound change of /o/ in modern Seoul Korean: Focused on relations with acoustic characteristics and perception
Igeta, Takako ; Sonu, Mee ; Arai, Takayuki ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 109~119
DOI : 10.13064/KSSS.2014.6.3.109
This article represents a first step in a large study aimed at elucidating the relationship between production and perception involved in sound change of /o/ in (Seoul) Korean. In this paper we present the results of a production study and a perception experiment. For the production study we examined vowel production data of 20 young adult speakers, measuring the first and second formants, then conducted a discriminant analysis based on those values. In terms of their F1-F2 values, the distribution of /o/ and /u/ were close, and even overlapping in some circumstances, which is consistent with the literature. This tendency was more apparent among the female speakers than the males. Moreover, with the females' distributions, /o/ was frequently categorized as /u/, suggesting that the direction of the sound change is indeed increasing from /o/ to /u/. Next, to investigate the effects of this proximity on perception, we used the production data of five randomly selected speakers from the production study as stimuli for a perception experiment in which 21 young adult native speakers of (Seoul) Korean performed a vowel identification task and provided a Goodness rating on a 5-point scale. We found that while rates of correctness were high, when these correctness scores were weighted by the Goodness rating, these "weighted correctness" scores were lower in some cases, indicating a degree of confusion in distinguishing between the two vowels.
Phonetic Realization of Aspiration of Stops in English /Cr/ and /sCr/ Clusters and their Syllable Structure at the Phonetic Level: a Comparison between Two Speaker Groups
Sohn, Hyang-Sook ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 121~130
DOI : 10.13064/KSSS.2014.6.3.121
This study investigates the acoustic property of aspiration realized in English voiceless stops of /Cr/ and /sCr/ clusters. VOT is measured from stops in these clusters produced by two groups; one from native speakers of English and the other from Korean native speakers. Aspiration of stops in different types of clusters is compared to various phonological factors such as location of stress, syllable type, and position in word. Pursuing the idea that phonetic realization is correlated with phonological representation, attempts are made to account for the gradient nature of aspiration of stops on the basis of syllable structure at the phonetic level, which may vary in the wake of resyllabification. Voiceless stops in /Cr/ and /sCr/ clusters are further compared to results obtained in the previous study on /sC/ cluster. Variations in aspiration are also characterized in terms of segmental precedence relation of stops in the clusters, namely, post-[s], pre-[r], or both.
An acoustical analysis of emotional speech using close-copy stylization of intonation curve
Yi, So Pae ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 131~138
DOI : 10.13064/KSSS.2014.6.3.131
A close-copy stylization of intonation curve was used for an acoustical analysis of emotional speech. For the analysis, 408 utterances of five emotions (happiness, anger, fear, neutral and sadness) were processed to extract acoustical feature values. The results show that certain pitch point features (pitch point movement time and pitch point distance within a sentence) and sentence level features (pitch range of a final pitch point, pitch range of a sentence and pitch slope of a sentence) are affected by emotions. Pitch point movement time, pitch point distance within a sentence and pitch slope of a sentence show no significant difference between male and female participants. The emotions with high arousal (happiness and anger) are consistently distinguished from the emotion with low arousal (sadness) in terms of these acoustical features. Emotions with higher arousal show steeper pitch slope of a sentence. They have steeper pitch slope at the end of a sentence. They also show wider pitch range of a sentence. The acoustical analysis in this study implies the possibility that the measurement of these acoustical features can be used to cluster and identify emotions of speech.
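The pitch point and sentence-level features named in this abstract can be illustrated with a short sketch over a stylized contour given as (time, f0) pitch points. The abstract does not define the features precisely, so the definitions here are assumptions: movement time is the interval between consecutive pitch points, distance is the absolute f0 change between them, pitch range is maximum minus minimum f0, and slope is the f0 change from the first to the last point divided by the elapsed time. The function name is hypothetical.

```python
def contour_features(pitch_points):
    """Features of a stylized intonation contour.

    `pitch_points` is a time-ordered list of (time_s, f0_hz) pairs.
    All feature definitions are illustrative assumptions.
    """
    if len(pitch_points) < 2:
        raise ValueError("at least two pitch points are required")
    times = [t for t, _ in pitch_points]
    f0s = [f for _, f in pitch_points]
    return {
        # interval between consecutive pitch points, in seconds
        "movement_times": [t2 - t1 for t1, t2 in zip(times, times[1:])],
        # absolute f0 change between consecutive pitch points, in Hz
        "distances": [abs(f2 - f1) for f1, f2 in zip(f0s, f0s[1:])],
        # overall pitch range of the sentence, in Hz
        "pitch_range": max(f0s) - min(f0s),
        # overall pitch slope of the sentence, in Hz per second
        "pitch_slope": (f0s[-1] - f0s[0]) / (times[-1] - times[0]),
    }
```

Under these definitions, the abstract's finding would correspond to high-arousal utterances having a larger `pitch_range` and a larger magnitude of `pitch_slope` than low-arousal ones.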
Monophthong Analysis on a Large-scale Speech Corpus of Read-Style Korean
Yoon, Tae-Jin ; Kang, Yoonjung ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 139~145
DOI : 10.13064/KSSS.2014.6.3.139
The paper describes methods for conducting vowel analysis on a large-scale corpus with the aid of forced alignment and optimal formant ceiling methods. The 'Read Style Corpus of Standard Korean' is used to build the forced alignment system, and a subset of the corpus is used for processing and extracting features for vowel analysis based on the optimal formant ceiling. The results of the vowel analysis are reliable and comparable to results obtained using traditional analytical methods. The findings indicate that the methods adopted can be extended to more fine-grained analysis without time-consuming manual labeling and without losing accuracy or reliability.
Frequency Bin Alignment Using Covariance of Power Ratio of Separated Signals in Multi-channel FD-ICA
Quan, Xingri ; Bae, Keunsung ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 149~153
DOI : 10.13064/KSSS.2014.6.3.149
In frequency-domain ICA (FD-ICA), the frequency bin permutation problem degrades the quality of the separated signals. In this paper, we propose a new algorithm that solves the frequency bin permutation problem using the covariance of the power ratio of separated signals in multi-channel FD-ICA. It exploits the continuity of the spectrum of speech signals, using the power ratio of adjacent frequency bins to check whether a frequency bin permutation has occurred in the separated signal. Experimental results show that the proposed method can fix the frequency bin permutation problem in multi-channel FD-ICA.
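The general idea of checking adjacent bins with a power-ratio covariance can be sketched for the simplest case of two separated outputs. This is a simplified illustration, not the authors' multi-channel algorithm: it assumes per-frame powers are already available for each bin, treats the first bin as the reference, and swaps a bin whenever the swapped assignment covaries more strongly with the previous aligned bin. All names are hypothetical.

```python
def covariance(x, y):
    """Population covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def power_ratio(p_self, p_other):
    """Per-frame share of the total power held by one output."""
    return [a / (a + b) if a + b > 0 else 0.5 for a, b in zip(p_self, p_other)]


def align_bins(power):
    """Resolve two-output permutations bin by bin.

    `power[f]` is a pair (p1, p2) of per-frame power lists for bin f.
    Returns (aligned_power, swapped_flags).
    """
    aligned = [power[0]]  # assumption: bin 0 defines the reference order
    swapped = [False]
    for p1, p2 in power[1:]:
        ref1, ref2 = aligned[-1]
        ref_ratio = power_ratio(ref1, ref2)
        keep_score = covariance(power_ratio(p1, p2), ref_ratio)
        swap_score = covariance(power_ratio(p2, p1), ref_ratio)
        if swap_score > keep_score:
            aligned.append((p2, p1))
            swapped.append(True)
        else:
            aligned.append((p1, p2))
            swapped.append(False)
    return aligned, swapped
```

The sketch relies on the same property the abstract mentions: because a speech source's spectrum is continuous, its power envelope in one bin tends to rise and fall together with its envelope in the neighbouring bin.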
Performance of Pseudomorpheme-Based Speech Recognition Units Obtained by Unsupervised Segmentation and Merging
Bang, Jeong-Uk ; Kwon, Oh-Wook ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 155~164
DOI : 10.13064/KSSS.2014.6.3.155
This paper proposes a new method to determine the recognition units for large vocabulary continuous speech recognition (LVCSR) in Korean by applying unsupervised segmentation and merging. In the proposed method, a text sentence is segmented into morphemes, and position information is added to the morphemes. Submorpheme units are then obtained by splitting the morpheme units through the maximization of posterior probability terms. The posterior probability terms are computed from the morpheme frequency distribution, the morpheme length distribution, and the morpheme frequency-of-frequency distribution. Finally, the recognition units are obtained by sequentially merging the submorpheme pair with the highest frequency. Computer experiments are conducted using a Korean LVCSR system with a 100k-word vocabulary and a trigram language model obtained from a 300-million-eojeol (word phrase) corpus. The proposed method is shown to reduce the out-of-vocabulary rate to 1.8% and to reduce the syllable error rate by a relative 14.0%.
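The final step described in this abstract, sequentially merging the most frequent adjacent pair of units, can be sketched as a simple greedy loop. This covers only the merging step under simplifying assumptions: units are plain strings, merged units are formed by concatenation, and the position information and the posterior-probability splitting stage are omitted. The function names are hypothetical.

```python
from collections import Counter


def most_frequent_pair(sequences):
    """Most frequent adjacent unit pair across all sequences, or None."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts.most_common(1)[0][0] if counts else None


def merge_pair(sequences, pair):
    """Replace every occurrence of `pair` with a single concatenated unit."""
    merged = []
    for seq in sequences:
        out = []
        i = 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(seq[i] + seq[i + 1])
                i += 2
            else:
                out.append(seq[i])
                i += 1
        merged.append(out)
    return merged


def merge_units(sequences, num_merges):
    """Apply up to `num_merges` greedy merges of the most frequent pair."""
    for _ in range(num_merges):
        pair = most_frequent_pair(sequences)
        if pair is None:
            break
        sequences = merge_pair(sequences, pair)
    return sequences
```

In practice the number of merges would be chosen to reach a target vocabulary size; the stopping rule used in the paper is not given in the abstract.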
Syllable-Type-Based Phoneme Weighting Techniques for Listening Intelligibility in Noisy Environments
Lee, Young Ho ; Joo, Jong Han ; Choi, Seung Ho ;
Phonetics and Speech Sciences, volume 6, issue 3, 2014, Pages 165~169
DOI : 10.13064/KSSS.2014.6.3.165
The intelligibility of speech transmitted to listeners can be significantly degraded by ambient noise in environments such as auditoriums and train stations. Noise-masked speech is hard for listeners to recognize. Among the conventional methods for improving speech intelligibility, the consonant-vowel intensity ratio (CVR) approach reinforces the power of all consonants. However, excessively reinforced consonants do not help recognition, and only some consonants are improved by the CVR approach. In this paper, we propose the corrective weighting (CW) approach, which reinforces the power of consonants differently according to Korean syllable type, such as consonant-vowel-consonant (CVC), consonant-vowel (CV), and vowel-consonant (VC), considering the level of listeners' recognition. In a subjective evaluation using the Comparison Category Rating (CCR) test of ITU-T P.800, the proposed CW approach performed better, scoring 0.18 and 0.24 higher than the unprocessed speech and the CVR approach, respectively.