Phonetics and Speech Sciences
The Korean Society of Speech Sciences
Volume 8, Issue 2 - Jun 2016
A study on the features of English as a lingua franca in Asian contexts: Rhythmic features
Chung, Hyunsong ; Lee, Sang-Ki ; Kim, Yoon-Kyu ;
Phonetics and Speech Sciences, volume 8, issue 2, 2016, Pages 1~9
DOI : 10.13064/KSSS.2016.8.2.001
This paper investigated the rhythmic features of speakers of English as a lingua franca in Asian contexts. A speech corpus of 150 conversations between speakers of English in Asia with different L1 backgrounds was collected, and %V, [symbol missing], VarcoV, and nPVI-V of each speaker were analyzed. It was found that the L1 difference of the speakers and the speakers' daily use of English influenced %V, while the speakers' daily use of English influenced [symbol missing]. The gender difference of the speakers also affected the rhythm of the utterances in VarcoV. A weak correlation between the two speakers' rhythm in each conversation was also found in %V and [symbol missing]. No significant effects were found in nPVI-V. The results revealed that the speakers tended to accommodate the rhythm of their utterances to that of their interlocutors. Further study on the speaking rate of the speakers is required to overcome some inconsistencies found in the results of the rhythmic metrics used in this study.
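For readers unfamiliar with the rhythm metrics named in this abstract, the following is a minimal sketch of how %V, VarcoV, and nPVI-V are commonly computed from segmented interval durations. It is not the authors' code. The function names and the sample durations are hypothetical, and the use of the population standard deviation in VarcoV is an assumption (some studies use the sample standard deviation).

```python
from statistics import mean, pstdev

def percent_v(vowel_durs, consonant_durs):
    """%V: share of total utterance duration taken up by vocalic intervals."""
    total = sum(vowel_durs) + sum(consonant_durs)
    return 100.0 * sum(vowel_durs) / total

def varco_v(vowel_durs):
    """VarcoV: standard deviation of vocalic durations divided by their mean."""
    # Assumption: population standard deviation; conventions differ across studies.
    return 100.0 * pstdev(vowel_durs) / mean(vowel_durs)

def npvi_v(vowel_durs):
    """nPVI-V: mean normalized difference between successive vocalic intervals."""
    pairs = zip(vowel_durs, vowel_durs[1:])
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100.0 * mean(terms)

# Hypothetical interval durations in seconds, for illustration only.
vowels = [0.1, 0.2, 0.1, 0.2]
consonants = [0.1, 0.1, 0.1, 0.1]
print(percent_v(vowels, consonants), varco_v(vowels), npvi_v(vowels))
```

Because VarcoV and nPVI-V divide by local or global mean durations, they are less sensitive to overall speaking rate than raw duration measures, which is relevant to the abstract's remark about speaking rate.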
Phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary
Yang, Byunggon ;
Phonetics and Speech Sciences, volume 8, issue 2, 2016, Pages 11~16
DOI : 10.13064/KSSS.2016.8.2.011
This study explores the phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary to provide phoneticians and linguists with fundamental phonetic data on English word components. Entry words in the dictionary file were syllabified using an R script and examined to obtain the following results: First, English words preferred consonants to vowels in their word components. In addition, monophthongs occurred much more frequently than diphthongs. When all consonants were categorized by manner and place, the distribution indicated the frequency order of stops, fricatives, and nasals according to manner and that of alveolars, bilabials and velars according to place. These results were comparable to the results obtained from the Buckeye Corpus (Yang, 2012). Second, from the analysis of syllable structure, two-syllable words were most favored, followed by three- and one-syllable words. Of the words in the dictionary, 92.7% consisted of one, two or three syllables. This result may be related to human memory or decoding time. Third, the English words tended to exhibit discord between onset and coda consonants and between adjacent vowels. Dissimilarity between the last onset and the first coda was found in 93.3% of the syllables, while 91.6% of the adjacent vowels were different. From the results above, the author concludes that an analysis of the phonetic symbols in a dictionary may lead to a deeper understanding of English word structures and components.
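The kind of counting described in this abstract can be sketched as follows. The study itself used an R script; this Python version is only an illustration and does not reproduce the authors' syllabification into onsets and codas. It relies on the dictionary's convention that vowel symbols carry a stress digit (0, 1, or 2), and it approximates the syllable count of a word by its number of vowels. The three sample lines are illustrative entries written in the dictionary's format, and the names `SAMPLE`, `is_vowel`, and `summarize` are hypothetical.

```python
from collections import Counter

# Illustrative entries in the dictionary's "WORD  PHONES..." format.
SAMPLE = """\
SPEECH  S P IY1 CH
PHONETICS  F AH0 N EH1 T IH0 K S
DICTIONARY  D IH1 K SH AH0 N EH2 R IY0
"""

def is_vowel(phone):
    # Vowel symbols end in a stress digit (0, 1, or 2); consonants do not.
    return phone[-1].isdigit()

def summarize(text):
    phone_types = Counter()      # token counts of vowels vs. consonants
    syllable_counts = Counter()  # number of words per syllable count
    for line in text.splitlines():
        if not line.strip() or line.startswith(";;;"):
            continue  # skip blank lines and comment lines
        word, *phones = line.split()
        vowels = sum(is_vowel(p) for p in phones)
        phone_types["vowel"] += vowels
        phone_types["consonant"] += len(phones) - vowels
        syllable_counts[vowels] += 1
    return phone_types, syllable_counts

phone_types, syllable_counts = summarize(SAMPLE)
print(dict(phone_types), dict(syllable_counts))
```

Run over the full dictionary file, the same two counters would give the consonant-to-vowel balance and the distribution of words by syllable count that the abstract reports.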
An SVM-based physical fatigue diagnostic model using speech features
Kim, Tae Hun ; Kwon, Chul Hong ;
Phonetics and Speech Sciences, volume 8, issue 2, 2016, Pages 17~22
DOI : 10.13064/KSSS.2016.8.2.017
This paper devises a model to diagnose physical fatigue using speech features. It presents a machine learning method based on an SVM algorithm that uses various feature parameters. The parameters include significant speech parameters, questionnaire responses, and bio-signal parameters obtained before and after an experiment that imposed fatigue. The proposed model achieved performance rates of 95%, 100%, and 90%, respectively, with the three types of fatigue-related parameters. These results suggest that the proposed method can serve as a physical fatigue diagnostic model, and that fatigue can be easily diagnosed by speech technology.
Implementation of CNN in the view of mini-batch DNN training for efficient second order optimization
Song, Hwa Jeon ; Jung, Ho Young ; Park, Jeon Gue ;
Phonetics and Speech Sciences, volume 8, issue 2, 2016, Pages 23~30
DOI : 10.13064/KSSS.2016.8.2.023
This paper describes implementation schemes for CNNs in view of mini-batch DNN training for efficient second-order optimization. The approach trains the parameters of a CNN with the same procedure used to update the parameters of a DNN, simply by arranging an input image as a sequence of local patches, which is equivalent to mini-batch DNN training. Through this conversion, second-order optimization, which provides higher performance, can be applied directly to train the parameters of the CNN. In both image recognition on the MNIST DB and syllable automatic speech recognition, the proposed scheme for CNN implementation shows better performance than one based on DNN.
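The patch-rearrangement idea in this abstract can be illustrated with a small sketch. It shows only the general equivalence between a convolution and a dense layer applied to a batch of flattened local patches. It is not the authors' implementation and includes no second-order optimization. All function names, the image, and the kernel are hypothetical.

```python
def extract_patches(image, k):
    """Slide a k x k window over a 2-D image and flatten each window into a row."""
    h, w = len(image), len(image[0])
    patches = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patches.append([image[i + di][j + dj]
                            for di in range(k) for dj in range(k)])
    return patches

def dense_layer(batch, weights):
    """Apply each unit's weight vector to every row of a mini-batch."""
    return [[sum(x * w for x, w in zip(row, unit)) for unit in weights]
            for row in batch]

def conv2d_valid(image, kernel):
    """Direct 'valid' cross-correlation, used here only for comparison."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(k) for dj in range(k))
             for j in range(w - k + 1)]
            for i in range(h - k + 1)]

# Hypothetical 3 x 3 image and 2 x 2 kernel.
image = [[1, 2, 0], [0, 1, 3], [2, 1, 1]]
kernel = [[1, 0], [0, -1]]

patches = extract_patches(image, 2)               # 4 patches, each of length 4
flat_kernel = [v for row in kernel for v in row]  # kernel as one dense unit
as_dense = dense_layer(patches, [flat_kernel])    # 4 rows x 1 unit
direct = conv2d_valid(image, kernel)              # 2 x 2 feature map

print([row[0] for row in as_dense], direct)
```

Each patch plays the role of one sample in a mini-batch, so any training procedure written for a fully connected layer can, in principle, be reused for the convolutional weights.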
Emergency dispatching based on automatic speech recognition
Lee, Kyuwhan ; Chung, Jio ; Shin, Daejin ; Chung, Minhwa ; Kang, Kyunghee ; Jang, Yunhee ; Jang, Kyungho ;
Phonetics and Speech Sciences, volume 8, issue 2, 2016, Pages 31~39
DOI : 10.13064/KSSS.2016.8.2.031
In emergency dispatching at the 119 Command & Dispatch Center, some inconsistencies between the 'standard emergency aid system' and the 'dispatch protocol,' which are both mandatory to follow, cause inefficiency in the dispatcher's performance. If an emergency dispatch system uses automatic speech recognition (ASR) to process the dispatcher's protocol speech during case registration, it instantly extracts and provides the required information specified in the 'standard emergency aid system,' making the rescue command more efficient. For this purpose, we have developed a Korean large vocabulary continuous speech recognition system for 400,000 words to be used in the emergency dispatch system. The 400,000 words include vocabulary from news, SNS, blogs, and emergency rescue domains. The acoustic model is constructed using 1,300 hours of telephone call (8 kHz) speech, whereas the language model is constructed using a 13 GB text corpus. From the transcribed corpus of 6,600 real telephone calls, call logs with the emergency rescue command class and the identified major symptom are extracted in connection with the rescue activity log and the National Emergency Department Information System (NEDIS). ASR is applied to the emergency dispatcher's repetition utterances about the patient information. Based on the Levenshtein distance between the ASR result and the template information, the emergency patient information is extracted. Experimental results show a word error rate of 9.15% for speech recognition and an emergency response detection performance of 95.8% for the emergency dispatch system.
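The template-matching step mentioned in this abstract relies on the Levenshtein (edit) distance. A minimal sketch of the distance and of a nearest-template lookup follows. The abstract does not say whether the system compares characters, syllables, or words, so character-level comparison is an assumption here. The function `best_template` and the English template strings are hypothetical; the system described works on Korean speech.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def best_template(recognized, templates):
    """Return the template closest to the recognized string (hypothetical helper)."""
    return min(templates, key=lambda t: levenshtein(recognized, t))

# Hypothetical templates and a recognition result containing an error.
templates = ["chest pain", "head injury"]
print(best_template("chest pian", templates))
```

Matching against templates in this way tolerates a limited number of recognition errors, which fits a system whose ASR output is not error-free.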
The relationship among articulation rate, intelligibility and working memory in children with spastic and flaccid dysarthria
Jeong, Pil Yeon ; Sim, Hyun Sub ; Jeong, Sook Hwae ; Yim, Dongsun ;
Phonetics and Speech Sciences, volume 8, issue 2, 2016, Pages 41~48
DOI : 10.13064/KSSS.2016.8.2.041
The purpose of this study is to evaluate the association among articulation rate, speech intelligibility, and working memory in children with dysarthria. Two groups of children, 11 with spastic dysarthria and 11 with flaccid dysarthria, aged between 8 and 17 years, participated in this study. All participants were administered the following tests: the K-WISC III PIQ test, speech intelligibility, working memory, and articulation rate. Group differences were compared with an independent t-test. Pearson correlations were computed between all measures. The results of this study are as follows: First, articulation rate and intelligibility were significantly lower for the spastic dysarthria group than for the flaccid dysarthria group. Second, there was a significant correlation between articulation rate and intelligibility in children with flaccid dysarthria. Lastly, there was no significant correlation between articulation rate and working memory in either group. The results suggest that articulation rate is not necessarily accompanied by working memory capacity in children with dysarthria, and that the effect of articulation rate on intelligibility differs depending on the type of dysarthria.
The awareness of parents and teachers in the psycho- and voice behavioral characteristics related to children's voice problems
Song, Kyung Hwa ; Kim, Jaeock ;
Phonetics and Speech Sciences, volume 8, issue 2, 2016, Pages 49~56
DOI : 10.13064/KSSS.2016.8.2.049
The study examined the extent to which parents and teachers were aware that behavioral characteristics were related to children's voice problems. Voice samples of 89 children aged 3 to 5 were collected, and their voice quality was graded on the G scale of GRBAS. The parents and teachers of the children were asked to complete a questionnaire composed of the pediatric Voice Handicap Index (pVHI) and the psycho- and voice behavioral characteristics of their children. The results are as follows. First, there were no significant differences in either pVHI or the behavioral characteristics of the children by G scale. However, significant differences were found in the behavioral characteristics between parents and teachers, but there was no difference in pVHI between them. In addition, there was a significant correlation between the psycho-behavioral characteristics and the voice behavioral characteristics for both parents and teachers. These results indicate that parents and teachers are not aware of the presence of their children's voice problems and that such voice problems are affected by behavioral characteristics associated with the use of voice.
Effects of Lax Vox voice therapy in a patient with spasmodic dysphonia: A case report
Lim, Hye Jin ; Choi, Seong Hee ; Kim, Jeong Kyu ; Choi, Chul-Hee ;
Phonetics and Speech Sciences, volume 8, issue 2, 2016, Pages 57~63
DOI : 10.13064/KSSS.2016.8.2.057
Recently, Lax Vox voice therapy has been used as one of the SOVTE (Semi-Occluded Vocal Tract Exercises). The purpose of this study was to explore the effect of Lax Vox voice therapy on voice improvement in a patient with spasmodic dysphonia. One female patient with spasmodic dysphonia (age = 27), who had been diagnosed by a laryngologist, received Lax Vox voice therapy. The Lax Vox protocol in this study consisted of 5 steps (a warm-up followed by 4 steps: bubbling without phonation, bubbling with phonation, gliding with phonation, and generalization). A total of 11 sessions were performed by a certified speech-language pathologist. The study compared acoustic, aerodynamic, auditory-perceptual, and patient self-rating measures before, midway through, and after voice therapy. All objective and subjective parameters improved after voice therapy: reduced frequency variation, increased maximum phonation time, enlarged voice range, improved 'G' and 'S' in GRBAS and USDRS, and reduced VHI were observed. In particular, decreased [symbol missing] and remarkably reduced voice tremor were also demonstrated following Lax Vox voice therapy. Accordingly, the Lax Vox voice therapy technique can be useful for improving voice and quality of life in patients with spasmodic dysphonia.
Prosodic pattern of the children with high-functioning autism spectrum disorder according to sentence type
Shin, Hee Baek ; Choi, Jieun ; Lee, YoonKyoung ;
Phonetics and Speech Sciences, volume 8, issue 2, 2016, Pages 65~71
DOI : 10.13064/KSSS.2016.8.2.065
The purpose of this study is to examine the prosodic pattern of children with high-functioning autism spectrum disorder (HFASD) according to sentence type. The participants were 18 children aged 7 to 9 years: 9 children with HFASD and 9 typically developing (TD) children of the same chronological age as the HFASD children. A sentence reading task was conducted. Seven interrogative sentences and 7 declarative sentences were presented to the participants, who were asked to read the sentences three times. Mean values of F0, F0 range, intensity, speech rate, and pitch contour were measured for each sentence. The results showed that, for F0 range, a significant main effect and interaction effect were observed for subject group and sentence type. There were significant differences in intensity, mean F0, speech rate, and pitch contour across sentence types. The results of this study indicated that children with HFASD showed no difference in intonation across sentence types. Speakers' intention may have a negative effect on pragmatic aspects. These results suggest that the assessment and intervention of prosody are important for children with HFASD.
Cepstral and spectral analysis of voices with adductor spasmodic dysphonia
Shim, Hee Jeong ; Jung, Hun ; Lee, Sue Ann ; Choi, Byung Heun ; Heo, Jeong Hwa ; Ko, Do-Heung ;
Phonetics and Speech Sciences, volume 8, issue 2, 2016, Pages 73~80
DOI : 10.13064/KSSS.2016.8.2.073
The purpose of this study was to analyze perceptual and spectral/cepstral measurements in patients with adductor spasmodic dysphonia (ADSD). Sixty gender- and age-matched participants (30 with ADSD and 30 controls) were recorded reading a sentence and sustaining the vowel /a/. Acoustic data were analyzed by measuring CPP, L/H ratio, mean CPP F0, and CSID, and auditory-perceptual ratings were made using GRBAS. The main results can be summarized as follows: (a) the CSID for connected speech was significantly higher than for the sustained vowel; (b) the G, R, and S for connected speech were significantly higher than for the sustained vowel; (c) spectral/cepstral parameters were significantly correlated with the perceptual parameters; and (d) the ROC analysis showed that a threshold of 13.491 for the CSID achieved good classification of ADSD, with 86.7% sensitivity and 96.7% specificity. Spectral and cepstral analysis of connected speech is especially meaningful in cases where perceptual analysis and clinical evaluation alone are insufficient.
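Sensitivity and specificity at a fixed cutoff, as reported for the CSID threshold in this abstract, can be computed as in the sketch below. The scores are made up for illustration and are not data from the study. The rule that a score at or above the threshold is classified as ADSD is an assumption, as is the function name.

```python
def sensitivity_specificity(patient_scores, control_scores, threshold):
    """Classify a score at or above the threshold as positive (assumed rule)."""
    true_pos = sum(s >= threshold for s in patient_scores)
    true_neg = sum(s < threshold for s in control_scores)
    return true_pos / len(patient_scores), true_neg / len(control_scores)

# Made-up scores, for illustration only.
patients = [20.0, 15.2, 9.0, 30.1]
controls = [5.0, 14.0, 2.2, 8.1]
sens, spec = sensitivity_specificity(patients, controls, 13.491)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

If the ROC analysis used all 30 participants in each group, the reported 86.7% and 96.7% would correspond to 26 of 30 patients and 29 of 30 controls classified correctly.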