• Title/Summary/Keyword: voice classification

Voice Classification of Trained Classic Singers (성악가의 성종 구분에 관한 문헌적 고찰)

  • Nam, Do-Hyun;Paik, Jae-Yeon;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.18 no.1 / pp.56-61 / 2007
  • Introduction: In practice, the classification of classical singers' voices depends on the habitual judgment of voice teachers or voice trainers, who refer to vocal timbre, vocal range, and vocal quality. Such judgments, however, may turn out to be incorrect because they are based on subjective opinion. A more objective methodology is therefore required. Method: Foreign dissertations retrieved through PubMed, along with foreign and domestic journals, were reviewed regarding how singers' voices have been categorized. Results: Vocal range, vocal timbre, voice quality, habitual speaking fundamental frequency, vocal tract length, the length from the cricoid cartilage to the thyroid notch of the thyroid cartilage, vocal fold length, and the tone of the passaggio, as well as traditional approaches such as perceptual judgment by professional singers, have been used to classify voices. Conclusion: To optimize the classification of singers' voices, it may be recommended to consider all of the following together: vocal range, vocal timbre, voice quality, habitual speaking fundamental frequency, vocal tract length, the length from the cricoid cartilage to the thyroid notch of the thyroid cartilage, vocal fold length, and the tone of the passaggio.

Differences in Speaking Fundamental Frequency for Voice Classification and Closed Quotient between Speaking and Singing (성종에 따른 발화 기본주파수와 발화 및 성악발성 시 성대접촉률의 차이 비교)

  • Nam, Do-Hyun;Choi, Hong-Shik
    • Speech Sciences / v.15 no.4 / pp.147-157 / 2008
  • Habitual speaking fundamental frequency (sF0) plays an important role in determining voice classification, and it can vary with vocal fold length and language habits. The purpose of this study was therefore to compare sF0 across voice classifications and closed quotient (CQ) between speaking and singing. Seventeen singers (7 sopranos, 5 tenors, 5 baritones; mean age 25.1 years) with no evidence of vocal fold pathology participated. sF0 and CQ in both speaking and singing (A3-A5 for sopranos, A2-A4 for tenors and baritones) were measured using the SPEAD program and electroglottography. No significant difference in sF0 was observed between the tenor and baritone groups (p > 0.05). CQ in singing differed significantly among the three groups (p < 0.05), whereas CQ in speaking did not (p > 0.05). Furthermore, CQ differed significantly between speaking and singing in the soprano (p < 0.01) and tenor (p = 0.02) groups, whereas the baritone group showed no difference. The lack of a significant sF0 difference between tenor and baritone participants may result from voice classification decisions based on experience, so sF0 should be measured before the voice classification is determined.

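
The abstract above does not say how closed quotient was derived from the electroglottographic signal. As a rough illustration only, the sketch below uses a criterion-level approach: the share of one glottal cycle during which the signal exceeds a fixed fraction of its peak-to-peak amplitude. The function name, the 35% level, and the synthetic cycle are all assumptions, not details from the study.

```python
def closed_quotient(cycle, level=0.35):
    """Fraction of one glottal cycle whose EGG amplitude exceeds a criterion level.

    cycle: EGG samples covering exactly one cycle (higher value = more contact).
    level: criterion as a fraction of peak-to-peak amplitude (assumed value).
    """
    lo, hi = min(cycle), max(cycle)
    if hi == lo:
        raise ValueError("flat cycle: no contact information")
    threshold = lo + level * (hi - lo)
    closed = sum(1 for s in cycle if s > threshold)
    return closed / len(cycle)


# Synthetic triangular "contact" pulse with 10 samples, for illustration only.
cycle = [0, 2, 4, 6, 8, 10, 8, 6, 4, 2]
cq = closed_quotient(cycle, level=0.35)
```

Raising the criterion level shortens the portion counted as closed, so CQ values are only comparable when the same level is used throughout.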

Voice Classification Algorithm for Sasang Constitution Using Support Vector Machine (SVM을 이용한 음성 사상체질 분류 알고리즘)

  • Kang, Jae-Hwan;Do, Jun-Hyeong;Kim, Jong-Yeol
    • Journal of Sasang Constitutional Medicine / v.22 no.1 / pp.17-25 / 2010
  • 1. Objectives: Voice diagnosis has been used to classify individuals by Sasang constitution in SCM (Sasang Constitutional Medicine) and to assess their health condition in TKM (Traditional Korean Medicine). In this paper, we propose a new speech classification algorithm for Sasang constitution. 2. Methods: The algorithm is based on the SVM (Support Vector Machine) technique, a classification method that separates two distinct groups by finding an arbitrary nonlinear boundary in vector space. SVM shows high classification performance even with a small training data set. We designed the algorithm with three SVM classifiers that classify into four groups: three constitutional groups and an additional indecision group. 3. Results: At the optimal performance, 32.2% of the voice data were classified into the three constitutional groups, and 79.8% of those were grouped correctly. 4. Conclusions: This new classification method, which includes an indecision group, appears efficient compared to the standard classification algorithm, which classifies only into three constitutional groups. More thorough investigation of voice features is required to improve the efficiency of classification into Sasang constitution.
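
The abstract does not describe how the three SVM outputs are combined into four groups. The sketch below shows one possible reading, a reject option over three one-vs-rest scores, together with the two figures such a scheme reports: the share of data that receives a decision and the accuracy among decided samples. The group labels `A`, `B`, `C`, the margin, the scores, and the function names are hypothetical.

```python
GROUPS = ("A", "B", "C")  # placeholder labels for the three constitutional groups


def decide(scores, margin=0.5):
    """Combine three one-vs-rest scores into a group or 'indecision'.

    scores: signed decision values, one per group (hypothetical SVM outputs).
    margin: minimum score required to accept a decision (assumed value).
    """
    confident = [g for g, s in zip(GROUPS, scores) if s >= margin]
    return confident[0] if len(confident) == 1 else "indecision"


def coverage_and_accuracy(predictions, labels):
    """Return (share of samples decided, accuracy among decided samples)."""
    decided = [(p, y) for p, y in zip(predictions, labels) if p != "indecision"]
    coverage = len(decided) / len(labels)
    accuracy = sum(p == y for p, y in decided) / len(decided) if decided else 0.0
    return coverage, accuracy


# Made-up scores and labels, for illustration only.
samples = [(1.2, -0.8, -0.3), (0.1, 0.2, -0.4), (-0.9, 0.9, 0.7), (-1.0, -0.2, 0.8)]
labels = ["A", "B", "B", "B"]
predictions = [decide(s) for s in samples]
coverage, accuracy = coverage_and_accuracy(predictions, labels)
```

Read this way, the reported figures (32.2% decided, 79.8% of those correct) would mean that about 25.7% of all voice data were assigned to the correct constitutional group, with the remainder either left undecided or misassigned.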

Correlation analysis of voice characteristics and speech feature parameters, and classification modeling using SVM algorithm (목소리 특성과 음성 특징 파라미터의 상관관계와 SVM을 이용한 특성 분류 모델링)

  • Park, Tae Sung;Kwon, Chul Hong
    • Phonetics and Speech Sciences / v.9 no.4 / pp.91-97 / 2017
  • This study categorizes several voice characteristics by subjective listening assessment and investigates the correlation between voice characteristics and speech feature parameters. A model was developed to classify voice characteristics into the defined categories using the SVM algorithm. To do this, we extracted various speech feature parameters from a speech database of men in their 20s and used ANOVA to derive the parameters that correlate with voice characteristics at a statistically significant level. These derived parameters were then applied to the proposed SVM model. The experimental results showed that some speech feature parameters are significantly correlated with the voice characteristics, and that the proposed model achieves an average classification accuracy of 88.5%.

Pilot Study on the Classification for Sasangin by the Voice Analysis (음성분석에 의한 체질진단에 관한 연구)

  • Lee Eui-Ju;Song Kwang-Bin;Choi Hwan-Soo;Yoo Jung-Hee;Kwak Chang-Kyu;Sohn Eun-Hae;Koh Byung-Hee
    • The Journal of Korean Medicine / v.26 no.1 s.61 / pp.93-102 / 2005
  • Objective: This research was conducted to evaluate a method of sasangin classification by voice analysis. Two pilot tests were designed to answer the following questions: 'What are the conditions for classifying sasangin by voice analysis?' and 'What are the important variances of the /a/ parameter?'. Methods: 122 disease-free, healthy volunteers were diagnosed as sasangin types using the QSCC II. First, they said /a/ three times for 2 seconds in their usual voice. Second, they said /a/ for 2 seconds in three different ways: high tone, mid tone, and low tone. The sounds were collected with a recording program (Cool Edit 2000) through a Sony microphone (ecm-26l). We analyzed the voices with MATLAB, a simulation tool. Results: When /a/ was said three times for 2 seconds in the usual voice, there were no differences among the repetitions, and they were correlated. When /a/ was said for 2 seconds in the different ways (high, usual, and low speech), some parameters were correlated and the others were not. We evaluated the sasangin classification method using /a/ voice analysis alone. The average hit ratio was 66.3%: soyangin 67.9%, taeumin 68.0%, soeumin 63.9%. Conclusion: The conditions for using sasangin classification by voice analysis must be established. Using /a/ voice analysis alone, the sasangin classification method achieved a hit ratio of 66.3%.

Speech Enhancement Based on Voice/Unvoice Classification (유성음/무성음 분리를 이용한 잡음처리)

  • 유창동
    • The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.374-379 / 2002
  • In this paper, a novel method for reducing noise using voiced/unvoiced classification is proposed. Voicing is an important feature of speech, and the proposed method processes noisy speech differently for the voiced and unvoiced parts. Speech is classified as voiced or unvoiced using zero-crossing rate and energy, and a modified speech/noise dominant decision is proposed based on this classification. The proposed method was tested under white noise and airplane noise conditions. In comparisons of segmental SNR and in listening to the enhanced speech, it outperformed the existing method.
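
The abstract names zero-crossing rate and energy as the two cues but gives no thresholds or frame sizes. A minimal sketch of the usual heuristic follows: voiced speech tends to have a low zero-crossing rate and high energy. The thresholds, frame length, sampling rate, and synthetic test frames are all assumed values, and the paper's speech/noise dominant decision is not reproduced here.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)


def classify_frame(frame, zcr_max=0.1, energy_min=0.01):
    """Label a frame 'voiced' when ZCR is low and energy is high, else 'unvoiced'.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    if zero_crossing_rate(frame) <= zcr_max and short_time_energy(frame) >= energy_min:
        return "voiced"
    return "unvoiced"


# Synthetic frames at an assumed 8 kHz sampling rate, 240 samples (30 ms) each.
RATE, N = 8000, 240
voiced_like = [0.5 * math.sin(2 * math.pi * 120 * n / RATE) for n in range(N)]
unvoiced_like = [0.05 * (1 if n % 2 == 0 else -1) for n in range(N)]  # rapid sign flips
```

In noisy conditions fixed thresholds like these are fragile, which is presumably why the paper pairs the classification with a separate speech/noise decision.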

Performance Improvement of Classification Between Pathological and Normal Voice Using HOS Parameter (HOS 특징 벡터를 이용한 장애 음성 분류 성능의 향상)

  • Lee, Ji-Yeoun;Jeong, Sang-Bae;Choi, Hong-Shik;Hahn, Min-Soo
    • MALSORI / no.66 / pp.61-72 / 2008
  • This paper proposes a method to improve pathological and normal voice classification performance by combining multiple features, such as auditory-based and higher-order features. Performance is measured with Gaussian mixture models (GMMs) and linear discriminant analysis (LDA). Combining multiple features through the proposed frame-based LDA method is shown to be effective for pathological and normal voice classification, with an 87.0% classification rate. This is a noticeable improvement of 17.72% in terms of error reduction compared to the MFCC-based GMM algorithm.

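
The abstract reports an 87.0% classification rate and a 17.72% error reduction but not the baseline rate. Assuming that error reduction means the relative drop in error rate, (baseline error - new error) / baseline error, the implied baseline can be backed out as sketched below. The function name is hypothetical, and the result is an inference under that assumption, not a figure from the paper.

```python
def baseline_accuracy(new_accuracy, relative_error_reduction):
    """Back out the baseline accuracy implied by a relative error reduction.

    Assumes reduction = (baseline_error - new_error) / baseline_error,
    with all quantities expressed in percent.
    """
    new_error = 100.0 - new_accuracy
    baseline_error = new_error / (1.0 - relative_error_reduction / 100.0)
    return 100.0 - baseline_error


implied = baseline_accuracy(87.0, 17.72)
```

Under this reading the MFCC-based GMM baseline would sit at roughly 84.2%, so a 17.72% error reduction corresponds to a gain of about 2.8 percentage points in classification rate.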

Gender Classification of Speakers Using SVM

  • Han, Sun-Hee;Cho, Kyu-Cheol
    • Journal of the Korea Society of Computer and Information / v.27 no.10 / pp.59-66 / 2022
  • This study classifies the gender of speakers by analyzing feature vectors extracted from voice data. It makes it possible to recognize a customer's gender automatically, without a manual classification process, when a service is requested by voice, such as over a phone call. Furthermore, after gender classification with a learning model, the frequently requested services for each gender can be analyzed and customized recommendation services can be offered accordingly. Using voice data of males and females with blank segments excluded, the study extracts feature vectors with MFCC (Mel Frequency Cepstral Coefficient) and trains SVM (Support Vector Machine) models. The resulting gender recognition rate was 94%.
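
MFCC extraction starts from a filterbank whose bands are spaced evenly on the mel scale rather than in hertz. The sketch below illustrates only that spacing step, using one common form of the mel formula. The filter count, the frequency range, and the function names are assumptions, and the abstract does not state which MFCC configuration the study used.

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to mels (2595 * log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies (Hz) of n_filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Ten bands between 0 and 4000 Hz (assumed values).
centers = mel_center_frequencies(n_filters=10, f_min=0.0, f_max=4000.0)
```

The centers crowd together at low frequencies and spread out at high ones, so the resulting features resolve the low-frequency range in finer detail.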

Qualitative Classification of Voice Quality of Normal Speech and Derivation of its Correlation with Speech Features (정상 음성의 목소리 특성의 정성적 분류와 음성 특징과의 상관관계 도출)

  • Kim, Jungin;Kwon, Chulhong
    • Phonetics and Speech Sciences / v.6 no.1 / pp.71-76 / 2014
  • In this paper, the voice quality of normal speech is qualitatively classified into five components: breathy, creaky, rough, nasal, and thin/thick voice. To determine whether a subjective measure of voice correlates with an objective one, each voice is perceptually evaluated on a 1/2/3 scale by speech processing specialists and acoustically analyzed using speech analysis tools such as Praat, MDVP, and VoiceSauce. The speech parameters include features related to the speech source and the vocal tract filter. The statistical analysis uses a two-independent-samples non-parametric test. It identified a significant correlation between the speech feature parameters and the components of voice quality.

Detection of Pathological Voice Using Linear Discriminant Analysis

  • Lee, Ji-Yeoun;Jeong, Sang-Bae;Choi, Hong-Shik;Hahn, Min-Soo
    • MALSORI / no.64 / pp.77-88 / 2007
  • Mel-frequency cepstral coefficients (MFCCs) and Gaussian mixture models (GMMs) are now commonly used for pathological voice detection. This paper suggests a method to improve the performance of pathological/normal voice classification over the MFCC-based GMM. We analyze the characteristics of the mel frequency-based filterbank energies using the Fisher discriminant ratio (FDR). Feature vectors are then obtained through linear discriminant analysis (LDA) transformation of the filterbank energies (FBE) and the MFCCs. Accuracy is measured with the GMM classifier. The paper shows that the FBE LDA-based GMM is a sufficiently discriminative method for pathological/normal voice classification, with a 96.6% classification rate. The proposed method performs better than the MFCC-based GMM, with a noticeable improvement of 54.05% in terms of error reduction.

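
For a single feature and two classes, the Fisher discriminant ratio is commonly written as the squared difference of the class means divided by the sum of the class variances. The sketch below applies that textbook form to one filterbank channel. The abstract does not give the exact variant used, and the sample values and function name are made up for illustration.

```python
from statistics import mean, pvariance


def fisher_discriminant_ratio(class_a, class_b):
    """FDR for one feature: (mean_a - mean_b)^2 / (var_a + var_b)."""
    spread = pvariance(class_a) + pvariance(class_b)
    if spread == 0:
        raise ValueError("both classes have zero variance")
    return (mean(class_a) - mean(class_b)) ** 2 / spread


# Made-up energies of one filterbank channel for the two classes.
normal = [1.0, 2.0, 3.0]
pathological = [4.0, 5.0, 6.0]
fdr = fisher_discriminant_ratio(normal, pathological)
```

A larger ratio means the channel separates the two classes more cleanly, so computing it per channel gives a simple ranking of which filterbank energies carry the most discriminative information.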