Development of Age Classification Deep Learning Algorithm Using Korean Speech

So, Soonwon; You, Sung Min; Kim, Joo Young; An, Hyun Jun; Cho, Baek Hwan; Yook, Sunhyun; Kim, In Young

doi:10.9718/JBER.2018.39.2.63

Journal of Biomedical Engineering Research (대한의용생체공학회:의공학회지)

Volume 39, Issue 2 / pp. 63-68 / 2018 / pISSN 1229-0807 / eISSN 2288-9396

The Korean Society of Medical and Biological Engineering (대한의용생체공학회)


한국어 음성을 이용한 연령 분류 딥러닝 알고리즘 기술 개발

So, Soonwon (소순원), Department of Biomedical Engineering, Graduate School, Hanyang University
You, Sung Min (유승민), Department of Biomedical Engineering, Graduate School of Biomedical Science and Engineering, Hanyang University
Kim, Joo Young (김주영), Department of Biomedical Engineering, Graduate School of Biomedical Science and Engineering, Hanyang University
An, Hyun Jun (안현준), Department of Biomedical Engineering, Graduate School of Biomedical Science and Engineering, Hanyang University
Cho, Baek Hwan (조백환), Department of Medical Device Management and Research, Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University
Yook, Sunhyun (육순현), Department of Biomedical Engineering, Graduate School, Hanyang University
Kim, In Young (김인영), Department of Biomedical Engineering, Graduate School, Hanyang University

Received : 2018.02.12
Accepted : 2018.03.01
Published : 2018.04.30

https://doi.org/10.9718/JBER.2018.39.2.63


Abstract

In modern society, speech recognition is emerging as an important technology for identification in electronic commerce, forensics, law enforcement, and other systems. In this study, we develop an age classification algorithm that extracts only Mel-frequency cepstral coefficients (MFCCs), which express the characteristics of speech, from Korean speech and feeds them to a deep learning model. The algorithm extracts 13th-order MFCCs from Korean speech data to construct a data set, then uses a deep artificial neural network to classify males in their 20s, 30s, and 50s and females in their 20s, 40s, and 50s. Our model achieved classification accuracies of 78.6% for males and 71.9% for females.
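The feature-extraction step described in the abstract can be illustrated with a minimal, dependency-free sketch of a textbook MFCC pipeline (Hamming window, power spectrum, mel filterbank, log, DCT-II). This is not the authors' implementation. Only the choice of 13 coefficients comes from the abstract; the 16 kHz sample rate, 25 ms frames, 10 ms hop, 26 mel filters, the function names, and the synthetic test tone are all assumptions made for illustration.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge of the triangle
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def dct_ii(values, n_out):
    """First n_out coefficients of the unnormalized DCT-II."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_out)]


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_filters=26, n_coeffs=13):
    """Return one list of n_coeffs cepstral coefficients per frame.

    All defaults except n_coeffs=13 are assumed values, not taken
    from the paper.
    """
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        # Floor the filter energies so the logarithm is always defined.
        energies = [max(sum(w * p for w, p in zip(row, spec)), 1e-10)
                    for row in bank]
        features.append(dct_ii([math.log(e) for e in energies], n_coeffs))
    return features


# Synthetic 0.1 s, 150 Hz tone standing in for a speech recording.
tone = [math.sin(2.0 * math.pi * 150.0 * t / 16000) for t in range(1600)]
frames = mfcc(tone)
print(len(frames), len(frames[0]))  # prints: 8 13
```

Each 13-element vector would then serve as one input row for a classifier such as the deep neural network mentioned in the abstract. In practice an FFT-based audio library would replace the naive DFT used here, which is kept only so the sketch runs without third-party packages.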
