Deep Learning based Singing Voice Synthesis Modeling

Kim, Minae;Kim, Somin;Park, Jihyun;Heo, Gabin;Choi, Yunjeong;

Proceedings of the Korean Institute of Information and Communication Sciences Conference (한국정보통신학회:학술대회논문집)

2022.10a / Pages 127-130 / 2022

The Korea Institute of Information and Communication Engineering (한국정보통신학회)

딥러닝 기반 가창 음성합성(Singing Voice Synthesis) 모델링

Kim, Minae (Ewha Womans University) ;
Kim, Somin (Ewha Womans University) ;
Park, Jihyun (Ewha Womans University) ;
Heo, Gabin (Ewha Womans University) ;
Choi, Yunjeong (Ewha Womans University)

김민애 (이화여자대학교) ;
김소민 (이화여자대학교) ;
박지현 (이화여자대학교) ;
허가빈 (이화여자대학교) ;
최윤정 (이화여자대학교)

Published : 2022.10.03

Abstract

This paper studies singing voice synthesis (SVS) modeling with a focus on the generator loss function. BEGAN is a deep learning algorithm optimized for image generation; we analyze the factors that can arise when it is applied to the audio domain as an SVS model, and we conduct experiments to derive the best output quality. In particular, we address the problem that the L1 loss proposed in BEGAN-based models causes the hyperparameter gamma (𝛾), which was defined to control the diversity and quality of generated audio samples, to lose its role at some point. To remedy this, we add an alpha (𝛼) parameter and run experiments over ranges of each parameter's values to find the optimal settings. The experiments confirm that the proposed method, together with this tuning, can contribute to improving the quality of the synthesized singing.
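As a rough illustration of the quantities the abstract refers to, the sketch below follows the standard BEGAN formulation: the discriminator is an autoencoder, 𝛾 sets the target ratio between the reconstruction losses of generated and real samples, and a balance variable k is updated by proportional control. The abstract does not give the paper's actual generator loss, so the placement of 𝛼 as a weight on the L1 term is purely an assumption made for illustration, and the function name `began_step` and its arguments are hypothetical.

```python
def began_step(loss_real, loss_fake, l1_loss, k,
               gamma=0.5, alpha=1.0, lambda_k=0.001):
    """One BEGAN-style loss computation on scalar losses.

    loss_real: autoencoder reconstruction loss of the discriminator on real audio
    loss_fake: the same reconstruction loss on generated audio
    l1_loss:   L1 distance between generated and target features
    k:         current balance variable, kept in [0, 1]
    """
    # Standard BEGAN discriminator loss.
    loss_d = loss_real - k * loss_fake

    # ASSUMPTION: alpha weights the L1 term against the adversarial term.
    # The abstract does not state where alpha enters the loss.
    loss_g = loss_fake + alpha * l1_loss

    # Proportional control: gamma is the target ratio loss_fake / loss_real.
    balance = gamma * loss_real - loss_fake
    k_next = min(max(k + lambda_k * balance, 0.0), 1.0)

    # Standard BEGAN global convergence measure.
    convergence = loss_real + abs(balance)
    return loss_d, loss_g, k_next, convergence


if __name__ == "__main__":
    print(began_step(1.0, 0.4, 0.2, k=0.0, gamma=0.5, alpha=2.0, lambda_k=0.1))
```

In this sketch 𝛾 only influences training through the update of k. Under the assumed form of the generator loss, a large L1 term therefore dominates the generator's objective regardless of 𝛾, which is one way to read the abstract's remark that the L1 loss weakens the role of 𝛾.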

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (No. 2020R1A2C1006497).
