• Title, Summary, Keyword: Deep Neural Network (DNN)

Data Augmentation for DNN-based Speech Enhancement (딥 뉴럴 네트워크 기반의 음성 향상을 위한 데이터 증강)

  • Lee, Seung Gwan;Lee, Sangmin
    • Journal of Korea Multimedia Society, v.22 no.7, pp.749-758, 2019
  • This paper proposes a data augmentation algorithm to improve the performance of DNN (Deep Neural Network)-based speech enhancement. Many deep learning models rely on techniques that maximize performance with a limited amount of data, and the most common of these is data augmentation, which artificially increases the amount of training data. As an effective augmentation method, we used a formant enhancement technique that assigns different weights to the formant frequencies. The DNN model trained with the proposed data augmentation was evaluated in various noise environments, and its speech enhancement performance was compared with that of DNN models trained with conventional data augmentation and without data augmentation. The proposed data augmentation algorithm showed higher speech enhancement performance than the other approaches.
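  • As an illustration of the idea above, a minimal sketch of formant-weighted spectral augmentation follows; the Gaussian emphasis window, bandwidth, and gain values are assumptions, since the abstract does not specify the authors' exact formant enhancement rule.

```python
# Hedged sketch of formant-weighted data augmentation (illustrative only).
# Assumed: formant frequencies are already estimated per frame (e.g., via LPC),
# and augmentation multiplies the magnitude spectrum around each formant by a
# per-formant gain before feature extraction.
import numpy as np

def formant_weighted_augment(mag_spec, freqs, formants_hz, gains, bandwidth_hz=150.0):
    """Apply per-formant spectral weights to one magnitude-spectrum frame.

    mag_spec    : (n_bins,) magnitude spectrum of a single frame
    freqs       : (n_bins,) center frequency of each bin in Hz
    formants_hz : estimated formant frequencies, e.g., [F1, F2, F3]
    gains       : multiplicative weights, one per formant
    """
    out = mag_spec.copy()
    for f0, g in zip(formants_hz, gains):
        # Gaussian-shaped emphasis window centered on the formant frequency.
        window = np.exp(-0.5 * ((freqs - f0) / bandwidth_hz) ** 2)
        out *= 1.0 + (g - 1.0) * window
    return out

# Example: emphasize F1/F2/F3 with different weights to create one augmented copy.
sr, n_fft = 16000, 512
freqs = np.linspace(0, sr / 2, n_fft // 2 + 1)
frame = np.abs(np.random.randn(n_fft // 2 + 1))          # stand-in spectrum
augmented = formant_weighted_augment(frame, freqs, [500, 1500, 2500], [1.4, 1.2, 1.1])
```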

Deep Learning-based Environment-aware Home Automation System (딥러닝 기반 상황 맞춤형 홈 오토메이션 시스템)

  • Park, Min-ji;Noh, Yunsu;Jo, Seong-jun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, pp.334-337, 2019
  • In this study, we built a data collection system that learns a user's habit data with deep learning and creates an indoor environment suited to the situation. The system consists of a data collection server and several sensor nodes, and it configures the environment according to the collected data. We used the Google Inception v3 network to analyze photographs and a hand-designed second DNN (Deep Neural Network) to infer behaviors. The trained DNN achieved a testing accuracy of 98.4%. These results show that a DNN is capable of inferring the situation.
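  • A minimal sketch of the second-stage behavior classifier described above, assuming 2048-dimensional pooled Inception v3 image features as input; the layer sizes and class count are illustrative assumptions, not the authors' configuration.

```python
# Hedged sketch of a hand-designed behavior-inference DNN that consumes image
# features produced by a separate Google Inception v3 network.
import torch
import torch.nn as nn

class BehaviorDNN(nn.Module):
    def __init__(self, n_features=2048, n_classes=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, n_classes),          # one logit per behavior class (assumed count)
        )

    def forward(self, x):
        return self.net(x)

model = BehaviorDNN()
inception_features = torch.randn(8, 2048)      # batch of pooled Inception v3 features (stand-in)
logits = model(inception_features)
predicted_behavior = logits.argmax(dim=1)
```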

Model adaptation employing DNN-based estimation of noise corruption function for noise-robust speech recognition (잡음 환경 음성 인식을 위한 심층 신경망 기반의 잡음 오염 함수 예측을 통한 음향 모델 적응 기법)

  • Yoon, Ki-mu;Kim, Wooil
    • The Journal of the Acoustical Society of Korea, v.38 no.1, pp.47-50, 2019
  • This paper proposes an acoustic model adaptation method for effective speech recognition in noisy environments. In the proposed algorithm, the noise corruption function is estimated with a DNN (Deep Neural Network), and the estimated function is applied to model parameter estimation. Experimental results on the Aurora 2.0 framework and database demonstrate that the proposed adaptation method is more effective than conventional methods in both known and unknown noisy environments. In particular, the experiments in unknown environments show a 15.87% relative improvement in average WER (Word Error Rate).
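  • A hedged sketch of the core idea above: a regression DNN standing in for the noise corruption function, mapping clean-domain features to noise-corrupted features, which can then be applied to clean-trained model parameters. The feature dimension, network depth, and adaptation step are assumptions.

```python
# Hedged sketch: a regression DNN approximating the noise corruption function.
import torch
import torch.nn as nn

feat_dim = 39                                  # e.g., MFCC + deltas (assumed)

corruption_dnn = nn.Sequential(
    nn.Linear(feat_dim, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, feat_dim),                  # predicted noise-corrupted features
)

def adapt_means(clean_means):
    """Pass clean-trained model mean vectors through the estimated corruption
    function to obtain noise-adapted means (one simple form of model adaptation)."""
    with torch.no_grad():
        return corruption_dnn(clean_means)

clean_means = torch.randn(100, feat_dim)       # stand-in Gaussian mean vectors
adapted_means = adapt_means(clean_means)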

Prediction of Upset Length and Upset Time in Inertia Friction Welding Process Using Deep Neural Network (관성 마찰용접 공정에서 심층 신경망을 이용한 업셋 길이와 업셋 시간의 예측)

  • Yang, Young-Soo;Bae, Kang-Yul
    • Journal of the Korean Society of Manufacturing Process Engineers, v.18 no.11, pp.47-56, 2019
  • A deep neural network (DNN) model was proposed to predict the upset in the inertia friction welding process using a database comprising results from a series of FEM analyses. For the database, the upset length, upset beginning time, and upset completion time were extracted from FEM analyses performed with various values of axial pressure and initial rotational speed. A total of 35 training sets were constructed to train the proposed DNN, which has 4 hidden layers with 512 neurons each and relates the input parameters to the welding results. After training, the mean squared error between the predicted and true results was constrained to within 1.0e-4. The network model was then tested with another 10 sets of welding input parameters and results for comparison with FEM; the relative error of the DNN predictions of upset was within 2.8%. These results indicate that the model can provide welding results effectively, in terms of both accuracy and cost, for each combination of welding input parameters.
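  • The abstract states the network shape directly (two process inputs, four hidden layers of 512 neurons, three upset outputs), so a minimal sketch follows; the activation, optimizer, and training schedule are assumptions.

```python
# Minimal sketch of the regression network described above. Inputs: axial
# pressure and initial rotational speed. Outputs: upset length, upset beginning
# time, upset completion time. Data below are stand-ins, not the FEM database.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(2, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 3),                         # upset length, begin time, completion time
)

loss_fn = nn.MSELoss()                          # mean squared error, as in the abstract
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(35, 2)                           # 35 normalized (pressure, speed) samples
y = torch.rand(35, 3)                           # 35 normalized upset targets from FEM
for epoch in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```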

Performance Comparison of Deep Feature Based Speaker Verification Systems (깊은 신경망 특징 기반 화자 검증 시스템의 성능 비교)

  • Kim, Dae Hyun;Seong, Woo Kyeong;Kim, Hong Kook
    • Phonetics and Speech Sciences, v.7 no.4, pp.9-16, 2015
  • In this paper, several experiments are performed with deep neural network (DNN)-based features to compare the performance of speaker verification (SV) systems. To this end, input features for a DNN, such as mel-frequency cepstral coefficients (MFCC), linear-frequency cepstral coefficients (LFCC), and perceptual linear prediction (PLP), are first compared in terms of SV performance. The effect of the DNN training method and the hidden-layer structure on SV performance is then investigated for each type of feature. The performance of the SV system is evaluated with i-vector or probabilistic linear discriminant analysis (PLDA) scoring. The SV experiments show that a tandem feature combining the DNN bottleneck feature with the MFCC feature gives the best performance when the DNNs are configured with rectangular hidden layers and trained with a supervised training method.
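  • A minimal sketch of forming the tandem feature described above: bottleneck activations from a DNN concatenated with the MFCC vector. The layer widths, bottleneck dimension, and training targets are assumptions.

```python
# Hedged sketch of a bottleneck DNN and the tandem (bottleneck + MFCC) feature.
import torch
import torch.nn as nn

class BottleneckDNN(nn.Module):
    def __init__(self, in_dim=39, bottleneck_dim=40, n_targets=1000):
        super().__init__()
        self.front = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, bottleneck_dim),    # narrow bottleneck layer
        )
        self.back = nn.Sequential(
            nn.ReLU(),
            nn.Linear(bottleneck_dim, 512), nn.ReLU(),
            nn.Linear(512, n_targets),         # supervised training targets (assumed count)
        )

    def forward(self, x):
        return self.back(self.front(x))

    def bottleneck(self, x):
        return self.front(x)                   # compact features for the SV back end

dnn = BottleneckDNN()
mfcc = torch.randn(100, 39)                    # per-frame MFCC features (stand-in)
tandem = torch.cat([dnn.bottleneck(mfcc), mfcc], dim=1)   # bottleneck + MFCC tandem feature
```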

Performance of music section detection in broadcast drama contents using independent component analysis and deep neural networks (ICA와 DNN을 이용한 방송 드라마 콘텐츠에서 음악구간 검출 성능)

  • Heo, Woon-Haeng;Jang, Byeong-Yong;Jo, Hyeon-Ho;Kim, Jung-Hyun;Kwon, Oh-Wook
    • Phonetics and Speech Sciences, v.10 no.3, pp.19-29, 2018
  • We propose to use independent component analysis (ICA) and a deep neural network (DNN) to detect music sections in broadcast drama content. Drama content mainly comprises silence, noise, speech, music, and mixed (speech + music) sections. Silence sections are detected by signal activity detection. To detect music sections, we train noise, speech, music, and mixed models with a DNN. In computer experiments, we used the MUSAN corpus for training the acoustic model and conducted an experiment on 3 hours of Korean drama content. Because mixed sections include music signals, they were regarded as music sections. The segmentation error rate (SER) of music section detection was 19.0%. In addition, when the stereo mixed signals were separated into music signals using ICA, the SER was reduced to 11.8%.
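  • A hedged sketch of the two-step idea above: ICA separates the stereo mix into source estimates, and a small frame-level DNN labels frames as noise, speech, music, or mixed. The use of FastICA, the frame features, and the layer sizes are illustrative assumptions.

```python
# Hedged sketch: ICA source separation followed by frame-level DNN classification.
import numpy as np
from sklearn.decomposition import FastICA
import torch
import torch.nn as nn

# Stereo signal, shape (n_samples, 2); stand-in data for illustration.
stereo = np.random.randn(16000 * 3, 2)
sources = FastICA(n_components=2, random_state=0).fit_transform(stereo)  # separated sources

# Frame-level classifier over, e.g., log-mel features of one separated source.
classifier = nn.Sequential(
    nn.Linear(40, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 4),                         # noise, speech, music, mixed
)
frames = torch.randn(300, 40)                  # stand-in frame features
labels = classifier(frames).argmax(dim=1)      # mixed frames are counted as music sections
```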

DNN-based acoustic modeling for speech recognition of native and foreign speakers (원어민 및 외국인 화자의 음성인식을 위한 심층 신경망 기반 음향모델링)

  • Kang, Byung Ok;Kwon, Oh-Wook
    • Phonetics and Speech Sciences, v.9 no.2, pp.95-101, 2017
  • This paper proposes a new method to train Deep Neural Network (DNN)-based acoustic models for speech recognition of native and foreign speakers. The proposed method consists of determining multi-set state clusters with various acoustic properties, training a DNN-based acoustic model, and recognizing speech based on the model. In the proposed method, the hidden nodes of the DNN are shared, but the output nodes are separated to accommodate the different acoustic properties of native and foreign speech. In an English speech recognition task for Korean and English speakers, respectively, the proposed method slightly improves recognition accuracy compared to the conventional multi-condition training method.
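  • A minimal sketch of the shared-hidden, separated-output structure described above: one shared stack of hidden layers with one output layer per speaker group. The layer dimensions and numbers of output states are assumptions.

```python
# Hedged sketch of a DNN acoustic model with shared hidden layers and separate
# output heads for native and foreign speech.
import torch
import torch.nn as nn

class MultiHeadAcousticModel(nn.Module):
    def __init__(self, in_dim=40, n_native_states=2000, n_foreign_states=2000):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Linear(in_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 1024), nn.ReLU(),
            nn.Linear(1024, 1024), nn.ReLU(),
        )
        self.native_head = nn.Linear(1024, n_native_states)     # native state-cluster set
        self.foreign_head = nn.Linear(1024, n_foreign_states)   # foreign state-cluster set

    def forward(self, x, is_native):
        h = self.shared(x)
        return self.native_head(h) if is_native else self.foreign_head(h)

model = MultiHeadAcousticModel()
features = torch.randn(32, 40)                 # stand-in acoustic feature frames
native_logits = model(features, is_native=True)
```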

Trends in Neuromorphic Software Platform for Deep Neural Network (딥 뉴럴 네트워크 지원을 위한 뉴로모픽 소프트웨어 플랫폼 기술 동향)

  • Yu, Misun;Ha, Youngmok;Kim, Taeho
    • Electronics and Telecommunications Trends, v.33 no.4, pp.14-22, 2018
  • Deep neural networks (DNNs) are widely used in various domains such as speech and image recognition. DNN software frameworks such as TensorFlow and Caffe have contributed to the popularity of DNNs because of their easy programming environments. In addition, many companies are developing neuromorphic processing units (NPUs) such as Tensor Processing Units (TPUs) and Graphics Processing Units (GPUs) to improve the performance of DNN processing. However, there is a large gap between NPUs and DNN software frameworks due to the lack of framework support for various NPUs. A bridge across this gap is a DNN software platform that includes DNN-optimized compilers and DNN libraries. In this paper, we review the technical trends of DNN software platforms.

A Heat Stress Detection on Laying Hens Using Deep Neural Network (Deep Neural Network를 이용한 산란계의 고온 스트레스 탐지)

  • Noh, Byeongjoon;Choi, Jangmin;Lee, Jonguk;Park, Daihee;Chung, Younghwa;Chang, Hong-Hee
    • Proceedings of the Korea Information Processing Society Conference, pp.776-778, 2015
  • This paper proposes a method for detecting whether laying hens are under heat stress from their vocalization sounds, using a DNN (Deep Neural Network) with the dropout technique. The experiments use 100 sound samples recorded at a normal temperature of 21 °C and 200 sound samples recorded at a high temperature of 35 °C. First, 54 acoustic features are extracted from the recorded vocalizations to train the DNN. Second, CFS (Correlation Feature Selection) is used to select the 10 features most relevant for distinguishing the temperature conditions. Third, the selected features are fed to the DNN to classify the temperature environment. Experiments were conducted while adjusting the dropout rate to reduce over-fitting and improve performance. Simulations using sound data collected from an actual poultry house confirmed that the proposed method performs very well.
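  • A minimal sketch of the classifier pipeline described above: 10 CFS-selected features out of 54 feeding a small DNN with dropout for a binary normal-temperature vs. heat-stress decision. The layer sizes, dropout rate, and selected feature indices are assumptions.

```python
# Hedged sketch of the heat-stress classifier; the feature indices below are
# placeholders for the 10 features actually chosen by CFS.
import torch
import torch.nn as nn

selected = [0, 3, 7, 12, 18, 23, 31, 40, 47, 53]   # indices chosen by CFS (assumed)

model = nn.Sequential(
    nn.Linear(10, 64), nn.ReLU(), nn.Dropout(p=0.5),    # dropout to curb over-fitting
    nn.Linear(64, 32), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(32, 2),                                    # normal (21 °C) vs. heat stress (35 °C)
)

features_54 = torch.randn(16, 54)              # stand-in acoustic features per vocalization
x = features_54[:, selected]                   # keep only the 10 CFS-selected features
logits = model(x)
```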

A study on user defined spoken wake-up word recognition system using deep neural network-hidden Markov model hybrid model (Deep neural network-hidden Markov model 하이브리드 구조의 모델을 사용한 사용자 정의 기동어 인식 시스템에 관한 연구)

  • Yoon, Ki-mu;Kim, Wooil
    • The Journal of the Acoustical Society of Korea, v.39 no.2, pp.131-136, 2020
  • A Wake-Up Word (WUW) is a short utterance used to switch a speech recognizer into recognition mode. A WUW defined by the actual user of the speech recognizer is called a user-defined WUW. In this paper, to recognize user-defined WUWs, we construct a traditional Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) system, a Linear Discriminant Analysis (LDA)-GMM-HMM system, and an LDA-Deep Neural Network (DNN)-HMM system, and compare their performance. To further improve recognition accuracy, a threshold method is applied to each model, which significantly reduces both the WUW error rate and the rejection failure rate for non-WUW utterances. For the LDA-DNN-HMM system, when the WUW error rate is 9.84%, the rejection failure rate for non-WUW utterances is 0.0058%, about 4.82 times lower than that of the LDA-GMM-HMM system. These results demonstrate that the LDA-DNN-HMM model developed in this paper is highly effective for building a user-defined WUW recognition system.
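  • A hedged sketch of the threshold method mentioned above: the wake-up word is accepted only when its score margin over a filler (non-WUW) model exceeds a tuned threshold, trading the WUW error rate against the non-WUW rejection failure rate. The score form and threshold values are assumptions.

```python
# Hedged sketch of a confidence-threshold decision for WUW acceptance.
def is_wake_up_word(wuw_score, filler_score, threshold=2.5):
    """Accept the utterance as the user-defined WUW only if its log-likelihood
    margin over the filler (non-WUW) model exceeds the threshold."""
    return (wuw_score - filler_score) > threshold

# Example: sweep the threshold on held-out data to pick an operating point.
candidates = [(-1.2, -5.0, True), (-2.0, -2.1, False)]   # (wuw, filler, label) stand-ins
for t in (1.0, 2.0, 2.5, 3.0):
    accepts = [is_wake_up_word(w, f, t) for w, f, _ in candidates]
```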