• Title/Summary/Keyword: DNN (Deep Neural Network)

Model adaptation employing DNN-based estimation of noise corruption function for noise-robust speech recognition (잡음 환경 음성 인식을 위한 심층 신경망 기반의 잡음 오염 함수 예측을 통한 음향 모델 적응 기법)

  • Yoon, Ki-mu;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.38 no.1 / pp.47-50 / 2019
  • This paper proposes an acoustic model adaptation method for effective speech recognition in noisy environments. In the proposed algorithm, the noise corruption function is estimated using a DNN (Deep Neural Network), and the estimated function is applied to model parameter estimation. Experimental results using the Aurora 2.0 framework and database demonstrate that the proposed model adaptation method is more effective than conventional methods in both known and unknown noisy environments. In particular, the experiments in unknown environments show a 15.87% relative improvement in average WER (Word Error Rate).
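
For readers unfamiliar with the metric, the following is a minimal sketch of how WER and a relative improvement such as the 15.87% above are typically computed. The function names and the example WER values (20.0 and 16.826) are hypothetical illustrations, not figures from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, proposed_wer: float) -> float:
    """Return the relative WER reduction in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - proposed_wer) / baseline_wer


# Hypothetical values: a baseline WER of 20.0% reduced to 16.826%.
print(round(relative_improvement(20.0, 16.826), 2))
```

A relative improvement is measured against the baseline error rate, so it is larger than the absolute difference in percentage points (3.174 points in this hypothetical case).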

Study on data augmentation methods for deep neural network-based audio tagging (Deep neural network 기반 오디오 표식을 위한 데이터 증강 방법 연구)

  • Kim, Bum-Jun;Moon, Hyeongi;Park, Sung-Wook;Park, Young cheol
    • The Journal of the Acoustical Society of Korea / v.37 no.6 / pp.475-482 / 2018
  • In this paper, we present a study on data augmentation methods for DNN (Deep Neural Network)-based audio tagging. In this system, an audio signal is converted into a mel-spectrogram, which is used as the input to the DNN for audio tagging. To cope with the small amount of training data, we augment the training samples using time stretching, pitch shifting, dynamic range compression, and block mixing. We derive optimal parameters and combinations for the augmentation methods through audio tagging simulations.
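
Two of the four augmentations named above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the function names, the threshold, ratio, and gain values, and the simple sample-wise form of block mixing are all hypothetical. Time stretching and pitch shifting are omitted because they need a phase vocoder or resampling.

```python
from typing import List, Sequence, Set, Tuple


def compress_dynamic_range(signal: Sequence[float],
                           threshold: float = 0.5,
                           ratio: float = 4.0) -> List[float]:
    """Static compressor: magnitudes above `threshold` are scaled by 1/ratio."""
    out = []
    for sample in signal:
        magnitude = abs(sample)
        if magnitude > threshold:
            # Only the part above the threshold is attenuated.
            magnitude = threshold + (magnitude - threshold) / ratio
        out.append(magnitude if sample >= 0 else -magnitude)
    return out


def block_mix(clip_a: Sequence[float], tags_a: Set[str],
              clip_b: Sequence[float], tags_b: Set[str],
              gain: float = 0.5) -> Tuple[List[float], Set[str]]:
    """Mix two equal-length clips and take the union of their tags."""
    if len(clip_a) != len(clip_b):
        raise ValueError("clips must have the same length")
    mixed = [gain * (a + b) for a, b in zip(clip_a, clip_b)]
    return mixed, set(tags_a) | set(tags_b)


# Toy signals standing in for audio samples in the range [-1, 1].
compressed = compress_dynamic_range([0.1, 1.0, -1.0])
mixed, tags = block_mix([1.0, 0.0], {"speech"}, [0.0, 1.0], {"dog"})
print(compressed, mixed, sorted(tags))
```

In a tagging setup, the mixed clip carries the union of both label sets because audio tagging is a multi-label task.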

Image Restoration using GAN (적대적 생성신경망을 이용한 손상된 이미지의 복원)

  • Moon, ChanKyoo;Uh, YoungJung;Byun, Hyeran
    • Journal of Broadcast Engineering / v.23 no.4 / pp.503-510 / 2018
  • Restoring damaged images is a fundamental problem that predates digital image processing technology. Various algorithms for reconstructing damaged images have been introduced, but their results are inferior to manual restoration. Recent advances in DNNs (Deep Neural Networks) have prompted various studies that apply them to image restoration. However, when a wide area is damaged, a general interpolation method cannot restore it; the damaged area must instead be reconstructed from the contextual information of the surrounding image. In this paper, we propose an image restoration network using a generative adversarial network (GAN). The proposed system consists of an image generation network and a discriminator network. Experiments on various types of images verify that, by inferring the damaged area, the proposed network can recover not only a natural-looking image but also the texture of the original image.

A study on Gaussian mixture model deep neural network hybrid-based feature compensation for robust speech recognition in noisy environments (잡음 환경에 효과적인 음성 인식을 위한 Gaussian mixture model deep neural network 하이브리드 기반의 특징 보상)

  • Yoon, Ki-mu;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.37 no.6 / pp.506-511 / 2018
  • This paper proposes a GMM (Gaussian Mixture Model)-DNN (Deep Neural Network) hybrid-based feature compensation method for effective speech recognition in noisy environments. In the proposed algorithm, the posterior probability used in the conventional GMM-based feature compensation method is calculated using a DNN. Experimental results using the Aurora 2.0 framework and database demonstrate that the proposed GMM-DNN hybrid-based feature compensation method is more effective than the GMM-based method in both known and unknown noisy environments. In particular, the experiments in unknown environments show a 9.13% relative improvement in average WER (Word Error Rate) and considerable improvements in low SNR (Signal-to-Noise Ratio) conditions such as 0 dB and 5 dB.
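
As a rough orientation, GMM-based feature compensation is often written as a posterior-weighted estimate of the clean feature. The form below is a generic sketch, not taken from the paper; the symbols (noisy feature $\mathbf{y}$, clean feature $\mathbf{x}$, mixture index $k$, weights $w_k$, means $\boldsymbol{\mu}_k$, covariances $\boldsymbol{\Sigma}_k$) are assumptions introduced here for illustration.

```latex
% Generic posterior-weighted (MMSE-style) clean feature estimate
\hat{\mathbf{x}} = \sum_{k=1}^{K} p(k \mid \mathbf{y}) \,
                   \mathbb{E}\left[\mathbf{x} \mid \mathbf{y}, k\right]

% Conventional GMM posterior computed from the noisy-speech mixture
p(k \mid \mathbf{y}) =
  \frac{w_k \, \mathcal{N}(\mathbf{y}; \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}
       {\sum_{j=1}^{K} w_j \, \mathcal{N}(\mathbf{y}; \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)}
```

Under this reading, the hybrid method would keep the first expression and replace the second with a posterior $p(k \mid \mathbf{y})$ produced by a DNN, for example a softmax output over the $K$ mixture components. Whether the paper uses exactly this formulation is an assumption.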

UI Elements Identification for Mobile Applications based on Deep Learning using Symbol Marker (심볼마커를 사용한 딥러닝 기반 모바일 응용 UI 요소 인식)

  • Park, Jisu;Jung, Jinman;Eun, Seungbae;Yun, Young-Sun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.3 / pp.89-95 / 2020
  • Recently, studies have been conducted on recognizing sketch images of a GUI (Graphical User Interface) with deep learning and converting them into code for an application. UI/UX designers can communicate with developers through storyboards when developing mobile applications. However, UI/UX designers may draw ambiguous widgets in different ways. In this paper, we propose an automatic UI detection method that uses symbol markers to improve the accuracy of DNN (Deep Neural Network)-based UI identification. To evaluate the effect of the symbol markers, accuracy is compared with and without them. To examine how the marker shape affects accuracy, the results are analyzed for circle and parenthesis markers. The use of symbol markers is expected to reduce feedback between developers and designers, save time and cost, reduce false positives in sketch-image UI identification, and improve accuracy.

Speech Recognition Error Detection Using Deep Learning (딥 러닝을 이용한 음성인식 오류 판별 방법)

  • Kim, Hyun-Ho;Yun, Seung;Kim, Sang-Hun
    • Annual Conference on Human and Language Technology / pp.157-162 / 2015
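  • Erroneous sentences produced during speech recognition, the first and foremost stage of automatic interpretation (speech-to-speech translation), mostly have ungrammatical structures or cannot be understood. Machine-translating such sentences causes serious interpretation errors, so improvement is essential. In this paper, exploiting the characteristic that misrecognized sentences are ungrammatical or meaningless compared with correctly recognized sentences, we implemented a DNN (Deep Neural Network)-based speech recognition error detector and obtained an error-sentence classification performance of 84.20%.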

Indoor Space Recognition using Super-pixel and DNN (DNN과 슈퍼픽셀을 이용한 실내 공간 인식)

  • Kim, Kisang;Choi, Hyung-Il
    • Journal of Internet Computing and Services / v.19 no.3 / pp.43-48 / 2018
  • In this paper, we propose an indoor-space recognition method using a DNN and super-pixels. To recognize the indoor space from an image, a segmentation process is required to divide the image into regions. A super-pixel algorithm performs this segmentation, dividing the image into segments of appropriate size. To recognize each segment, features are extracted using the proposed method. The extracted features are learned using a DNN, and each segment is recognized using the trained DNN model. Experimental results compare the performance of the proposed method with that of existing algorithms.

Data Augmentation for DNN-based Speech Enhancement (딥 뉴럴 네트워크 기반의 음성 향상을 위한 데이터 증강)

  • Lee, Seung Gwan;Lee, Sangmin
    • Journal of Korea Multimedia Society / v.22 no.7 / pp.749-758 / 2019
  • This paper proposes a data augmentation algorithm to improve the performance of DNN (Deep Neural Network)-based speech enhancement. Many deep learning studies explore algorithms that maximize performance with a limited amount of data. The most commonly used is data augmentation, a technique that artificially increases the amount of data. For effective data augmentation, we used a formant enhancement method that assigns different weights to the formant frequencies. The DNN model trained with the proposed data augmentation algorithm was evaluated in various noise environments and compared with DNN models trained with conventional data augmentation and without data augmentation. The proposed data augmentation algorithm showed higher speech enhancement performance than the other algorithms.