• Title/Summary/Keyword: Deep Neural Network

Study on data augmentation methods for deep neural network-based audio tagging (Deep neural network 기반 오디오 표식을 위한 데이터 증강 방법 연구)

  • Kim, Bum-Jun;Moon, Hyeongi;Park, Sung-Wook;Park, Young cheol
    • The Journal of the Acoustical Society of Korea / v.37 no.6 / pp.475-482 / 2018
  • In this paper, we present a study on data augmentation methods for DNN (Deep Neural Network)-based audio tagging. In this system, an audio signal is converted into a mel-spectrogram and used as the input to the DNN for audio tagging. To cope with the limited amount of training data, we augment the training samples using time stretching, pitch shifting, dynamic range compression, and block mixing. Through audio tagging simulations, we derive optimal parameters and combinations for these augmentation methods.
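The augmentation pipeline described above lends itself to a short sketch. The code below is a minimal, illustrative version assuming librosa for the waveform transforms and the mel-spectrogram front end; the compression threshold/ratio, pitch step, stretch rate, and mixing weight are placeholder values, not the optimal parameters derived in the paper.

```python
import numpy as np
import librosa

def augment(y, sr, rate=1.1, n_steps=2, threshold_db=-20.0, ratio=4.0):
    """Return a few augmented variants of waveform y (parameter values are illustrative)."""
    stretched = librosa.effects.time_stretch(y, rate=rate)              # time stretching
    shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)    # pitch shifting

    # Simple dynamic range compression: attenuate samples above a threshold.
    db = 20.0 * np.log10(np.maximum(np.abs(y), 1e-8))
    gain_db = np.where(db > threshold_db, (threshold_db - db) * (1.0 - 1.0 / ratio), 0.0)
    compressed = y * (10.0 ** (gain_db / 20.0))

    return stretched, shifted, compressed

def block_mix(mel_a, mel_b, alpha=0.5):
    """Block mixing on mel-spectrograms: blend time-frequency blocks of two clips."""
    n = min(mel_a.shape[1], mel_b.shape[1])
    return alpha * mel_a[:, :n] + (1.0 - alpha) * mel_b[:, :n]

# Mel-spectrogram input for the tagging DNN (file name is a placeholder):
# y, sr = librosa.load("clip.wav", sr=None)
# mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
```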

A Sound Interpolation Method Using Deep Neural Network for Virtual Reality Sound (가상현실 음향을 위한 심층신경망 기반 사운드 보간 기법)

  • Choi, Jaegyu;Choi, Seung Ho
    • Journal of Broadcast Engineering / v.24 no.2 / pp.227-233 / 2019
  • In this paper, we propose a deep neural network-based sound interpolation method for realizing virtual reality sound. With this method, the sound at a point between two measurement points is generated from the acoustic signals obtained at those two points. Sound interpolation can be performed by statistical methods such as the arithmetic or geometric mean, but these are insufficient to reflect actual nonlinear acoustic characteristics. To solve this problem, in this study the interpolation is performed by training a deep neural network on the acoustic signals of the two points and the target point. The experimental results show that the deep neural network-based sound interpolation method is superior to the statistical methods.
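As a rough illustration of the idea (not the paper's architecture), the sketch below assumes frame-wise spectra from the two endpoint microphones as inputs and the spectrum at the target point as the regression target; the layer sizes, input dimension, and MSE loss are all assumptions.

```python
import torch
import torch.nn as nn

class SoundInterpolator(nn.Module):
    def __init__(self, n_bins=257, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_bins, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_bins),   # spectrum predicted at the target point
        )

    def forward(self, spec_a, spec_b):
        # Concatenate the two endpoint spectra; the network learns the nonlinear
        # mapping that a simple arithmetic or geometric mean cannot capture.
        return self.net(torch.cat([spec_a, spec_b], dim=-1))

model = SoundInterpolator()
loss_fn = nn.MSELoss()   # trained against measured spectra at the target point
```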

Performance Comparison of Convolution Neural Network by Weight Initialization and Parameter Update Method (가중치 초기화 및 매개변수 갱신 방법에 따른 컨벌루션 신경망의 성능 비교)

  • Park, Sung-Wook;Kim, Do-Yeon
    • Journal of Korea Multimedia Society / v.21 no.4 / pp.441-449 / 2018
  • Deep learning has been used for various tasks centered on image recognition. The convolutional neural network, one of the core algorithms of deep learning, is a deep neural network specialized in image recognition. In this paper, we use a convolutional neural network to classify forest insects and propose an optimization method. Experiments were carried out by combining two weight initialization methods with six parameter update methods. Among the 12 resulting combinations, the Xavier-SGD combination showed the highest performance, with an accuracy of 82.53%. From this, we conclude that the latest learning algorithms, which complement the disadvantages of earlier parameter update methods, do not necessarily lead to higher performance than existing methods in all application environments.
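The 2 × 6 sweep described above can be organized as in the sketch below. This is a generic PyTorch illustration, not the paper's network: Xavier and SGD appear in the abstract, but the second initializer (He) and the other five optimizers, the learning rates, and the toy CNN (which assumes 32×32 RGB inputs and 10 insect classes) are assumptions.

```python
import itertools
import torch
import torch.nn as nn

def make_cnn():
    # Small placeholder CNN; assumes 32x32 RGB inputs and 10 classes.
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(), nn.Linear(32 * 8 * 8, 10),
    )

inits = {
    "xavier": lambda w: nn.init.xavier_uniform_(w),
    "he":     lambda w: nn.init.kaiming_uniform_(w, nonlinearity="relu"),
}
optimizers = {
    "sgd":      lambda p: torch.optim.SGD(p, lr=0.01),
    "momentum": lambda p: torch.optim.SGD(p, lr=0.01, momentum=0.9),
    "nesterov": lambda p: torch.optim.SGD(p, lr=0.01, momentum=0.9, nesterov=True),
    "adagrad":  lambda p: torch.optim.Adagrad(p, lr=0.01),
    "rmsprop":  lambda p: torch.optim.RMSprop(p, lr=0.001),
    "adam":     lambda p: torch.optim.Adam(p, lr=0.001),
}

for init_name, opt_name in itertools.product(inits, optimizers):
    model = make_cnn()
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            inits[init_name](m.weight)          # apply the chosen initialization
    opt = optimizers[opt_name](model.parameters())
    # ... train and record the accuracy of this (initialization, optimizer) combination
```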

A Video Expression Recognition Method Based on Multi-mode Convolution Neural Network and Multiplicative Feature Fusion

  • Ren, Qun
    • Journal of Information Processing Systems / v.17 no.3 / pp.556-570 / 2021
  • Existing video expression recognition methods mainly focus on extracting spatial features from video expression images but tend to ignore the dynamic features of video sequences. To solve this problem, a multi-mode convolution neural network method is proposed to effectively improve the performance of facial expression recognition in video. First, OpenFace 2.0 is used to detect face images in the video, and two deep convolution neural networks are used to extract spatiotemporal expression features. A spatial convolution neural network extracts the spatial information features of each static expression image, while a temporal convolution neural network extracts dynamic information features from the optical flow of multiple expression images. The spatiotemporal features learned by the two networks are then fused by multiplication. Finally, the fused features are input into a support vector machine to perform the facial expression classification. Experimental results show that the recognition accuracy of the proposed method reaches 64.57% and 60.89% on the RML and BAUM-1s datasets, respectively, which is better than that of the other methods compared.
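The fusion-plus-SVM stage described above can be sketched as follows. The feature extractors are not reproduced here; the arrays below are placeholders standing in for features already extracted by the spatial and temporal CNNs, and the feature dimension and number of expression classes are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def fuse(spatial_feat, temporal_feat):
    # Multiplicative (element-wise) fusion of the spatial and temporal features.
    return spatial_feat * temporal_feat

# Placeholders for extracted features:
# spatial_feats:  (N, D) from the spatial CNN on static frames
# temporal_feats: (N, D) from the temporal CNN on optical-flow stacks
spatial_feats = np.random.rand(100, 512)
temporal_feats = np.random.rand(100, 512)
labels = np.random.randint(0, 6, size=100)   # e.g. six expression classes (assumption)

fused = fuse(spatial_feats, temporal_feats)
clf = SVC(kernel="rbf").fit(fused, labels)    # final expression classifier
```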

Efficient Iris Recognition using Deep-Learning Convolution Neural Network (딥러닝 합성곱 신경망을 이용한 효율적인 홍채인식)

  • Choi, Gwang-Mi;Jeong, Yu-Jeong
    • The Journal of the Korea institute of electronic communication sciences / v.15 no.3 / pp.521-526 / 2020
  • This paper presents an improved HOLP neural network that adds 25 average values to a typical HOLP neural network, which takes as input 25 feature values obtained by applying the higher-order local autocorrelation function, a function well suited to extracting invariant features from iris images. To compare different types of deep learning structures, we compared the iris recognition rates of the Back-Propagation neural network, which performs well in the speech and image fields, and the convolutional neural network, which integrates the feature extractor and the classifier.
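The higher-order local autocorrelation features mentioned above can be sketched as below. Only a handful of the 25 standard 3×3 mask patterns are shown, as an illustration of how the feature values are computed rather than as the paper's exact feature set.

```python
import numpy as np

def hlac_feature(img, displacements):
    """Sum over all interior pixels of the product f(r) * f(r+a1) * ... for one mask."""
    h, w = img.shape
    acc = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            prod = img[y, x]
            for dy, dx in displacements:
                prod *= img[y + dy, x + dx]
            acc += prod
    return acc

# A few of the 25 standard 3x3 HLAC displacement masks (orders 0, 1, 2), as examples.
masks = [
    [],                   # 0th order: sum of f(r)
    [(0, 1)],             # 1st order: f(r) * f(r + (0, 1))
    [(0, 1), (0, -1)],    # 2nd-order example
    [(1, 0), (-1, 0)],    # 2nd-order example
]

# iris_img = ...  # preprocessed iris image, values normalized to [0, 1]
# features = [hlac_feature(iris_img, m) for m in masks]   # 25 values in the full set
```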

Improvement of the Convergence Rate of Deep Learning by Using Scaling Method

  • Ho, Jiacang;Kang, Dae-Ki
    • International journal of advanced smart convergence / v.6 no.4 / pp.67-72 / 2017
  • Deep learning neural networks have become very popular because they can learn very complex datasets such as image datasets. Although a deep neural network can produce high accuracy on an image dataset, it needs a lot of time to reach the convergence stage. To address this issue, we propose a scaling method that allows the neural network to reach the convergence stage in a shorter time than the original method. From the results, we observe that our algorithm achieves higher performance than previous work.
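The abstract does not specify the proposed scaling method, so the snippet below shows only a generic example of input scaling (per-feature standardization), one common way that scaling shortens the time to convergence; it should not be read as the paper's algorithm.

```python
import numpy as np

def standardize(x, eps=1e-8):
    # Zero-mean, unit-variance scaling per feature (illustrative only).
    return (x - x.mean(axis=0)) / (x.std(axis=0) + eps)
```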

Input Dimension Reduction based on Continuous Word Vector for Deep Neural Network Language Model (Deep Neural Network 언어모델을 위한 Continuous Word Vector 기반의 입력 차원 감소)

  • Kim, Kwang-Ho;Lee, Donghyun;Lim, Minkyu;Kim, Ji-Hwan
    • Phonetics and Speech Sciences / v.7 no.4 / pp.3-8 / 2015
  • In this paper, we investigate an input dimension reduction method that uses continuous word vectors in a deep neural network language model. In the proposed method, continuous word vectors are generated with Google's Word2Vec from a large training corpus so as to satisfy the distributional hypothesis, and the discrete 1-of-|V| coded word vectors are replaced with their corresponding continuous word vectors. In our implementation, the input dimension was successfully reduced from 20,000 to 600 when a tri-gram language model was used with a vocabulary of 20,000 words. The total training time was reduced from 30 days to 14 days for the Wall Street Journal training corpus (corpus length: 37M words).
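A minimal sketch of the input replacement described above, assuming gensim (version 4 or later) for Word2Vec; the toy corpus and the hyperparameters other than the 600-dimensional vector size are illustrative.

```python
import numpy as np
from gensim.models import Word2Vec

# Toy corpus standing in for the large training corpus used in the paper.
sentences = [["the", "market", "rallied"], ["the", "market", "fell"]]
w2v = Word2Vec(sentences, vector_size=600, window=5, min_count=1, sg=1)

def encode_history(history_words, model):
    # For a tri-gram LM, each of the two history words is mapped from its
    # 20,000-dimensional 1-of-|V| code to a 600-dimensional continuous vector.
    return np.concatenate([model.wv[w] for w in history_words])

x = encode_history(["the", "market"], w2v)   # DNN LM input built from embeddings
```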

Application of artificial neural networks (ANNs) and linear regressions (LR) to predict the deflection of concrete deep beams

  • Mohammadhassani, Mohammad;Nezamabadi-pour, Hossein;Jumaat, Mohd Zamin;Jameel, Mohammed;Arumugam, Arul M.S.
    • Computers and Concrete / v.11 no.3 / pp.237-252 / 2013
  • This paper presents the application of an artificial neural network (ANN) to predict deep beam deflection using experimental data from eight high-strength self-compacting concrete (HSSCC) deep beams. The optimized network architecture consisted of ten input parameters, two hidden layers, and one output. The feed-forward back-propagation neural network, with ten and four neurons in the first and second hidden layers and trained with the TRAINLM training function, predicted load-deflection diagrams that were more accurate and precise than those of classical linear regression (LR). The ANN's MSE values are 40 times smaller than the LR's, and the R value of the ANN on the test data is 0.9931, indicating a high confidence level.
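The network shape described above (ten inputs, hidden layers of ten and four neurons, one output) is easy to sketch. The paper trains with MATLAB's TRAINLM (Levenberg-Marquardt); scikit-learn has no Levenberg-Marquardt solver, so "lbfgs" is used below purely as a stand-in.

```python
from sklearn.neural_network import MLPRegressor

# 10 inputs -> hidden layers of 10 and 4 neurons -> 1 output (deflection).
ann = MLPRegressor(hidden_layer_sizes=(10, 4), solver="lbfgs", max_iter=5000)

# X: (n_samples, 10) beam parameters, y: measured deflections from the HSSCC tests
# ann.fit(X, y)
# deflection_pred = ann.predict(X_new)
```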

Scene-based Nonuniformity Correction by Deep Neural Network with Image Roughness-like and Spatial Noise Cost Functions

  • Hong, Yong-hee;Song, Nam-Hun;Kim, Dae-Hyeon;Jun, Chan-Won;Jhee, Ho-Jin
    • Journal of the Korea Society of Computer and Information / v.24 no.6 / pp.11-19 / 2019
  • In this paper, a new Scene-based Nonuniformity Correction (SBNUC) method is proposed by applying image roughness-like and spatial noise cost functions to a deep neural network structure. Classic approaches to nonuniformity correction generally require a large number of sequential images to acquire accurate correction offset coefficients. The proposed method, however, is able to estimate the offset from only a couple of images, owing to the characteristics of the deep neural network scheme. A real-world SWIR image set is used to verify the performance of the proposed method, and the results show an image quality improvement of up to 70.3 dB PSNR. This is about 8.0 dB more than the improved IRLMS algorithm, which requires a precise image registration process on consecutive image frames beforehand.
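To make the cost-function idea concrete, the sketch below implements an image-roughness-like term as the mean absolute difference between neighboring pixels of the corrected frame; the exact roughness and spatial noise cost functions used in the paper may differ, and the correction model in the comments is an assumption.

```python
import torch

def roughness_like_cost(img):
    # img: (H, W) corrected frame; penalize high-frequency fixed-pattern noise.
    dh = (img[:, 1:] - img[:, :-1]).abs().mean()   # horizontal neighbor differences
    dv = (img[1:, :] - img[:-1, :]).abs().mean()   # vertical neighbor differences
    return dh + dv

# corrected = raw_frame + offset_estimated_by_dnn          # assumed correction model
# loss = roughness_like_cost(corrected) + spatial_noise_cost(corrected)
# (spatial_noise_cost is the paper-specific second term, not reproduced here)
```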

Automatic Fish Size Measurement System for Smart Fish Farm Using a Deep Neural Network (심층신경망을 이용한 스마트 양식장용 어류 크기 자동 측정 시스템)

  • Lee, Yoon-Ho;Jeon, Joo-Hyeon;Joo, Moon G.
    • IEMEK Journal of Embedded Systems and Applications / v.17 no.3 / pp.177-183 / 2022
  • To measure the size and weight of fish, we developed an automatic fish size measurement system using a deep neural network based on the YOLOv3 (You Only Look Once) model. To detect the fish, an IP camera with an infrared function was installed over the fish pool to acquire image data, which were used as input to the deep neural network. The size of a fish is obtained from the bounding box generated when the fish is detected, together with a structure whose actual length is known. A GUI (Graphical User Interface) program was implemented using LabVIEW and RTSP (Real-Time Streaming Protocol). The automatic fish size measurement system displays the results and stores them in a database for future work.
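The bounding-box-to-length conversion described above can be sketched as follows. The calibration through a structure of known real-world length matches the abstract, but the variable names, the use of the longer box side as the body length, and the numeric values are assumptions for illustration.

```python
def pixels_per_cm(ref_box_px_len, ref_real_len_cm):
    # Calibration from the reference structure whose actual length is known.
    return ref_box_px_len / ref_real_len_cm

def fish_length_cm(fish_box, scale_px_per_cm):
    # fish_box: (x_min, y_min, x_max, y_max) from the detector, in pixels.
    x_min, y_min, x_max, y_max = fish_box
    body_px = max(x_max - x_min, y_max - y_min)   # longer side approximates body length
    return body_px / scale_px_per_cm

scale = pixels_per_cm(ref_box_px_len=400, ref_real_len_cm=50.0)   # illustrative values
length = fish_length_cm((120, 80, 360, 140), scale)
```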