Search | Korea Science

Multi-resolution DenseNet based acoustic models for reverberant speech recognition (잔향 환경 음성인식을 위한 다중 해상도 DenseNet 기반 음향 모델)

Park, Sunchan;Jeong, Yongwon;Kim, Hyung Soon
Phonetics and Speech Sciences, v.10 no.1, pp.33-38, 2018
Although deep neural network-based acoustic models have greatly improved the performance of automatic speech recognition (ASR), reverberation still degrades distant speech recognition in indoor environments. In this paper, we adopt DenseNet, which has performed strongly in image classification tasks, to improve reverberant speech recognition. DenseNet enables a deep convolutional neural network (CNN) to be trained effectively by concatenating the feature maps of each convolutional layer. In addition, we extend the concept of the multi-resolution CNN to a multi-resolution DenseNet for robust speech recognition in reverberant environments. We evaluate reverberant speech recognition performance on the single-channel ASR task of the Reverberant Voice Enhancement and Recognition Benchmark (REVERB) Challenge 2014. According to the experimental results, the DenseNet-based acoustic models outperform the conventional CNN-based ones, and the multi-resolution DenseNet provides a further improvement.
https://doi.org/10.13064/KSSS.2018.10.1.033 KSCI
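
The dense connectivity described in this abstract (each layer receives the concatenated feature maps of all earlier layers) can be sketched in plain Python. This is an illustration only, not the authors' acoustic model: `toy_layer`, `dense_block`, and the growth rate of 2 are hypothetical, and each "feature map" is reduced to a single float.

```python
def toy_layer(growth_rate):
    """Return a toy layer that emits `growth_rate` new feature maps.

    Hypothetical stand-in for a real convolutional layer: each "feature map"
    is just a float, and the layer outputs the mean of all its inputs.
    """
    def layer(features):
        mean = sum(features) / len(features)
        return [mean] * growth_rate
    return layer


def dense_block(inputs, layers):
    """Apply layers with dense connectivity.

    Every layer sees the block input plus the outputs of all earlier layers,
    and its own output is concatenated onto that running list.
    """
    features = list(inputs)
    for layer in layers:
        features = features + layer(features)  # concatenation, not addition
    return features


block = [toy_layer(growth_rate=2) for _ in range(3)]
output = dense_block([1.0, 2.0], block)
print(len(output))  # 2 input maps + 3 layers * 2 new maps each
```

The channel count grows linearly with depth (input channels plus growth rate times number of layers), which is the property that distinguishes concatenation from the element-wise addition used in residual networks.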

A study on training DenseNet-Recurrent Neural Network for sound event detection (음향 이벤트 검출을 위한 DenseNet-Recurrent Neural Network 학습 방법에 관한 연구)

Hyeonjin Cha;Sangwook Park
The Journal of the Acoustical Society of Korea, v.42 no.5, pp.395-401, 2023
Sound Event Detection (SED) aims to identify not only the sound category but also the time interval of target sounds in an audio waveform. It is a critical technique in the field of acoustic surveillance and monitoring systems. Recently, various models have been introduced through Task 4 of Detection and Classification of Acoustic Scenes and Events (DCASE). This paper explores how to choose optimal parameters for a DenseNet-based model, an architecture that has led to outstanding performance in other recognition systems. In the experiments, DenseRNN, the SED model, consists of DenseNet-BC and bi-directional Gated Recurrent Units (GRU). The model is trained with the Mean Teacher method. Using an event-based F-score, the model is evaluated across parameters related to both model architecture and model training, under the assessment protocol of DCASE Task 4. The experimental results show that performance rises and then saturates near its best level. They also suggest that DenseRNN trains more effectively without dropout.
https://doi.org/10.7776/ASK.2023.42.5.395

A Study on Classification Performance Analysis of Convolutional Neural Network using Ensemble Learning Algorithm (앙상블 학습 알고리즘을 이용한 컨벌루션 신경망의 분류 성능 분석에 관한 연구)

Park, Sung-Wook;Kim, Jong-Chan;Kim, Do-Yeon
Journal of Korea Multimedia Society, v.22 no.6, pp.665-675, 2019
In this paper, we compare and analyze the classification performance of convolutional neural networks (CNNs), a class of deep learning algorithms, under different ensemble generation and combining techniques. We used several CNN models (VGG16, VGG19, DenseNet121, DenseNet169, DenseNet201, ResNet18, ResNet34, ResNet50, ResNet101, ResNet152, GoogLeNet) to create 10 ensemble combinations and applied 6 combining techniques (average, weighted average, maximum, minimum, median, product) to the best combination. In the experiments, the DenseNet169-VGG16-GoogLeNet combination performed best in ensemble generation, and the product rule performed best in ensemble combining. We conclude that ensembling different models with high benchmark scores is another way to obtain good results.
https://doi.org/10.9717/kmms.2019.22.6.665 KSCI
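
The six combining rules named in this abstract can be illustrated with a small pure-Python sketch. It assumes each model outputs a vector of class probabilities; the function names `combine` and `predict`, the example probabilities, and the weights are hypothetical and are not taken from the paper.

```python
import math
import statistics


def combine(probs, rule, weights=None):
    """Combine per-model class probabilities into one score per class.

    probs: list of probability vectors, one per model (equal lengths).
    rule: one of the six rule names handled below.
    weights: per-model weights, required only for "weighted_average".
    """
    per_class = list(zip(*probs))  # regroup scores by class
    if rule == "average":
        return [sum(c) / len(c) for c in per_class]
    if rule == "weighted_average":
        if weights is None:
            raise ValueError("weighted_average needs one weight per model")
        total = sum(weights)
        return [sum(w * p for w, p in zip(weights, c)) / total for c in per_class]
    if rule == "maximum":
        return [max(c) for c in per_class]
    if rule == "minimum":
        return [min(c) for c in per_class]
    if rule == "median":
        return [statistics.median(c) for c in per_class]
    if rule == "product":
        return [math.prod(c) for c in per_class]
    raise ValueError(f"unknown rule: {rule}")


def predict(probs, rule, weights=None):
    """Return the index of the class with the highest combined score."""
    scores = combine(probs, rule, weights)
    return max(range(len(scores)), key=scores.__getitem__)


# Three hypothetical models scoring two classes.
model_probs = [[0.6, 0.4], [0.7, 0.3], [0.1, 0.9]]
for rule in ("average", "maximum", "minimum", "median", "product"):
    print(rule, predict(model_probs, rule))
print("weighted_average", predict(model_probs, "weighted_average", [1, 1, 4]))
```

With these example inputs the median rule picks class 0 while the other rules pick class 1, which shows how the choice of combining rule alone can change the ensemble's decision.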

Attention Gated FC-DenseNet for Extracting Crop Cultivation Area by Multispectral Satellite Imagery (다중분광밴드 위성영상의 작물재배지역 추출을 위한 Attention Gated FC-DenseNet)

Seong, Seon-kyeong;Mo, Jun-sang;Na, Sang-il;Choi, Jae-wan
Korean Journal of Remote Sensing, v.37 no.5_1, pp.1061-1070, 2021
In this manuscript, we sought to improve the performance of FC-DenseNet for classifying cropping areas by applying an attention gate. The attention gate module can facilitate the training of a deep learning model and improve its performance by injecting spatial/spectral weights into each feature map. Crop classification was performed in onion and garlic regions using the proposed deep learning model, in which an attention gate is added to the skip connections of FC-DenseNet. Training data were produced from various PlanetScope satellite images, and preprocessing was applied to minimize the problem of an imbalanced training dataset. The crop classification results verified that the proposed deep learning model classifies the onion and garlic regions more effectively than the existing FC-DenseNet algorithm.
https://doi.org/10.7780/kjrs.2021.37.5.1.18 KSCI

Korean Sentiment Analysis using Multi-channel and Densely Connected Convolution Networks (Multi-channel과 Densely Connected Convolution Networks을 이용한 한국어 감성분석)

Yoon, Min-Young;Koo, Min-Jae;Lee, Byeong Rae
Proceedings of the Korea Information Processing Society Conference, 2019.05a, pp.447-450, 2019
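This paper proposes a Text Multi-channel DenseNet model for sentiment classification of Korean sentences, which applies DenseNet together with convolutional layers that take the morphemes, syllables, and jamo (graphemes) of a sentence as input. Features that a word-based CNN cannot extract, because of expressions that violate grammatical rules (spelling errors, contraction and elision of phonemes or syllables, overuse of slang and profanity, use of mimetic words, and so on), can be extracted from syllables or jamo. Although morpheme-based CNNs are widely used for Korean sentiment analysis, the proposed Text Multi-channel DenseNet model considers morphemes, syllables, and jamo simultaneously and passes information densely through the DenseNet, improving the accuracy of sentence sentiment classification. In experiments on Naver movie review data, the proposed model achieved an accuracy of 85.96%, classifying sentence sentiment 1.45% more accurately than the Multi-channel CNN.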
https://doi.org/10.3745/PKIPS.y2019m05a.447

DenseNet based Image Compression (DenseNet 기반의 이미지 압축)

Park, Woonsung;Kim, Munchurl
Proceedings of the Korean Society of Broadcast Engineers Conference, 2018.06a, pp.272-275, 2018
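In this paper, we use DenseNet, a network architecture that can achieve good performance with fewer parameters, for image compression in place of ResNet, which has been widely used in existing neural-network-based image compression. The network architecture used for image compression is generally an autoencoder, in which a considerable amount of information is lost at the bottleneck layer. Information flow within the network is therefore very important in image compression. Previous work used ResNet, a structure that passes earlier information forward unchanged, and showed good convergence even in deep layers. To address its drawback of requiring a large number of parameters, we use DenseNet for image compression. To resolve the loss of high-frequency image components caused by information loss at the bottleneck layer, we feed the difference between the original image and its JPEG2000-compressed version as an additional input, which improves subjective image quality.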

Extraction of Worker Behavior at Manufacturing Site using Mask R-CNN and Dense-Net (Mask R-CNN과 Dense-Net을 이용한 제조 현장에서의 작업자 행동 추출)

Rijayanti, Rita;Hwang, Mintae;Jin, Kyohong
Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2022.05a, pp.150-153, 2022
This paper reports a technique that automatically extracts object shapes through Dense-Net and subsequently detects the objects using Mask R-CNN at a manufacturing site where workers and objects are mixed. It is based on a customized factory dataset that targets workers, machines, tools, control boxes, and products as the objects. Mask R-CNN, a well-known object recognition method, supports multi-object recognition, while Dense-Net effectively extracts features from multiple overlapping objects. When the two technologies are applied together, objects are extracted from a still image of the manufacturing site to describe the image. The results are planned to be used to detect workers' abnormal behavior by adding labels to the objects.

Deep Learning Models for Autonomous Crack Detection System (자동화 균열 탐지 시스템을 위한 딥러닝 모델에 관한 연구)

Ji, HongGeun;Kim, Jina;Hwang, Syjung;Kim, Dogun;Park, Eunil;Kim, Young Seok;Ryu, Seung Ki
KIPS Transactions on Software and Data Engineering, v.10 no.5, pp.161-168, 2021
Cracks affect the robustness of infrastructure such as buildings, bridges, pavements, and pipelines. This paper presents an automated crack detection system that detects cracks on diverse surfaces. We first constructed a combined crack dataset consisting of multiple crack datasets from diverse domains presented in prior studies. Then, state-of-the-art deep learning models for computer vision tasks, including VGG, ResNet, WideResNet, ResNeXt, DenseNet, and EfficientNet, were used to validate crack detection performance. We divided the combined dataset into a training set (80%) and a test set (20%) to evaluate the employed models. DenseNet121 showed the highest accuracy, 96.20%, with a relatively low number of parameters compared to the other models. Based on this validation of advanced deep learning models on the crack detection task, we shed light on a cost-effective automated crack detection system that can be applied to different surfaces and structures with low computing resources.
https://doi.org/10.3745/KTSDE.2021.10.5.161 KSCI

Multi-band multi-scale DenseNet with dilated convolution for background music separation (배경음악 분리를 위한 확장된 합성곱을 이용한 멀티 밴드 멀티 스케일 DenseNet)

Heo, Woon-Haeng;Kim, Hyemi;Kwon, Oh-Wook
The Journal of the Acoustical Society of Korea, v.38 no.6, pp.697-702, 2019
We propose a multi-band multi-scale DenseNet with dilated convolution that separates background music signals from broadcast content. Dilated convolution can learn the multi-scale context information represented in a spectrogram. In computer simulation experiments, the proposed architecture is shown to improve the Signal to Distortion Ratio (SDR) by 0.15 dB and 0.27 dB in 0 dB and -10 dB Signal to Noise Ratio (SNR) environments, respectively.
https://doi.org/10.7776/ASK.2019.38.6.697 KSCI
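
For readers unfamiliar with the metric reported in this abstract, the sketch below computes SDR in its simplest energy-ratio form, 10 log10(||s||^2 / ||s - s_hat||^2). This is an assumption for illustration: the abstract does not say which SDR definition or evaluation toolkit the authors used, and the function name `sdr_db` and the sample signals are hypothetical.

```python
import math


def sdr_db(reference, estimate):
    """Signal to Distortion Ratio in dB, simple energy-ratio form.

    reference: the true source samples.
    estimate: the separated samples (same length as reference).
    Assumes a non-silent reference; returns infinity for a perfect estimate.
    """
    signal_energy = sum(r * r for r in reference)
    distortion_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if distortion_energy == 0:
        return math.inf
    return 10 * math.log10(signal_energy / distortion_energy)


reference = [1.0, 0.0, -1.0, 0.0]
estimate = [0.9, 0.0, -0.9, 0.0]  # every sample is scaled by 0.9
sdr = sdr_db(reference, estimate)
print(round(sdr, 2))
```

Because the scale is logarithmic, a gain of 0.27 dB corresponds to roughly a 6% reduction in distortion energy relative to the signal energy.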

Camera Model Identification Using Modified DenseNet and HPF (변형된 DenseNet과 HPF를 이용한 카메라 모델 판별 알고리즘)

Lee, Soo-Hyeon;Kim, Dong-Hyun;Lee, Hae-Yeoun
The Journal of Korean Institute of Information Technology, v.17 no.8, pp.11-19, 2019
Countering advanced image-related crimes requires sophisticated digital forensic methods. However, feature-based methods, which rely on human-designed features, struggle to adapt to the characteristics of new devices, and deep learning-based methods still need better accuracy. This paper proposes a deep learning model that identifies camera models based on DenseNet, a recent architecture in the deep learning field. To extract camera sensor features, an HPF feature extraction filter was applied. For camera model identification, we modified the number of hierarchical iterations and eliminated the bottleneck layer and the compression step used to reduce computation. The proposed model was analyzed using the Dresden database and achieved an accuracy of 99.65% for 14 camera models. It achieved higher accuracy than previous studies and overcame their weakness of low accuracy among models from the same manufacturer.
https://doi.org/10.14801/jkiit.2019.17.8.11
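
The idea of filtering an image before classification, mentioned in this abstract, can be sketched as follows, assuming HPF denotes a high-pass filter. The 3x3 kernel below is a generic zero-sum high-pass kernel chosen for illustration; the abstract does not specify the filter the authors used, and `KERNEL` and `high_pass` are hypothetical names.

```python
# Generic 3x3 high-pass kernel (assumption): coefficients sum to zero,
# so flat image regions map to 0 and only local variation survives.
KERNEL = [
    [-1, -1, -1],
    [-1, 8, -1],
    [-1, -1, -1],
]


def high_pass(image, kernel=KERNEL):
    """Slide the kernel over the image ("valid" mode, no padding).

    image: list of equal-length rows of numbers.
    Returns a list of rows that is smaller by kernel_size - 1 in each axis.
    """
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    output = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * image[r + i][c + j]
            out_row.append(acc)
        output.append(out_row)
    return output


flat = [[5] * 4 for _ in range(4)]          # constant brightness
spike = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]   # one bright pixel
print(high_pass(flat))
print(high_pass(spike))
```

Suppressing smooth scene content in this way leaves mostly pixel-level residue, which is the kind of signal that sensor-specific traces are expected to live in.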

