Atrous Residual U-Net for Semantic Segmentation in Street Scenes based on Deep Learning

  • Shin, SeokYong (Department of Plasma Bio Display, Kwangwoon University) ;
  • Lee, SangHun (Ingenium College of Liberal Arts, Kwangwoon University) ;
  • Han, HyunHo (College of General Education, University of Ulsan)
  • Received : 2021.08.04
  • Reviewed : 2021.10.20
  • Published : 2021.10.28

Abstract

In this paper, we propose the Atrous Residual U-Net (AR-UNet) to improve the accuracy of U-Net-based semantic segmentation. U-Net is mainly used in fields such as medical image analysis, autonomous vehicles, and remote sensing. The conventional U-Net has few convolution layers in its encoder, so it extracts too few features. These features are essential for classifying object categories, and when they are insufficient, segmentation accuracy drops. To address this problem, AR-UNet applies residual learning and atrous spatial pyramid pooling (ASPP) in the encoder. Residual learning improves feature extraction and helps prevent the feature loss and vanishing gradients caused by successive convolutions. ASPP extracts additional features without reducing the resolution of the feature map. We evaluated AR-UNet on the Cityscapes dataset, where it produced better segmentation results than the conventional U-Net. AR-UNet can therefore contribute to applications in which accuracy is important.
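The two encoder ideas named in the abstract can be illustrated with a toy sketch. The code below is not the authors' network. It is a minimal 1-D illustration in plain Python, assuming fixed (not learned) kernel weights, zero padding, and an odd kernel length. The function names `atrous_conv1d`, `aspp1d`, and `residual` are hypothetical and chosen only for this example. It shows that a dilated convolution widens the receptive field while keeping the output the same length as the input, and that a residual connection adds the input back to the transformed signal.

```python
def atrous_conv1d(signal, kernel, rate):
    """Dilated ("atrous") 1-D convolution with zero padding.

    The kernel taps are spaced `rate` positions apart, so the receptive
    field grows with `rate` while the output keeps the input length.
    Assumes an odd kernel length (illustrative simplification).
    """
    half = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - half) * rate
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def aspp1d(signal, kernel, rates):
    """Toy ASPP: run parallel atrous branches and stack them per position."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]


def residual(signal, transform):
    """Residual connection: y = x + F(x)."""
    fx = transform(signal)
    return [x + f for x, f in zip(signal, fx)]


# A single impulse makes the spacing of the kernel taps visible.
impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
features = aspp1d(impulse, [1, 1, 1], rates=(1, 2, 3))
skip = residual(impulse, lambda s: atrous_conv1d(s, [1, 1, 1], 1))
print(len(features) == len(impulse))  # resolution is preserved
print(skip)
```

In a real segmentation network the same idea is applied with 2-D convolutions and learned weights, and the parallel branch outputs are concatenated along the channel axis. The dilation rates used here are illustrative and are not taken from the paper.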
