Real-Time Super-Resolution Image Generation Using a Lightweight Deep Learning Architecture

Deep Learning-based Real-Time Super-Resolution Architecture Design

  • Ahn, Saehyun (Department of Electronic Engineering, Sogang University) ;
  • Kang, Suk-Ju (Department of Electronic Engineering, Sogang University)
  • Received : 2021.01.13
  • Reviewed : 2021.03.10
  • Published : 2021.03.30

Abstract

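Deep learning has recently brought large performance gains to the super-resolution problem. The fast super-resolution convolutional neural network (FSRCNN) is a well-known deep learning-based super-resolution algorithm: it extracts low-resolution input features with several convolutional layers and uses them in a deconvolutional layer to output a super-resolved image. This paper proposes an FPGA-based convolutional neural network accelerator designed with parallel computing efficiency in mind. In particular, an energy-efficient accelerator is designed by converting the deconvolutional layer into a convolutional layer. The paper also proposes Optimal-FSRCNN, a variant of the FSRCNN structure modified to fit the available FPGA resources. The number of multipliers used is reduced by a factor of 3.47 relative to FSRCNN, while PSNR, the metric used to evaluate super-resolution quality, remains similar to that of FSRCNN. With this FPGA-optimized network, the authors developed a real-time image processing technique that converts FHD input video into UHD output.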
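The abstract does not spell out how the deconvolutional layer is converted. One common approach, shown here as a minimal 1-D sketch and not as the authors' design, splits the deconvolution kernel into `stride` sub-kernels, runs each as an ordinary convolution on the low-resolution input, and interleaves the results. The function names, the 1-D simplification, and the stride of 2 (suggested by the FHD-to-UHD use case) are all assumptions made for illustration.

```python
def transposed_conv1d(x, w, stride):
    """Reference deconvolution: scatter each input sample into the output."""
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out


def conv_based_upsample1d(x, w, stride):
    """Same output, computed with `stride` ordinary convolutions on x."""
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for phase in range(stride):
        # Kernel taps that contribute to output positions n with n % stride == phase.
        sub_kernel = w[phase::stride]
        # Full convolution of the low-resolution input with this sub-kernel.
        for j in range(len(x) + len(sub_kernel) - 1):
            acc = 0.0
            for m, tap in enumerate(sub_kernel):
                if 0 <= j - m < len(x):
                    acc += x[j - m] * tap
            # Interleave: result j of this phase lands at position j * stride + phase.
            out[j * stride + phase] = acc
    return out


# Hypothetical input and kernel, chosen only to exercise both code paths.
x = [1.0, 2.0, 3.0]
w = [1.0, 2.0, 3.0, 4.0, 5.0]
print(transposed_conv1d(x, w, 2))
print(conv_based_upsample1d(x, w, 2))
```

Both calls print the same list. The convolution form never multiplies by inserted zeros and every sub-kernel reads the same low-resolution input, which is the kind of regular, parallel access pattern a convolution accelerator is built around.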

Recently, deep learning has been widely used in various computer vision applications, such as object recognition, classification, and image generation. In particular, deep learning-based super-resolution has achieved significant performance improvements. The fast super-resolution convolutional neural network (FSRCNN) is a well-known deep learning-based super-resolution algorithm in which the output image is generated by a deconvolutional layer. In this paper, we propose an FPGA-based convolutional neural network accelerator that considers parallel computing efficiency. In addition, we propose Optimal-FSRCNN, which modifies the structure of FSRCNN. The number of multipliers is reduced by a factor of 3.47 compared to FSRCNN, while the PSNR is similar to that of FSRCNN. We developed a real-time image processing technology implemented on an FPGA.
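For readers unfamiliar with the PSNR metric mentioned in the abstract, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). It assumes 8-bit pixel values (peak 255) stored in flat lists; the function name and the sample pixel values are illustrative and do not come from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical pixel values, for illustration only.
ground_truth = [52, 55, 61, 66, 70, 61, 64, 73]
upscaled = [54, 55, 60, 67, 69, 62, 64, 70]
print(f"PSNR: {psnr(ground_truth, upscaled):.2f} dB")
```

A higher PSNR means the upscaled image is closer to the ground-truth high-resolution image, so "similar PSNR" in the abstract means the reduced network loses little reconstruction quality by this measure.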
