A Normalized Loss Function of Style Transfer Network for More Diverse and More Stable Transfer Results

  • Choi, Insung (Dept. of AI Software Eng., Seoul Media Institute of Technology) ;
  • Kim, Yong-Goo (Dept. of AI Software Eng., Seoul Media Institute of Technology)
  • Received : 2020.10.06
  • Accepted : 2020.11.16
  • Published : 2020.11.30

Abstract

Deep-learning based style transfer has recently attracted great attention because it produces high-quality transfer results that appropriately reflect the high-level structural characteristics of images. This paper addresses the problem of obtaining more stable and more diverse results from such deep-learning based style transfer methods. Based on an investigation of experimental results over a wide range of hyper-parameter settings, we define the stability and diversity problems of style transfer and propose a partial loss normalization method to solve them. Style transfer using the proposed normalization method not only allows the degree of style reflection to be controlled stably, regardless of the characteristics of the input images, but also, unlike the existing method, yields diverse transfer results when the weights of the partial style losses are adjusted, and remains stable under differences in input image resolution.

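The abstract does not spell out the exact form of the proposed normalization. As a rough illustration only, the sketch below (Python/NumPy, with hypothetical function names and an *assumed* normalizer, the squared Frobenius norm of the target Gram matrix) shows the general idea of normalizing each per-layer (partial) Gram-matrix style loss so that the layer weights act on a comparable scale regardless of the input images:

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a feature map with shape (H, W, C)."""
    h, w, c = features.shape
    f = features.reshape(h * w, c)
    return f.T @ f / (h * w)

def normalized_style_loss(gen_feats, style_feats, layer_weights):
    """Weighted sum of per-layer (partial) style losses.

    Each partial loss is divided by the magnitude of its target Gram
    matrix (an assumed normalizer; the paper's exact choice may differ),
    so the layer weights have a comparable effect across inputs.
    """
    total = 0.0
    for w, gf, sf in zip(layer_weights, gen_feats, style_feats):
        g_gen = gram_matrix(gf)
        g_sty = gram_matrix(sf)
        norm = np.sum(g_sty ** 2) + 1e-8  # squared Frobenius norm of target
        total += w * np.sum((g_gen - g_sty) ** 2) / norm
    return total
```

With this kind of normalization, each partial loss is roughly scale-free, so changing a layer weight changes the relative emphasis of that layer rather than being swamped by layers whose raw Gram magnitudes happen to be large for a particular image.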
Keywords

Acknowledgement

This research was supported by the Ministry of Culture, Sports and Tourism and the Korea Creative Content Agency (Project Number: R2020040238).
