MSaGAN: Improved SaGAN using Guide Mask and Multitask Learning Approach for Facial Attribute Editing

  • Yang, Hyeon Seok (Dept. of Computer Science and Engineering, Hanyang University)
  • Han, Jeong Hoon (Dept. of Computer Science and Engineering, Hanyang University)
  • Moon, Young Shik (Dept. of Computer Science and Engineering, Hanyang University)
  • Received : 2020.03.26
  • Accepted : 2020.05.05
  • Published : 2020.05.29

Abstract

Recent studies on facial attribute editing have obtained realistic results using generative adversarial nets (GANs) and encoder-decoder structures. Spatial attention GAN (SaGAN), one of the most recent methods, uses a spatial attention mechanism to change only the desired attribute in a face image. However, it sometimes produces unnatural results because it lacks sufficient information about the face region. In this paper, we propose an improved SaGAN (MSaGAN) that uses a guide mask during training and applies a multitask learning approach to overcome the limitations of the existing method. Through extensive experiments, we compare facial attribute editing results in terms of the mask loss function and the neural network structure, and show that the proposed method efficiently produces more natural results than previous methods.
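As a rough illustration of the spatial attention idea summarized in the abstract, the following NumPy sketch blends an edited image into the input using a soft mask, so that pixels outside the mask stay unchanged. The function names, the toy arrays, and the L1 form of the guide-mask loss are assumptions made for illustration only; the abstract does not specify the actual loss function or network structure of MSaGAN.

```python
import numpy as np

def blend_with_attention(image, edited, mask):
    """Use edited pixels where the mask is 1 and input pixels where it is 0."""
    # Clip to [0, 1] and add a channel axis so an (H, W) mask broadcasts over RGB.
    mask = np.clip(mask, 0.0, 1.0)[..., None]
    return mask * edited + (1.0 - mask) * image

def guide_mask_l1(mask, guide_mask):
    """Hypothetical mask loss: mean absolute difference to a guide mask."""
    return float(np.mean(np.abs(mask - guide_mask)))

# Toy data: a black input, a white "edited" image, and a 2x2 attention region.
image = np.zeros((4, 4, 3))
edited = np.ones((4, 4, 3))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0

output = blend_with_attention(image, edited, mask)
loss_same = guide_mask_l1(mask, mask)
loss_empty = guide_mask_l1(mask, np.zeros((4, 4)))
```

In this sketch, a loss of this kind would push the predicted attention mask toward a given guide mask; how the guide mask is obtained and how the term is weighted against the other objectives is not covered by the abstract.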
