Few-Shot Image Synthesis using Noise-Based Deep Conditional Generative Adversarial Nets

  • Msiska, Finlyson Mwadambo (School of Computer Science and Engineering, Soongsil University) ;
  • Hassan, Ammar Ul (School of Computer Science and Engineering, Soongsil University) ;
  • Choi, Jaeyoung (School of Computer Science and Engineering, Soongsil University) ;
  • Yoo, Jaewon (Department of Small Business and Entrepreneurship, Soongsil University)
  • Received: 2020.10.12
  • Reviewed: 2021.02.02
  • Published: 2021.03.31

Abstract

In recent years, research on automatic font generation with machine learning has mainly focused on transformation-based methods; in comparison, generative model-based methods of font generation have received less attention. Transformation-based methods learn a mapping from an existing input to a target, which makes them ambiguous because a single input reference may correspond to multiple possible outputs. In this work, we focus on font generation using generative model-based methods, which learn to build up characters from noise to image. We propose a novel way to train a conditional generative deep neural model so that we can control the style of the generated font images. Our research demonstrates how to generate new font images conditioned on both character class labels and character style labels when using generative model-based methods. We achieve this by introducing a modified generator network that takes noise, character class, and style as inputs, which lets us compute losses separately for the character class labels and the character style labels. We show that adding the character style vector on top of the character class vector gives the model rich information about the font and enables us to explicitly specify not only the character class but also the character style that we want the model to generate.
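The conditioning scheme described above can be illustrated with a minimal sketch. The sizes below (100-dimensional noise, 26 character classes, 5 styles) are illustrative assumptions, not values from the paper; the point is that the class and style conditions are supplied as separate one-hot vectors concatenated to the noise before it enters the generator:

```python
import numpy as np

def one_hot(index, num_classes):
    """Return a one-hot row vector for the given label index."""
    v = np.zeros(num_classes, dtype=np.float32)
    v[index] = 1.0
    return v

def build_generator_input(z, char_class, style, num_classes=26, num_styles=5):
    """Concatenate the noise vector with separate one-hot class and style
    condition vectors, mirroring the idea of feeding the generator
    noise + character class + character style as distinct inputs."""
    return np.concatenate([z,
                           one_hot(char_class, num_classes),
                           one_hot(style, num_styles)])

rng = np.random.default_rng(0)
z = rng.standard_normal(100).astype(np.float32)          # hypothetical noise dim
g_in = build_generator_input(z, char_class=3, style=1)
print(g_in.shape)  # (131,): 100 noise + 26 class + 5 style dims
```

Because the class and style conditions occupy disjoint slices of the input, a discriminator (or auxiliary classifiers) can score each condition independently, which is what allows the class loss and style loss to be computed separately.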

Keywords
