Training Method for Enhancing Classification Accuracy of Kuzushiji-MNIST/49 using Deep Learning based on CNN

  • Park, Byung-Seo (Department of Electronic Material Engineering, Kwangwoon University)
  • Lee, Sungyoung (Ingenium College of Liberal Arts, Kwangwoon University)
  • Seo, Young-Ho (Department of Electronic Material Engineering, Kwangwoon University)
  • Received : 2019.10.26
  • Accepted : 2020.02.14
  • Published : 2020.03.31

Abstract

In this paper, we propose a deep learning training method for accurately classifying Kuzushiji-MNIST and Kuzushiji-49, two datasets of ancient and medieval Japanese characters. We experimentally compare recent convolutional neural networks to select the most suitable one, and then use that network to determine the number of training iterations for classifying the two datasets. We further raise accuracy by applying training techniques such as Mixup and Random Erase. With the proposed method, the trained network reaches an accuracy of 99.75% on MNIST, 99.07% on Kuzushiji-MNIST, and 97.56% on Kuzushiji-49. We expect this deep learning-based technique to provide a useful research base for researchers who study East Asian and Western history, literature, and culture.
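
The two augmentation techniques named in the abstract, Mixup and Random Erase, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `mixup_batch` and `random_erase`, the default values (`alpha`, `p`, `area_range`, `aspect_range`, `max_tries`), and the choice of uniform noise as the fill value are all assumptions made for illustration, and the images are assumed to be single-channel float arrays scaled to [0, 1].

```python
import numpy as np


def mixup_batch(images, labels, num_classes, alpha=0.2, rng=None):
    """Blend each sample with a randomly chosen partner from the same batch.

    images: float array of shape (N, H, W); labels: int array of shape (N,).
    Returns the mixed images and soft label vectors of shape (N, num_classes).
    alpha is an illustrative default, not a value taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)             # mixing weight in [0, 1]
    perm = rng.permutation(len(images))      # partner index for each sample
    one_hot = np.eye(num_classes, dtype=np.float32)[labels]
    mixed_x = lam * images + (1.0 - lam) * images[perm]
    mixed_y = lam * one_hot + (1.0 - lam) * one_hot[perm]
    return mixed_x, mixed_y


def random_erase(image, p=0.5, area_range=(0.02, 0.4),
                 aspect_range=(0.3, 3.3), rng=None, max_tries=10):
    """With probability p, overwrite one random rectangle with uniform noise.

    image: float array of shape (H, W). The input is never modified in place.
    All default parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() >= p:
        return image                          # leave this sample untouched
    h, w = image.shape
    out = image.copy()
    for _ in range(max_tries):
        area = rng.uniform(*area_range) * h * w    # target rectangle area
        aspect = rng.uniform(*aspect_range)        # height / width ratio
        eh = int(round(np.sqrt(area * aspect)))
        ew = int(round(np.sqrt(area / aspect)))
        if 0 < eh < h and 0 < ew < w:              # rectangle must fit
            top = rng.integers(0, h - eh + 1)
            left = rng.integers(0, w - ew + 1)
            out[top:top + eh, left:left + ew] = rng.random((eh, ew))
            return out
    return out                                # no fitting rectangle was found
```

In a training loop, one plausible arrangement is to apply `random_erase` to each image first and `mixup_batch` to the resulting batch, then train against the soft labels with a cross-entropy loss. The order and the per-dataset settings are design choices that the abstract does not specify.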
