Implementation of CNN in the view of mini-batch DNN training for efficient second order optimization

효과적인 2차 최적화 적용을 위한 Minibatch 단위 DNN 훈련 관점에서의 CNN 구현

Song, Hwa Jeon;Jung, Ho Young;Park, Jeon Gue
송화전;정호영;박전규

  • Received : 2016.04.28
  • Accepted : 2016.06.22
  • Published : 2016.06.30

Abstract

This paper describes implementation schemes for CNNs from the viewpoint of mini-batch DNN training, aimed at efficient second-order optimization. By arranging an input image as a sequence of local patches, the parameters of a CNN can be trained with the same update procedure used for a DNN, because processing the patch sequence is equivalent to mini-batch DNN training. With this conversion, second-order optimization, which provides higher performance, can be applied to CNN training in a straightforward way. In both image recognition on the MNIST DB and syllable-based automatic speech recognition, the proposed CNN implementation outperforms the DNN-based one.
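The patch arrangement described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`image_to_patches`, `conv_as_minibatch_dense`, `conv_direct`), the stride of 1, the absence of padding, and the single-channel input are all assumptions made for illustration. The sketch shows only the forward pass: once the image is laid out as rows of flattened patches, a convolutional layer reduces to the matrix product a fully connected layer applies to a mini-batch.

```python
import numpy as np


def image_to_patches(image, k):
    """Arrange a 2-D image as rows of flattened k x k patches.

    Assumes stride 1 and no padding (illustrative choices only).
    """
    h, w = image.shape
    out_h, out_w = h - k + 1, w - k + 1
    patches = np.empty((out_h * out_w, k * k), dtype=image.dtype)
    row = 0
    for i in range(out_h):
        for j in range(out_w):
            patches[row] = image[i:i + k, j:j + k].ravel()
            row += 1
    return patches, (out_h, out_w)


def conv_as_minibatch_dense(image, filters, bias):
    """Convolutional layer computed as a dense layer over a 'mini-batch' of patches.

    filters has shape (n_filters, k, k); the result has shape
    (n_filters, out_h, out_w).
    """
    n_filters, k, _ = filters.shape
    patches, (out_h, out_w) = image_to_patches(image, k)
    weights = filters.reshape(n_filters, k * k)   # dense-layer weight matrix
    activations = patches @ weights.T + bias      # one row per patch, as in a mini-batch
    return activations.T.reshape(n_filters, out_h, out_w)


def conv_direct(image, filters, bias):
    """Reference sliding-window computation, used only to check the result."""
    n_filters, k, _ = filters.shape
    out_h, out_w = image.shape[0] - k + 1, image.shape[1] - k + 1
    out = np.empty((n_filters, out_h, out_w))
    for f in range(n_filters):
        for i in range(out_h):
            for j in range(out_w):
                out[f, i, j] = np.sum(image[i:i + k, j:j + k] * filters[f]) + bias[f]
    return out
```

Because each patch plays the role of one training sample, any parameter update written for a dense layer on a mini-batch can in principle be reused on the patch matrix, which is the property the abstract relies on.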

Keywords

automatic speech recognition;DNN;CNN;second order optimization

Acknowledgement

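Grant : Development of core technology for spontaneous-speech spoken dialogue processing for language learning

Supported by : Institute for Information & communications Technology Promotion (IITP)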