An Approximate DRAM Architecture for Energy-efficient Deep Learning

  • Received : 2020.06.15
  • Accepted : 2020.06.16
  • Published : 2020.06.30

Abstract

We present an approximate DRAM architecture for energy-efficient deep learning. Our key premise is that by confining memory errors to non-critical information, we can significantly reduce DRAM refresh energy without compromising the recognition accuracy of deep neural networks. To validate this premise, we perform extensive Monte-Carlo simulations on several well-known convolutional neural networks, namely LeNet, ConvNet, and AlexNet, with the MNIST, CIFAR-10, and ImageNet datasets as inputs, respectively. Under the proposed architecture, we assume that the highest-order 8 bits (in single precision) and 4 bits (in half precision) are protected from retention errors, and we then randomly inject bit errors into the unprotected bits at various bit-error rates. The recognition accuracies of these convolutional neural networks are successfully maintained up to bit-error rates on the order of 10⁻⁵. We also simulate DRAM energy during inference of these networks, where the proposed architecture shows the potential for considerable energy savings of 10-37.5% of total DRAM energy.
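The error-injection model described above can be sketched as follows. This is an illustrative NumPy implementation, not the authors' simulation code: the function name `inject_bit_errors` and its parameters are assumptions, but the model matches the abstract, with the highest-order bits of each 32-bit word refresh-protected and each remaining bit flipping independently with the given bit-error rate.

```python
import numpy as np

def inject_bit_errors(weights, protected_bits=8, ber=1e-5, rng=None):
    """Flip random bits in the unprotected low-order bits of float32 values.

    The highest-order `protected_bits` of each 32-bit word are assumed
    refresh-protected; each of the remaining bits flips independently
    with probability `ber` (the bit-error rate).
    """
    rng = np.random.default_rng() if rng is None else rng
    bits = weights.astype(np.float32).view(np.uint32)
    unprotected = 32 - protected_bits          # e.g. 24 low-order bits
    # Bernoulli(ber) draw for every unprotected bit position
    flips = rng.random((bits.size, unprotected)) < ber
    # Combine the per-position flips into one XOR mask per word
    mask = (flips * (1 << np.arange(unprotected, dtype=np.uint64))).sum(
        axis=1).astype(np.uint32).reshape(bits.shape)
    return (bits ^ mask).view(np.float32)

# Example: corrupt a weight tensor at BER = 1e-5 before an inference pass
w = np.random.randn(1000).astype(np.float32)
w_err = inject_bit_errors(w, protected_bits=8, ber=1e-5)
```

In a Monte-Carlo experiment like the one in the abstract, this corruption would be applied to the network's weights (and possibly activations) before each inference run, repeating over many random seeds to estimate the accuracy at each bit-error rate.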
