This work was supported by the National Research Foundation of Korea (NRF-2017R1E1A1A03070311).