A review and comparison of convolution neural network models under a unified framework

Park, Jimin; Jung, Yoonsuh

doi:10.29220/CSAM.2022.29.2.161

Communications for Statistical Applications and Methods

Volume 29, Issue 2 / Pages 161-176 / 2022 / 2287-7843 (pISSN) / 2383-4757 (eISSN)

The Korean Statistical Society

Park, Jimin (Memory Business, Samsung Electronics) ;
Jung, Yoonsuh (Department of Statistics, Korea University)

Received: 2021.08.11
Reviewed: 2021.12.08
Published: 2022.03.31

https://doi.org/10.29220/CSAM.2022.29.2.161

Abstract

Image classification with deep convolutional neural network (CNN) models has been an active area of research. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC), held from 2010 to 2017, was one of the most important competitions driving the development of efficient deep learning algorithms. This paper introduces and compares six landmark models that achieved high prediction accuracy in ILSVRC. First, we review the models to illustrate their unique structures and characteristics. We then compare the models under a unified framework. To this end, additional devices that are not crucial to the structure are excluded. Prediction accuracy is measured on four popular data sets with different characteristics. By investigating the characteristics of the data sets and of the models being compared, we provide some insight into the architectural features of the models.

Funding

Yoonsuh Jung's work was partially supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2019R1A4A1028134 and 2021R1F1A1062347).

