Implementation of MNIST classification CNN with zero-skipping

Han, Seong-hyeon; Jung, Jun-mo

doi:10.7471/ikeee.2018.22.4.1238

Journal of IKEEE (전기전자학회논문지)

Volume 22, Issue 4, Pages 1238-1241, 2018

pISSN 1226-7244, eISSN 2288-243X

Institute of Korean Electrical and Electronics Engineers (한국전기전자학회)

Han, Seong-hyeon (Dept. of Computer Engineering, Seokyeong University) ;
Jung, Jun-mo (Dept. of Electronic Engineering, Seokyeong University)

Received : 2018.12.11
Accepted : 2018.12.17
Published : 2018.12.31

https://doi.org/10.7471/ikeee.2018.22.4.1238

Abstract

This paper implements an MNIST classification CNN with zero-skipping. About 30% to 40% of the activations in a CNN are zero. Because a zero operand does not affect the result of a multiply-accumulate (MAC) operation, skipping zeros with a branch can improve performance. In the convolution layer, however, skipping with a branch degrades performance. The convolution layer therefore skips the operation by issuing a NOP, which does not affect the result, while the fully connected layer skips through a branch. We confirmed a performance improvement of about 1.5 times over the conventional CNN.
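The skipping idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' GPU implementation: the function names (`mac_dense`, `mac_zero_skip`, `warp_mac_steps`), the sample values, and the lockstep warp cost model are all assumptions made for illustration. The sketch shows that skipping zero activations leaves the MAC result unchanged while reducing the number of MAC steps, and that under a simple lockstep assumption a group of threads saves a step only when every lane in the group holds a zero.

```python
def relu(values):
    """Rectified linear unit: non-positive inputs become exactly 0."""
    return [v if v > 0 else 0.0 for v in values]


def mac_dense(activations, weights):
    """Plain multiply-accumulate over every activation/weight pair."""
    acc = 0.0
    macs = 0
    for a, w in zip(activations, weights):
        acc += a * w
        macs += 1
    return acc, macs


def mac_zero_skip(activations, weights):
    """Multiply-accumulate that branches past zero activations."""
    acc = 0.0
    macs = 0
    for a, w in zip(activations, weights):
        if a == 0:
            continue  # a * w would add 0, so this MAC can be skipped
        acc += a * w
        macs += 1
    return acc, macs


def warp_mac_steps(activations, warp_size):
    """Toy lockstep model (assumption): a warp skips only if all lanes hold 0."""
    steps = 0
    for start in range(0, len(activations), warp_size):
        lanes = activations[start:start + warp_size]
        if any(a != 0 for a in lanes):
            steps += 1  # lanes run together, so zero lanes wait for the others
    return steps


# Hypothetical sample data, not taken from the paper.
pre_activations = [0.5, -1.0, 2.0, -0.25, 0.0, 1.5, -3.0, 0.75]
weights = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
activations = relu(pre_activations)

dense_sum, dense_macs = mac_dense(activations, weights)
skip_sum, skip_macs = mac_zero_skip(activations, weights)
zero_fraction = activations.count(0) / len(activations)
```

In this sample, half of the activations are zero, so the sequential branch version performs half as many MACs as the dense version and produces the same sum. In the toy warp model with four lanes per warp, neither warp is entirely zero, so no step is saved even though the branch is still evaluated. That is one plausible reading of why a branch can hurt in a layer whose threads run in lockstep, although the paper's own measurements are the authority on the actual cause.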

Fig. 1. GPU architecture.

Fig. 3. Thread allocation of the fully connected layer.

Fig. 2. Thread allocation of the convolution layer.

Fig. 3. Branch divergence in the convolution layer.

Table 1. MNIST classification CNN architecture.

Table 2. Processing time of the conventional CNN and the zero-skipping CNN.

Table 3. Processing time comparison.
