AB9: A neural processor for inference acceleration

Cho, Yong Cheol Peter; Chung, Jaehoon; Yang, Jeongmin; Lyuh, Chun-Gi; Kim, HyunMi; Kim, Chan; Ham, Je-seok; Choi, Minseok; Shin, Kyoungseon; Han, Jinho; Kwon, Youngsu

doi:10.4218/etrij.2020-0134

ETRI Journal

Volume 42, Issue 4 / Pages 491-504 / 2020 / 1225-6463 (pISSN) / 2233-7326 (eISSN)

Electronics and Telecommunications Research Institute (ETRI)

Cho, Yong Cheol Peter (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) ;
Chung, Jaehoon (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) ;
Yang, Jeongmin (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) ;
Lyuh, Chun-Gi (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) ;
Kim, HyunMi (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) ;
Kim, Chan (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) ;
Ham, Je-seok (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) ;
Choi, Minseok (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) ;
Shin, Kyoungseon (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) ;
Han, Jinho (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) ;
Kwon, Youngsu (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)

Received: 2020.04.16
Reviewed: 2020.07.02
Published: 2020.08.18

https://doi.org/10.4218/etrij.2020-0134

Abstract

We present AB9, a neural processor for inference acceleration. AB9 includes a systolic tensor core (STC), a neural network accelerator designed to speed up artificial intelligence applications by exploiting the data reuse and parallelism inherent in neural networks while providing fast access to large on-chip memory. Complementing the hardware is a user-friendly development environment, including a simulator and an implementation flow, that offers a high degree of programmability with a short development time. Our baseline implementation of AB9 combines a 40-TFLOP STC that includes 32k arithmetic units and over 36 MB of on-chip SRAM with a 1-GHz quad-core setup and various industry-standard peripheral intellectual properties. Acceleration performance and power efficiency were evaluated using YOLOv2, and the results show that AB9 surpasses a general-purpose graphics processing unit implementation on both measures. AB9 has been taped out in the TSMC 28-nm process with a chip size of 17 × 23 mm². Delivery is expected later this year.
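
The abstract does not describe the STC's internal dataflow, so the following is only a generic illustration of how a systolic array exploits data reuse and parallelism, not AB9's actual design. It is a minimal Python sketch of an output-stationary array computing C = A × B; the function name `systolic_matmul`, the skewed input schedule, and the register layout are all assumptions made for this example.

```python
def systolic_matmul(A, B):
    """Simulate a generic output-stationary systolic array (illustrative only).

    A is n x k and B is k x m. Processing element PE(i, j) accumulates C[i][j].
    Rows of A enter from the left edge and columns of B from the top edge,
    each delayed by one cycle per row/column so matching operands meet.
    """
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]    # one accumulator per PE
    a_reg = [[0] * m for _ in range(n)]  # operand travelling left to right
    b_reg = [[0] * m for _ in range(n)]  # operand travelling top to bottom
    cycles = n + m + k - 2               # cycles until the last PE finishes

    for t in range(cycles):
        # Visit PEs from bottom-right to top-left so that each PE still sees
        # the value its neighbour held in the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j == 0:
                    s = t - i            # row i is skewed by i cycles
                    a_in = A[i][s] if 0 <= s < k else 0
                else:
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    s = t - j            # column j is skewed by j cycles
                    b_in = B[s][j] if 0 <= s < k else 0
                else:
                    b_in = b_reg[i - 1][j]
                acc[i][j] += a_in * b_in  # one multiply-accumulate per PE
                a_reg[i][j] = a_in
                b_reg[i][j] = b_in
    return acc, cycles


if __name__ == "__main__":
    C, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(C, cycles)
```

In this sketch each element of A is fetched once at the array edge and then reused by all m processing elements in its row, and each element of B by all n elements in its column, while every processing element performs a multiply-accumulate in the same cycle. That is the general sense in which systolic designs trade memory traffic for on-array reuse.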

Cited by

Trends in AI processor compiler development, vol.36, pp.2, 2020, https://doi.org/10.22648/etri.2021.j.360204
