Performance Improvement and Power Consumption Reduction of an Embedded RISC Core

  • Jung, Hong-Kyun (Department of Information and Communication Engineering, Hanbat National University)
  • Jin, Xianzhe (Department of Information and Communication Engineering, Hanbat National University)
  • Ryoo, Kwang-Ki (Department of Information and Communication Engineering, Hanbat National University)
  • Received : 2011.09.01
  • Accepted : 2011.10.20
  • Published : 2012.03.31

Abstract

This paper presents a branch prediction algorithm and a 4-way set-associative cache to improve the performance of an embedded RISC core, together with a clock-gating algorithm based on observability don’t care (ODC) operations to reduce the core's power consumption. The branch prediction algorithm is built around a branch target buffer (BTB). The 4-way set-associative cache has a lower miss rate than a direct-mapped cache, and it uses a pseudo-least recently used (pseudo-LRU) replacement policy to reduce the number of LRU bits. The clock-gating algorithm reduces dynamic power consumption. According to performance and dynamic power estimates, the performance of the OpenRISC core with the proposed architecture applied improves by about 29%, and the dynamic power of the core, using the Chartered 0.18 $\mu$m technology library, is reduced by 16%.
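To illustrate why a pseudo-LRU policy needs fewer state bits, the sketch below models one set of a 4-way cache with a tree-based pseudo-LRU. The abstract does not say which pseudo-LRU variant the authors use, so the tree scheme, the class name `TreePLRU4`, and its methods are assumptions chosen for illustration only. The tree keeps 3 bits per set, whereas an exact LRU ordering of 4 ways has 4! = 24 possible states and therefore needs at least 5 bits.

```python
class TreePLRU4:
    """Tree-based pseudo-LRU state for one 4-way cache set (hypothetical sketch).

    Three bits form a small binary tree, and each bit points toward the
    side that should be evicted next:
      bits[0]: 0 -> victim is in ways 0/1, 1 -> victim is in ways 2/3
      bits[1]: 0 -> way 0 is the victim of the left pair, 1 -> way 1
      bits[2]: 0 -> way 2 is the victim of the right pair, 1 -> way 3
    """

    def __init__(self):
        self.bits = [0, 0, 0]

    def touch(self, way):
        """Record an access to `way` by pointing the tree away from it."""
        if way < 2:
            self.bits[0] = 1        # next victim should come from ways 2/3
            self.bits[1] = 1 - way  # within the left pair, point at the other way
        else:
            self.bits[0] = 0        # next victim should come from ways 0/1
            self.bits[2] = 3 - way  # within the right pair, point at the other way

    def victim(self):
        """Return the way that would be replaced on the next miss."""
        if self.bits[0] == 0:
            return self.bits[1]
        return 2 + self.bits[2]


if __name__ == "__main__":
    plru = TreePLRU4()
    for way in (0, 2, 1, 3):
        plru.touch(way)
        print(f"accessed way {way}, next victim: way {plru.victim()}")
```

The policy is only an approximation: the tree remembers which half and which way within each pair was touched last, not the full access order, so the chosen victim is not always the true least recently used way.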

Cited by

  1. Compressed Instruction Set Architecture for the OpenRISC Processor vol.17, pp.10, 2012, https://doi.org/10.9708/jksci/2012.17.10.011