Leakage Energy Management Techniques via Shared L2 Cache Partitioning

캐시 파티션을 이용한 공유 2차 캐시 누설 에너지 관리 기법

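  • 강희준 (Department of Computer Engineering, Seoul National University)
  • 김현희 (Department of Computer Engineering, Seoul National University)
  • 김지홍 (Department of Computer Engineering, Seoul National University)
  • Published: 2010.02.15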

Abstract

Existing timeout-based cache leakage management techniques significantly reduce the leakage energy consumption of a cache by switching off the power supply to inactive cache lines. Because these techniques were mainly proposed for single-processor systems, their efficiency drops significantly in multiprocessor systems with a shared L2 cache, where simultaneously executing tasks interfere with one another in the cache. In this paper, we propose a novel cache partitioning strategy that partitions the shared L2 cache based on the inactive cycles of its cache lines. Furthermore, we propose an adaptive task-aware timeout management technique that considers the characteristics of each task and adapts the timeout dynamically. Simulation results show that, compared to the existing representative leakage management technique, the proposed technique reduces the leakage energy consumption of the shared L2 cache by 73% on average for the 2-way CMP and by 56% on average for the 4-way CMP.

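Existing timeout-based cache leakage energy management techniques reduce leakage energy consumption by cutting the power supply to inactive cache lines that have not been used for some time. However, because these techniques were designed for single-processor environments, their energy savings are hindered in multiprocessor environments with a shared L2 cache, where interference among tasks occurs frequently. In this paper, we propose a technique that increases the leakage energy savings of the shared L2 cache in multiprocessor environments by reducing cache interference through a cache partitioning strategy that takes cache line inactive time into account. We also propose an adaptive timeout management technique that sets the timeout according to the characteristics of each task in order to reduce cache leakage energy consumption. Simulation results confirm that, compared with the existing technique, leakage energy consumption is reduced by about 73% on average for the 2-way CMP and by about 56% on average for the 4-way CMP.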
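
The timeout mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `CacheLine`, `decay_step`, and `access`, the per-task timeout table, and all cycle values are hypothetical, and cache partitioning itself is not modeled. The sketch only assumes that a line is switched off once it has been idle longer than the timeout assigned to the task that last used it.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    owner: int            # hypothetical ID of the task that last touched the line
    last_access: int      # cycle of the most recent access
    powered: bool = True  # False once the line has been switched off


def decay_step(lines, now, timeouts):
    """Switch off every powered line that has been idle longer than its owner's timeout.

    Returns the number of lines switched off in this step.
    """
    switched_off = 0
    for line in lines:
        idle = now - line.last_access
        if line.powered and idle > timeouts[line.owner]:
            line.powered = False  # contents are lost, leakage stops
            switched_off += 1
    return switched_off


def access(line, task, now):
    """Record an access; returns True if the line had been switched off (an extra miss)."""
    was_off = not line.powered
    line.owner = task
    line.last_access = now
    line.powered = True
    return was_off


# Two lines owned by two tasks with different (illustrative) timeouts in cycles.
lines = [CacheLine(owner=0, last_access=0), CacheLine(owner=1, last_access=0)]
timeouts = {0: 1000, 1: 8000}
off = decay_step(lines, now=2000, timeouts=timeouts)
extra_miss = access(lines[0], task=0, now=2500)
```

In this sketch a short timeout saves more leakage energy but risks extra misses when a switched-off line is accessed again, which is the trade-off a per-task timeout would have to balance.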
