Dead Block-Aware Adaptive Write Scheme for MLC STT-MRAM Caches

  • Hong, Seokin (School of Computer Science and Engineering, Kyungpook National University)
  • Received : 2020.02.11
  • Accepted : 2020.03.03
  • Published : 2020.03.31

Abstract

This paper proposes an adaptive write scheme that improves write performance in multi-level cell (MLC) STT-MRAM caches. The key idea is to use a fast write operation when the target MLC STT-MRAM cells contain a dead block. A fast write may evict a cache block stored in those cells, but the performance impact is small when the evicted block is a dead block, that is, a block that will not be used again. An experimental evaluation with a memory simulator shows that the proposed scheme improves the performance of MLC STT-MRAM caches by 17% on average.

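The write-mode decision described in the abstract can be sketched as follows. This is a minimal illustration and not the paper's implementation: the names `Block`, `CellGroup`, and `write_block`, the two-blocks-per-cell-group layout, and the cycle counts are all assumptions made for the example, and the dead block predictor is reduced to a precomputed flag.

```python
from dataclasses import dataclass
from typing import NamedTuple, Optional

# Hypothetical latencies in cycles; the abstract does not give these numbers.
FAST_WRITE_CYCLES = 10
SLOW_WRITE_CYCLES = 20


@dataclass
class Block:
    tag: int
    # Assumed to be set by some dead block predictor, which is not modeled here.
    predicted_dead: bool = False


@dataclass
class CellGroup:
    """A group of MLC cells assumed to hold two cache blocks at once."""
    target: Optional[Block] = None       # slot being written
    co_resident: Optional[Block] = None  # block a fast write would destroy


class WriteResult(NamedTuple):
    mode: str                 # "fast" or "slow"
    cycles: int               # latency charged for this write
    evicted: Optional[Block]  # block lost because of a fast write, if any


def write_block(group: CellGroup, incoming: Block) -> WriteResult:
    """Write `incoming`, choosing the fast mode only when nothing useful is lost."""
    victim = group.co_resident
    if victim is None or victim.predicted_dead:
        # Fast write: the co-resident block (if any) is evicted, which is
        # assumed to be cheap because it is predicted not to be reused.
        # Writing back a dirty victim is not modeled in this sketch.
        group.co_resident = None
        group.target = incoming
        return WriteResult("fast", FAST_WRITE_CYCLES, victim)
    # Slow write: the co-resident block is still live, so it is preserved.
    group.target = incoming
    return WriteResult("slow", SLOW_WRITE_CYCLES, None)
```

For example, calling `write_block` on a `CellGroup` whose `co_resident` block has `predicted_dead=True` takes the fast path and returns that block as `evicted`, whereas a live co-resident block forces the slow path and stays in place. Treating an empty co-resident slot as safe for a fast write is a choice made for this sketch.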
