Fault Tolerant Cache for Soft Error

소프트에러 결함 허용 캐쉬

  • Published : 2008.01.01

Abstract

In this paper, we propose a new cache structure for effective correction of soft errors. We add a check bit and a SEEB (soft error evaluation block) to evaluate the status of each cache line. The SEEB stores the result of each parity check in a two-bit shift register and sets the check bit to '1' when the parity check fails twice on the same cache line. A line whose parity check has failed twice is treated as vulnerable to soft errors. We also propose a new replacement algorithm that, when data is filled into the cache, uses only the blocks that the SEEB has determined to be valid. This structure prohibits vulnerable lines from being used, while a line whose parity check has failed only once can still be reused, which contributes to efficient use of the cache. We tried to minimize the side effects of the proposed cache; experimental results using the SPEC2000 benchmark showed a 3% degradation in hit rate, a 15% timing overhead due to the parity logic, and a 2.7% area overhead. This overhead can be considered trivial as far as the SEEB is concerned, because almost all fault-tolerant designs inevitably adopt the parity method despite its overhead. Moreover, if only parity logic is used, the design has a 5% to 10% advantage over ECC logic. With the proposed cache, the system is protected from the threat of soft errors in the cache, and the hit rate can be maintained at the level of a cache without soft errors.
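The mechanism described in the abstract can be illustrated with a small software model. The sketch below is not the authors' hardware design; it is a minimal Python illustration under stated assumptions. The names `Line`, `CacheSet`, `parity_bit`, `read`, and `fill` are hypothetical. It assumes even parity over the stored word, that the result of every parity check (pass or fail) is shifted into the two-bit register, that the check bit is set once both bits record a failure, and that the replacement policy is LRU restricted to lines whose check bit is 0. The abstract does not say whether the two failures must be consecutive, so that choice is an assumption as well.

```python
from dataclasses import dataclass
from typing import List, Optional


def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of set bits."""
    return bin(word).count("1") & 1


@dataclass
class Line:
    tag: Optional[int] = None   # None means the line holds no valid data
    data: int = 0
    parity: int = 0             # parity stored when the line was filled
    history: int = 0            # two-bit shift register, 1 = failed check
    check_bit: int = 0          # 1 = treated as vulnerable, never refilled
    last_used: int = 0          # timestamp for LRU ordering


class CacheSet:
    """One cache set with a hypothetical SEEB-style evaluation per line."""

    def __init__(self, ways: int = 2) -> None:
        self.lines: List[Line] = [Line() for _ in range(ways)]
        self.clock = 0

    def _evaluate(self, line: Line) -> bool:
        """Check parity, shift the result into the history, update check bit."""
        failed = parity_bit(line.data) != line.parity
        line.history = ((line.history << 1) | int(failed)) & 0b11
        if line.history == 0b11:        # assumption: two failures in a row
            line.check_bit = 1
        return not failed

    def read(self, tag: int) -> Optional[int]:
        """Return the data on a clean hit, None on a miss or parity failure."""
        self.clock += 1
        for line in self.lines:
            if line.tag == tag:
                if self._evaluate(line):
                    line.last_used = self.clock
                    return line.data
                line.tag = None         # corrupted copy: invalidate it
                return None             # caller refetches from memory
        return None

    def fill(self, tag: int, data: int) -> Optional[int]:
        """Fill using LRU over lines whose check bit is 0; return the way."""
        self.clock += 1
        usable = [i for i, ln in enumerate(self.lines) if ln.check_bit == 0]
        if not usable:
            return None                 # every line in the set is excluded
        # Prefer empty lines, then the least recently used one.
        way = min(usable, key=lambda i: (self.lines[i].tag is not None,
                                         self.lines[i].last_used))
        ln = self.lines[way]
        ln.tag, ln.data, ln.parity = tag, data, parity_bit(data)
        ln.last_used = self.clock
        return way
```

In this model, a line that fails one parity check is invalidated but remains a candidate for the next fill, which corresponds to the reuse of lines that have failed only once. A line that fails a second check in a row has its check bit set and is skipped by `fill` from then on. The model detects errors and forces a refetch; it does not correct data in place.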
