Switchover Time Analysis of Primary-Backup Server Systems Based on Software Rejuvenation

  • Lee, Jae-Sung (Information Communication, Graduate School of Ajou University) ;
  • Park, Kie-Jin (Electronics and Telecommunications Research Institute) ;
  • Kim, Sung-Soo (Information Communication, Graduate School of Ajou University)
  • Published : 2001.06.01

Abstract

With the rapid growth of the Internet, computer systems are growing in size and complexity. To meet the high-availability requirements of these systems, both hardware and software fault-tolerance techniques are commonly used. To prevent failures caused by software aging, which arises when software runs for a long mission time, we adopt software rejuvenation, a method that intentionally stops and restarts the software on the servers. Rejuvenation returns the system to a clean, healthy state in which the probability of a fault occurring is very low. In this paper, we study how switchover time affects software rejuvenation in primary-backup server systems. Through experiments, we find that switchover time is an essential factor in deciding the rejuvenation policy.

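With the rapid spread of the Internet, computer systems are steadily growing in scale and complexity, and research on hardware and software fault-tolerance techniques is active in order to meet the high availability demanded of these systems. Software rejuvenation addresses the software-aging phenomenon that results from running server software for long periods. It is a form of software fault prevention in which the software running on a server is intentionally stopped and then restarted from an initial state where faults are unlikely to occur. In this study, we examine how switchover time in a primary-backup server system affects software rejuvenation, and through availability analysis we find that switchover time is an important factor in deciding the rejuvenation policy.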
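
To make the link between switchover time and rejuvenation policy concrete, the following is a minimal sketch and not the model used in the paper. It assumes a textbook age-replacement (renewal-reward) model: time to failure follows a Weibull distribution with shape greater than 1 to mimic aging, every cycle ends with one switchover to the backup, and an unplanned failure adds an extra repair delay. All function names and parameter values (`scale`, `shape`, `repair`, the candidate intervals, times in hours) are hypothetical.

```python
import math


def survival(t, scale, shape):
    """Probability that the software has not failed by age t (Weibull)."""
    return math.exp(-((t / scale) ** shape))


def expected_uptime(interval, scale, shape, steps=1000):
    """E[min(failure time, interval)], integrated with the trapezoidal rule."""
    h = interval / steps
    total = 0.5 * (survival(0.0, scale, shape) + survival(interval, scale, shape))
    for i in range(1, steps):
        total += survival(i * h, scale, shape)
    return total * h


def availability(interval, switchover, repair, scale=1000.0, shape=2.0):
    """Steady-state availability of one rejuvenation cycle (assumed model).

    Each cycle costs one switchover; a failure before the planned
    rejuvenation additionally costs `repair` hours of downtime.
    """
    up = expected_uptime(interval, scale, shape)
    p_fail = 1.0 - survival(interval, scale, shape)
    down = switchover + p_fail * repair
    return up / (up + down)


def best_interval(switchover, repair, candidates):
    """Pick the candidate rejuvenation interval with the highest availability."""
    return max(candidates, key=lambda t: availability(t, switchover, repair))


# Hypothetical inputs: intervals of 10..2000 hours, 2 hours of extra repair.
candidates = range(10, 2001, 10)
for s in (0.01, 0.5):
    t = best_interval(s, repair=2.0, candidates=candidates)
    print(f"switchover={s} h -> best interval={t} h, "
          f"availability={availability(t, s, 2.0):.6f}")
```

Under these assumptions, a longer switchover time makes each rejuvenation more expensive, so the interval that maximizes availability moves outward. This is one simple way to see why switchover time can matter when choosing a rejuvenation policy.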
