A Fault-Tolerant Scheme Based on Message Passing for Mission-Critical Computers
Kim, Taehyon; Bae, Jungil; Shin, Jinbeom; Cho, Kilseok
Fault tolerance is a crucial design requirement for mission-critical computers, such as engagement control computers, that must keep operating throughout long missions. In recent years, software-based fault-tolerant design has become increasingly important because of its cost-effectiveness and efficiency. In this paper, we propose MPCMCC, a model-based software component that implements fault tolerance in mission-critical computers. MPCMCC synchronizes shared data between two computers using a one-way message-passing scheme, which is easier to use and more stable than a shared-memory scheme. In addition, because it follows a model-based development methodology, MPCMCC can be readily reused in future work. We verified the functions of the software component and analyzed its performance in a simulation environment with two mission-critical computers. The results show that MPCMCC is a suitable software component for fault tolerance in mission-critical computers.
Engagement Control System; Fault-Tolerant Computer; Message Passing Synchronization; Model-Based Development
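The abstract describes synchronizing shared data between two computers through one-way message passing. The sketch below illustrates that general idea only and is not the MPCMCC design: the class names `Active`, `Standby`, and `SyncMessage`, the sequence-number check, and the in-process `queue.Queue` standing in for the link between the two computers are all assumptions made for illustration.

```python
import queue
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SyncMessage:
    """Hypothetical update message sent from the active to the standby computer."""
    seq: int      # monotonically increasing sequence number
    key: str      # name of the shared data item
    value: Any    # new value of that item


class Active:
    """Hypothetical active computer: owns the shared data and publishes every change."""

    def __init__(self, channel: queue.Queue) -> None:
        self.channel = channel
        self.state: dict = {}
        self.seq = 0

    def update(self, key: str, value: Any) -> None:
        # Apply the change locally, then send it one way to the standby.
        self.state[key] = value
        self.seq += 1
        self.channel.put(SyncMessage(self.seq, key, value))


class Standby:
    """Hypothetical standby computer: only receives messages, never writes back."""

    def __init__(self, channel: queue.Queue) -> None:
        self.channel = channel
        self.state: dict = {}
        self.last_seq = 0

    def drain(self) -> int:
        """Apply all pending messages in order; return how many were applied."""
        applied = 0
        while True:
            try:
                msg = self.channel.get_nowait()
            except queue.Empty:
                return applied
            if msg.seq <= self.last_seq:
                continue  # ignore stale or duplicated messages
            self.state[msg.key] = msg.value
            self.last_seq = msg.seq
            applied += 1


def demo():
    # The in-process queue is an assumed stand-in for the inter-computer link.
    channel = queue.Queue()
    active, standby = Active(channel), Standby(channel)
    active.update("track_count", 3)
    active.update("engagement_mode", "auto")
    active.update("track_count", 4)
    standby.drain()
    return active.state, standby.state, standby.last_seq
```

Because data flows in one direction only, the standby never contends with the active computer for the shared data, which is one plausible reading of why the abstract calls message passing more stable than shared memory. After `demo()` runs, the standby holds a copy of the active computer's data and could take over from it.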