On-the-fly Atomicity Violation Repairing Technique for Airborne Health Management Systems

Autonomous Atomicity Violation Repair Software Technique for Aircraft Health Management Systems

  • Choi, Eu-Teum (Department of Aerospace Software Engineering, Gyeongsang National University)
  • Lee, Dong-Su (Korea Aerospace Industries, LTD.)
  • Jun, Yong-Kee (Department of Aerospace Software Engineering, Gyeongsang National University)
  • Lee, Seongjin (Department of Aerospace Software Engineering, Gyeongsang National University)
  • Received : 2020.03.26
  • Accepted : 2020.06.16
  • Published : 2020.07.01

Abstract

An airborne health management system prevents functional failures caused by errors or faults in airborne software. Repairing atomicity violations (AVs) on the fly in ARINC-653 concurrent software is critical to guaranteeing that the software executes correctly. This paper proposes Repairing-AV, a technique that repairs atomicity violations efficiently. Repairing-AV diagnoses and prevents an error on the fly by using the training results of the software, and it controls access to the shared variable in the thread where the error occurred. The evaluation applies both the previous work and Repairing-AV to five synthetic programs containing atomicity violations and measures the time overhead. The results show that Repairing-AV incurs a constant time overhead of about 1.4x regardless of the number of shared-variable accesses.

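An aircraft health management system prevents aircraft functions from failing because of errors or faults in aircraft software. Autonomous repair of atomicity violations in ARINC-653 concurrent programs is important because it guarantees normal program execution. This paper presents Repairing-AV, a technique that uses program experiment results to predict atomicity violations during execution and repairs them by delaying the key related access events. An existing technique and Repairing-AV were applied to synthetic programs containing five atomicity violation patterns that occurred in real-world software, and their repair time overheads were compared. The experiments confirmed that Repairing-AV has a constant time overhead of 1.4x on average, regardless of the number of shared-variable accesses.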
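
To make the idea of repairing by delaying access events concrete, the following is a minimal sketch, not the paper's implementation. It replays one fixed interleaving on a single shared variable instead of running real ARINC-653 processes. The function `replay`, the operation names `read`, `add` and `set`, and the `REGION_LEN` table standing in for a training result are all hypothetical names introduced for illustration.

```python
from collections import deque


def replay(trace, region_len, repair):
    """Replay a fixed interleaving of accesses to one shared variable.

    trace      -- list of (thread, op, arg); op is "read", "add" or "set"
    region_len -- assumed training result: thread -> number of consecutive
                  accesses by that thread that must not be interleaved
    repair     -- when True, delay remote accesses that arrive mid-region
    """
    shared = 0
    last_read = {}                 # value each thread read most recently
    owner, remaining = None, 0     # thread inside an atomic region, accesses left
    delayed = deque()              # remote accesses held back by the repair
    order = []                     # accesses in the order they really executed

    def execute(thread, op, arg):
        nonlocal shared
        if op == "read":
            last_read[thread] = shared
        elif op == "add":          # write back the stale-or-fresh read plus arg
            shared = last_read[thread] + arg
        else:                      # "set": unconditional write
            shared = arg
        order.append((thread, op))

    for thread, op, arg in trace:
        if repair and owner is not None and thread != owner:
            delayed.append((thread, op, arg))      # delay the remote access
            continue
        if repair and owner is None and thread in region_len:
            owner, remaining = thread, region_len[thread]
        execute(thread, op, arg)
        if owner == thread:
            remaining -= 1
            if remaining == 0:                     # region finished: release
                owner = None
                while delayed:
                    execute(*delayed.popleft())
    while delayed:                                 # flush if a region never closed
        execute(*delayed.popleft())
    return shared, order


# T1 increments the variable (read, then add 1); T2 overwrites it in between.
TRACE = [("T1", "read", None), ("T2", "set", 10), ("T1", "add", 1)]
REGION_LEN = {"T1": 2}   # hypothetical training result: T1's two accesses are atomic

print(replay(TRACE, REGION_LEN, repair=False)[0])  # 1: T2's write is lost
print(replay(TRACE, REGION_LEN, repair=True)[0])   # 10: same as running T1, then T2
```

Without the repair, the final value 1 matches neither serial order of the two threads (T1 then T2 gives 10, T2 then T1 gives 11). With the repair, T2's write is held until T1's region completes, so the outcome equals a serial execution. A real monitor would also have to handle nested or overlapping regions, which this sketch ignores.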
