On-the-fly Detection of Race Conditions in Message-Passing Programs

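  • Mi-Young Park (Ubiquitous Information Appliances Project Group, Chonnam National University)
  • Moon-Hye Kang (Department of Information Science, Gyeongsang National University)
  • Yong-Kee Jun (Department of Information Science, Gyeongsang National University)
  • Hyuk-Ro Park (Department of Computer Science, Chonnam National University)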
  • Published : 2007.08.15

Abstract

Message races must be detected when debugging message-passing parallel programs because they can cause nondeterministic executions. In particular, it is important to detect the first race in each process, because the first race can cause other races to occur in the same process. Previous techniques for detecting first races either require at least two monitored runs of the program or analyze a trace file whose size is proportional to the number of messages. This paper introduces an on-the-fly technique that detects the first race in each process in a single execution, without generating any trace file. In experiments with a set of benchmark programs, the technique detected the first race in each process in every benchmark.
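To make the notion of a first race concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes vector timestamps piggybacked on every message, treats every receive as a wildcard receive, and ignores channel ordering guarantees. The class `Process` and the functions `racing_demo` and `ordered_demo` are hypothetical names introduced only for illustration. When a message arrives, the receiver checks, without any trace file, whether the send is causally after each of its earlier receives; the earliest receive for which it is not is recorded as that process's first race.

```python
from typing import List, Optional


class Process:
    """Hypothetical monitored process; every receive is assumed to be a wildcard receive."""

    def __init__(self, pid: int, n: int) -> None:
        self.pid = pid
        self.clock: List[int] = [0] * n       # vector clock of this process
        self.recv_ticks: List[int] = []       # local clock value at each receive
        self.first_race: Optional[int] = None  # index of the earliest racing receive

    def send(self) -> List[int]:
        """Tick the local clock and return the timestamp carried by the message."""
        self.clock[self.pid] += 1
        return list(self.clock)

    def receive(self, msg_clock: List[int]) -> None:
        """Check the incoming message against earlier receives, then merge clocks."""
        # How much of this process's history the sender had seen when it sent.
        known = msg_clock[self.pid]
        for idx, tick in enumerate(self.recv_ticks):
            if tick > known:
                # Receive idx does not happen before the send, so this message
                # could have been delivered there instead: a message race.
                if self.first_race is None or idx < self.first_race:
                    self.first_race = idx
                break  # ticks increase, so this is the earliest such receive
        self.clock[self.pid] += 1
        self.clock = [max(a, b) for a, b in zip(self.clock, msg_clock)]
        self.recv_ticks.append(self.clock[self.pid])


def racing_demo() -> Optional[int]:
    """P1 and P2 send to P0 concurrently; both messages race toward P0's first receive."""
    p0, p1, p2 = Process(0, 3), Process(1, 3), Process(2, 3)
    m1, m2 = p1.send(), p2.send()
    p0.receive(m1)
    p0.receive(m2)
    return p0.first_race


def ordered_demo() -> Optional[int]:
    """P1 sends its second message only after P0 acknowledged the first; no race."""
    p0, p1 = Process(0, 2), Process(1, 2)
    p0.receive(p1.send())
    p1.receive(p0.send())   # acknowledgement orders the two sends from P1
    p0.receive(p1.send())
    return p0.first_race


if __name__ == "__main__":
    print(racing_demo(), ordered_demo())
```

In this sketch each process keeps only one integer per earlier receive, so the check runs during the monitored execution itself. A practical tool would also have to account for receives that name a specific source and for the ordering guarantees of the message-passing library.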
