Compiler-Triggered C-Level Error Check

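  • 정지문 (School of Electrical Engineering and Computer Science, Seoul National University)
  • 윤종희 (Department of Computer Engineering, Gangneung-Wonju National University)
  • 이종원 (School of Electrical Engineering and Computer Science, Seoul National University)
  • 백윤흥 (School of Electrical Engineering and Computer Science, Seoul National University)
  • Received : 2011.02.16
  • Reviewed : 2011.04.06
  • Published : 2011.06.30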

Abstract

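IR (Intermediate Representation) optimization is an important part of a compiler back-end and applies techniques such as common sub-expression elimination and dead code elimination. Errors introduced during the IR optimization phase, however, are hard to detect and debug. First, it is difficult to check for errors by deciphering the compiled assembly code. Second, it is difficult to determine whether an error actually originated in the IR optimization phase. For these reasons, we have studied techniques for checking at the C level whether IR code transformations are free of defects. Building on the MeCC (Memory Comparison-based Clone) detector, we convert the IR code before optimization and the IR code after optimization back into C code, give the two C programs to MeCC as input, and check whether the results match. Because MeCC does not always report a perfect result, we pre-process information about the characteristics of each IR optimization technique to raise the accuracy of the result. In this paper, we experiment with code transformed by optimizations such as dead code elimination, instruction scheduling, and common sub-expression elimination, and show that MeCC correctly performs error checking on the C-level code.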
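To make the before/after comparison concrete, the following is a minimal sketch of the kind of C code pair that an IR-to-C conversion might produce around common sub-expression elimination and dead code elimination. All function names here (`before_opt`, `after_opt`, `after_opt_buggy`, `agree_on_range`) are hypothetical and do not come from the paper. The check simply runs both versions on a small range of inputs, which is a much weaker stand-in for the static comparison that MeCC performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical C code regenerated from the IR before optimization. */
int before_opt(int a, int b)
{
    int unused = a * 7;   /* dead code: the value is never used */
    int x = (a + b) * 2;
    int y = (a + b) * 3;  /* (a + b) is computed a second time */
    (void)unused;         /* only silences unused-variable warnings */
    return x + y;
}

/* Hypothetical C code regenerated after common sub-expression
 * elimination and dead code elimination. */
int after_opt(int a, int b)
{
    int t = a + b;        /* shared sub-expression computed once */
    return t * 2 + t * 3;
}

/* A deliberately wrong transformation, for illustration only:
 * the shared sub-expression was rewritten incorrectly. */
int after_opt_buggy(int a, int b)
{
    int t = a - b;        /* bug: should be a + b */
    return t * 2 + t * 3;
}

typedef int (*binary_fn)(int, int);

/* Returns true when f and g return the same value for every pair
 * (a, b) with lo <= a, b <= hi. Agreement on a finite range is
 * evidence of equivalence, not a proof. */
bool agree_on_range(binary_fn f, binary_fn g, int lo, int hi)
{
    assert(lo <= hi);
    for (int a = lo; a <= hi; ++a) {
        for (int b = lo; b <= hi; ++b) {
            if (f(a, b) != g(a, b)) {
                return false;
            }
        }
    }
    return true;
}
```

Calling `agree_on_range(before_opt, after_opt, -8, 8)` from a test driver reports agreement, while the same call with `after_opt_buggy` reports a mismatch. A range check like this can miss errors that only appear outside the sampled inputs, which is one reason a static, semantics-based comparison is attractive.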

We describe a technique for automatically proving compiler optimizations sound, meaning that their transformations are always semantics-preserving. IR (Intermediate Representation) optimization is an important step in a compiler back-end, but errors introduced by IR optimizations are difficult for compiler developers to detect and debug. We therefore introduce a C-level error check system that verifies the correctness of these IR transformations. Our technique is based on the Memory Comparison-based Clone (MeCC) detector, a tool for detecting semantic equivalence at the C level, and MeCC accepts only C code as input. We therefore first build an IR-to-C converter that translates the IR into C code before and after each compiler optimization phase. MeCC uses a path-sensitive, semantics-based static analyzer to estimate the memory state at the exit point of each procedure, and compares these memory states to decide whether two procedures are equivalent. However, MeCC cannot guarantee that two semantically equivalent programs always receive 100% similarity, or that two programs with different semantics never receive 100% similarity. To make the results more reliable, we describe how to generate C code in the IR-to-C transformation phase and how to pass optimization information to MeCC so that these unexpected outcomes are avoided. We illustrate the methodology with three familiar optimizations (dead code elimination, instruction scheduling, and common sub-expression elimination), and our experimental results show that the C-level error check system is highly reliable.
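The idea of comparing memory states at the exit point of a procedure can be sketched in C as follows, using instruction scheduling as the optimization. This is only an illustration under stated assumptions: the `mem_state` type and every function name are hypothetical, and the sketch executes the procedures on one concrete input, whereas MeCC estimates the exit states statically.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the memory a procedure may write to. */
typedef struct {
    int r[3];
} mem_state;

/* Statement order as it might be regenerated from the unoptimized IR. */
void proc_original(mem_state *m, int a, int b)
{
    m->r[0] = a + b;
    m->r[1] = a * 2;        /* independent of r[0] */
    m->r[2] = m->r[0] - b;  /* depends on r[0] */
}

/* Valid schedule: only the two independent stores are swapped. */
void proc_scheduled(mem_state *m, int a, int b)
{
    m->r[1] = a * 2;
    m->r[0] = a + b;
    m->r[2] = m->r[0] - b;
}

/* Invalid schedule, for illustration only: r[2] is computed
 * before r[0] has been written, so it reads a stale value. */
void proc_bad_schedule(mem_state *m, int a, int b)
{
    m->r[2] = m->r[0] - b;
    m->r[0] = a + b;
    m->r[1] = a * 2;
}

typedef void (*proc_fn)(mem_state *, int, int);

/* Runs p and q from the same entry state and compares the memory
 * each one leaves behind at its exit point. */
bool same_exit_state(proc_fn p, proc_fn q, const mem_state *entry,
                     int a, int b)
{
    assert(entry != NULL);
    mem_state mp = *entry;
    mem_state mq = *entry;
    p(&mp, a, b);
    q(&mq, a, b);
    return memcmp(mp.r, mq.r, sizeof mp.r) == 0;
}
```

Starting from an all-zero entry state with `a = 3` and `b = 4`, the valid schedule leaves the same exit state as the original, and the invalid schedule does not. If the entry value of `r[0]` happened to equal `a + b`, the invalid schedule would go unnoticed on that input, which shows why a single concrete run is weaker than a static estimate over all paths.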
