Efficient Similarity Analysis Methods for Same Open Source Functions in Different Versions

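  • 김영철 (Department of Computer Engineering, Chungnam National University) ;
  • 조은선 (Department of Computer Engineering, Chungnam National University)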
  • Received : 2017.01.22
  • Accepted : 2017.06.27
  • Published : 2017.10.15

Abstract

Binary similarity analysis is used in vulnerability analysis, malware analysis, and plagiarism detection. Showing through similarity analysis that a function under analysis is the same as a known safe function, even when it comes from a different version, can make both malicious-behavior analysis and vulnerability analysis of binary code more efficient. However, few studies have specifically addressed similarity analysis between different versions of the same function. In this paper, we analyze function-level similarity with several methods based on function information that can be extracted from binaries, and look for an approach that performs the analysis efficiently in less time. In particular, we run the analysis on different versions of the OpenSSL library and confirm that similar functions are detected even when the versions differ.
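To make function-level similarity concrete, here is a minimal sketch of one common technique: comparing the sets of opcode k-grams of two functions with the Jaccard index. The abstract does not say which methods the paper evaluates, so this is an illustrative assumption and not the authors' method. The names `opcode_kgrams` and `jaccard_similarity`, the choice of k = 3, and the two opcode sequences are all hypothetical.

```python
from typing import List, Set, Tuple


def opcode_kgrams(opcodes: List[str], k: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of k consecutive opcode tuples in a function body."""
    if len(opcodes) < k:
        # A function shorter than k is treated as a single, shorter gram.
        return {tuple(opcodes)} if opcodes else set()
    return {tuple(opcodes[i:i + k]) for i in range(len(opcodes) - k + 1)}


def jaccard_similarity(a: List[str], b: List[str], k: int = 3) -> float:
    """Jaccard index of the two functions' opcode k-gram sets (0.0 to 1.0)."""
    grams_a, grams_b = opcode_kgrams(a, k), opcode_kgrams(b, k)
    if not grams_a and not grams_b:
        return 1.0  # two empty functions are trivially identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical opcode sequences for one function in two library versions.
# They differ in a single conditional jump (jz vs. jnz).
func_v1 = ["push", "mov", "sub", "mov", "call", "test", "jz", "mov", "leave", "ret"]
func_v2 = ["push", "mov", "sub", "mov", "call", "test", "jnz", "mov", "leave", "ret"]

score = jaccard_similarity(func_v1, func_v2)
print(f"similarity = {score:.2f}")
```

A set-based measure like this is cheap to compute, which suits the goal of analyzing in less time, but a single changed instruction affects up to k neighboring grams. Any threshold for calling two functions "the same" would need to be tuned on real data.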

Acknowledgement

Supported by: Institute for Information & communications Technology Promotion (IITP), National Security Research Institute (NSR)
