A Study on Selecting Key Opcodes for Malware Classification and Its Usefulness

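  • 박정빈 (Department of Computer and Software, Hanyang University) ;
  • 한경수 (Department of Computer and Software, Hanyang University) ;
  • 김태근 (Department of Computer and Software, Hanyang University) ;
  • 임을규 (Division of Computer Science and Engineering, Hanyang University)
  • Received : 2014.07.21
  • Reviewed : 2015.01.23
  • Published : 2015.05.15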

Abstract

The recent growth in the number of newly created malware samples and the diversity of malware variants greatly increases the time and effort that malware analysts spend on analysis. Effective malware classification therefore helps reduce that time and effort, and it can also be applied in other areas such as research on malware genealogy. In this paper, we propose a method that uses key opcodes for malware classification. Key opcodes are the opcodes that have a high influence on malware classification. Our experiments confirmed that the top 10 most influential opcodes can be selected as key opcodes, and using them reduced the training time of a supervised learning algorithm by about 91% while preserving classification accuracy. We expect this result to be applicable to future research on classifying large volumes of malware.
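The idea of keeping only the most influential opcodes can be illustrated with a small sketch. The abstract does not state which ranking criterion or which supervised learner was used, so the Fisher-like score below (between-family variance divided by within-family variance of each opcode's relative frequency) is an assumption chosen for illustration, not the method of the paper. The function names and the toy opcode lists are hypothetical.

```python
from collections import Counter


def opcode_frequencies(opcodes, vocabulary):
    """Relative frequency of each vocabulary opcode within one sample."""
    counts = Counter(opcodes)
    total = len(opcodes) or 1  # avoid division by zero for empty samples
    return [counts[op] / total for op in vocabulary]


def fisher_scores(vectors, labels):
    """Assumed ranking criterion: between-family / within-family variance."""
    families = sorted(set(labels))
    scores = []
    for j in range(len(vectors[0])):
        column = [v[j] for v in vectors]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for family in families:
            values = [v[j] for v, y in zip(vectors, labels) if y == family]
            mean = sum(values) / len(values)
            between += len(values) * (mean - overall_mean) ** 2
            within += sum((x - mean) ** 2 for x in values)
        # Small constant keeps the ratio finite when within-family variance is 0.
        scores.append(between / (within + 1e-12))
    return scores


def select_key_opcodes(samples, labels, k=10):
    """Return the k opcodes with the highest score (k=10 mirrors the abstract)."""
    vocabulary = sorted({op for sample in samples for op in sample})
    vectors = [opcode_frequencies(sample, vocabulary) for sample in samples]
    scores = fisher_scores(vectors, labels)
    ranked = sorted(zip(vocabulary, scores), key=lambda pair: pair[1], reverse=True)
    return [op for op, _ in ranked[:k]]


# Hypothetical toy data: two samples per family, each a disassembled opcode list.
samples = [
    ["mov", "push", "call", "xor", "xor", "mov"],
    ["mov", "push", "xor", "xor", "xor", "call"],
    ["mov", "push", "call", "jmp", "jmp", "mov"],
    ["mov", "push", "jmp", "jmp", "jmp", "call"],
]
labels = ["family_a", "family_a", "family_b", "family_b"]

key_opcodes = select_key_opcodes(samples, labels, k=2)

# Reduced feature vectors: only the key opcodes are passed on to the classifier.
reduced = [opcode_frequencies(sample, key_opcodes) for sample in samples]
```

In this toy data only `xor` and `jmp` differ between the two families, so they are the opcodes kept. A classifier trained on `reduced` sees two features per sample instead of five, which is the kind of dimensionality reduction that would shorten training time.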

Project Information

Host institution for the research project: National IT Industry Promotion Agency (NIPA)
