Classifying Malicious Web Pages by Using an Adaptive Support Vector Machine

  • Hwang, Young Sup (Dept. of Computer Science and Engineering, Sun Moon University)
  • Kwon, Jin Baek (Dept. of Computer Science and Engineering, Sun Moon University)
  • Moon, Jae Chan (Dept. of Computer Science and Software Science, Dankook University)
  • Cho, Seong Je (Dept. of Computer Science and Software Science, Dankook University)
  • Received: 2012.05.11
  • Accepted: 2012.08.02
  • Published: 2013.09.30

Abstract

To classify a web page as benign or malicious, we designed 14 basic and 16 extended features. The basic features were selected to represent the essential characteristics of a web page. Each extended feature heuristically combines two basic features to better distinguish benign pages from malicious ones. A support vector machine (SVM) can be trained on these features to classify pages successfully. However, because new malicious web pages appear constantly and change rapidly, a classifier trained on old data may misclassify new pages. To overcome this problem, we chose an adaptive support vector machine (aSVM) as the classifier. After its initial training, the aSVM can quickly learn additional training data by building on the support vectors obtained in its previous learning session. Experimental results verified that the aSVM can classify malicious web pages adaptively.
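The retraining idea in the abstract (keep the support vectors from the previous session, then learn only from them plus the newly collected pages) can be illustrated with a small sketch. This is not the authors' aSVM: the abstract does not give its formulation, its features, or its data. The sketch assumes a plain linear soft-margin SVM trained by dual coordinate descent, two hypothetical numeric features per page, and made-up toy points. The names `train_svm`, `support_vectors`, and `adapt` are illustrative only.

```python
# Illustrative sketch only; NOT the authors' aSVM implementation.
# Assumptions: linear SVM, two hypothetical features per page, toy data.

def train_svm(points, labels, C=100.0, passes=2000):
    """Linear soft-margin SVM trained by dual coordinate descent.

    The bias is folded in as a constant extra feature. Returns (w, alphas);
    samples whose alpha is positive are the support vectors.
    """
    xs = [list(p) + [1.0] for p in points]
    w = [0.0] * len(xs[0])
    alphas = [0.0] * len(xs)
    for _ in range(passes):
        for i, (x, y) in enumerate(zip(xs, labels)):
            # Gradient of the dual objective with respect to alpha_i.
            grad = y * sum(wj * xj for wj, xj in zip(w, x)) - 1.0
            q_ii = sum(xj * xj for xj in x)
            # Exact one-variable minimization, clipped to the box [0, C].
            new_alpha = min(max(alphas[i] - grad / q_ii, 0.0), C)
            step = (new_alpha - alphas[i]) * y
            w = [wj + step * xj for wj, xj in zip(w, x)]
            alphas[i] = new_alpha
    return w, alphas


def predict(w, point):
    """Return +1 (malicious) or -1 (benign) for one feature vector."""
    score = sum(wj * xj for wj, xj in zip(w, list(point) + [1.0]))
    return 1 if score >= 0 else -1


def support_vectors(points, labels, alphas, eps=1e-9):
    """Keep only the samples with a positive dual coefficient."""
    return [(p, y) for p, y, a in zip(points, labels, alphas) if a > eps]


def adapt(old_svs, new_points, new_labels):
    """Retrain on the retained support vectors plus the new samples only."""
    points = [p for p, _ in old_svs] + list(new_points)
    labels = [y for _, y in old_svs] + list(new_labels)
    return train_svm(points, labels)


# First learning session on "old" pages (hypothetical feature values).
old_points = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (1.0, 1.0),
              (2.5, 2.5), (3.5, 3.8), (3.8, 3.4), (4.0, 4.0)]
old_labels = [-1, -1, -1, -1, 1, 1, 1, 1]
w_old, alphas_old = train_svm(old_points, old_labels)
kept = support_vectors(old_points, old_labels, alphas_old)

# Later session: newer malicious pages have drifted toward the benign region.
new_points = [(1.65, 1.65), (2.2, 1.8), (0.8, 1.1)]
new_labels = [1, 1, -1]
w_new, _ = adapt(kept, new_points, new_labels)
```

In this toy setting the first classifier labels the drifted page `(1.65, 1.65)` as benign, while the adapted one, retrained on only the retained support vectors and the three new samples, labels it as malicious. Retaining only the support vectors keeps the second training set small, which is the intuition behind the "quickly learn additional training data" claim. In general this shortcut is an approximation of retraining on all data.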
