A Technique of Statistical Message Filtering for Blocking Spam Message

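  • 김성윤 (SW Specialized Graduate School, Soongsil University)
  • 차태수 (SW Specialized Graduate School, Soongsil University)
  • 박제원 (SW Specialized Graduate School, Soongsil University)
  • 최재현 (SW Specialized Graduate School, Soongsil University)
  • 이남용 (SW Specialized Graduate School, Soongsil University)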
  • Received: 2014.07.28
  • Reviewed: 2014.09.22
  • Published: 2014.09.30

Abstract

Spam messages received indiscriminately in the information society harm not only individuals but also the community as a whole. Many spam filtering techniques, such as character-based blocking, are being actively studied, and most of them are content-based filtering technologies built on machine learning. As spam transmission techniques evolve, spammers send messages using term spamming techniques. Such messages tend to contain a large number of nouns, repeat words, and insert special characters between the words of a sentence. In this paper, we parameterize these three features using the SPSS statistical program and derive an equation. Based on this equation, we then measure spam classification performance. Compared with previous studies, the proposed approach showed excellent performance in terms of FP-rate, further minimizing the associated cost.
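The three features named in the abstract, together with the FP-rate used for evaluation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the regular expression, and the caller-supplied `is_noun` predicate (standing in for a part-of-speech tagger) are all assumptions made for illustration. The equation derived with SPSS is not stated in the abstract, so it is not reproduced here.

```python
import re
from collections import Counter
from typing import Callable, Dict, Sequence


def noun_ratio(tokens: Sequence[str], is_noun: Callable[[str], bool]) -> float:
    """Fraction of tokens judged to be nouns by a caller-supplied predicate.

    `is_noun` is a hypothetical hook; a real system would plug in a
    part-of-speech tagger suited to the message language.
    """
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if is_noun(t)) / len(tokens)


def repeated_word_ratio(tokens: Sequence[str]) -> float:
    """Fraction of tokens that repeat an earlier token (case-insensitive)."""
    if not tokens:
        return 0.0
    counts = Counter(t.lower() for t in tokens)
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(tokens)


# Assumed pattern: a run of non-word, non-space characters that has a word
# character on both sides (optionally separated by spaces), e.g. "free*loan".
_SPECIAL_BETWEEN = re.compile(r"(?<=\w)\s*[^\w\s]+\s*(?=\w)")


def special_char_count(text: str) -> int:
    """Number of special-character runs inserted between word characters."""
    return len(_SPECIAL_BETWEEN.findall(text))


def extract_features(text: str, is_noun: Callable[[str], bool]) -> Dict[str, float]:
    """Collect the three features using naive whitespace tokenization."""
    tokens = text.split()
    return {
        "noun_ratio": noun_ratio(tokens, is_noun),
        "repeated_word_ratio": repeated_word_ratio(tokens),
        "special_char_count": float(special_char_count(text)),
    }


def fp_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """False-positive rate FP / (FP + TN), with 1 = spam and 0 = legitimate."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0
```

A low FP-rate means few legitimate messages are wrongly blocked, which is the cost the abstract says the proposed approach reduces.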
