Improved Spam Filter via Handling of Text Embedded Image E-mail
 Authors
Youn, Seongwook; Cho, Hyun-Chong
 Abstract
Image spam, in which the message text is embedded in an attached image to defeat spam filtering techniques, is on the rise and has become a major problem for current e-mail systems. For nearly a decade, content-based filtering using text classification or machine learning has been the dominant approach in anti-spam filtering systems. More recently, spammers have tried to defeat anti-spam filters with a variety of techniques, and embedding text in an attached image is one of them. We previously proposed an ontology-based spam filter. However, that system handles only text e-mail, while the percentage of e-mail containing attached images is increasing sharply. The contribution of this paper is to add image e-mail handling capability to the anti-spam filtering system while keeping the advantages of the previous text-based spam filtering system. The proposed system also yields a low false negative rate, which means that a user's valuable e-mail is rarely regarded as spam.
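The general idea of routing image attachments through OCR and then into an existing text-based filter can be sketched as follows. This is only an illustrative outline under stated assumptions, not the authors' implementation: the `ocr` callable stands in for whatever OCR engine is used, and `keyword_classifier` is a hypothetical placeholder for the text classifier (the paper's ontology-based filter is not reproduced here). Only the Python standard library `email` package is used.

```python
from email.message import EmailMessage
from typing import Callable


def collect_text(msg: EmailMessage, ocr: Callable[[bytes], str]) -> str:
    """Gather body text plus text recovered from image attachments.

    `ocr` is an assumed, caller-supplied function mapping raw image
    bytes to the text found in the image.
    """
    pieces = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        maintype = part.get_content_maintype()
        if maintype == "text":
            pieces.append(part.get_content())
        elif maintype == "image":
            pieces.append(ocr(part.get_payload(decode=True)))
    return "\n".join(pieces)


def keyword_classifier(text: str) -> bool:
    """Hypothetical stand-in for a real text-based spam classifier."""
    keywords = ("cheap pills", "buy now")
    lowered = text.lower()
    return any(k in lowered for k in keywords)


def is_spam(
    msg: EmailMessage,
    ocr: Callable[[bytes], str],
    classify_text: Callable[[str], bool],
) -> bool:
    """Classify a message using body text and OCR-recovered image text."""
    return classify_text(collect_text(msg, ocr))
```

Because the OCR step and the classifier are passed in as functions, the text-based filter can stay unchanged while image handling is layered on top, which mirrors the design goal stated in the abstract.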
 Keywords
E-mail classification; OCR; Ontology; Spam filtering
 Language
English