The Exploratory Analysis for Spam Mail Data Using Correspondence Analysis

  • Published : 2005.11.30

Abstract

The volume of electronic mail (e-mail) has increased dramatically with the expansion of the internet and information technology. Although e-mail offers many conveniences, it also causes serious problems. One of them is spam mail, which is unsolicited mail and is also called bulk mail. This paper presents a set of patterns of spam-mail occurrence within a week, found using correspondence analysis. Correspondence analysis is an exploratory multivariate technique that converts a data table into a particular type of graphical display in which the rows and columns are depicted as points. One meaningful pattern is a large increase in adult and phishing-related spam mail at weekends, so spam-mail filters should be designed to cope with this pattern.
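To make the description of correspondence analysis concrete, the following is a minimal sketch of the standard computation: a singular value decomposition of the standardized residuals of a contingency table, which yields coordinates for the row and column points. It is not the paper's own code. The spam categories, the day labels, and every count in the table are invented for illustration and are not the paper's data, and the function name `correspondence_analysis` is hypothetical.

```python
import numpy as np


def correspondence_analysis(table, n_dims=2):
    """Simple correspondence analysis of a two-way contingency table.

    Returns row and column principal coordinates (first n_dims axes)
    and the principal inertias (squared singular values) of all axes.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)            # proportions under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U * sigma) / np.sqrt(r)[:, None]     # row principal coordinates
    cols = (Vt.T * sigma) / np.sqrt(c)[:, None]  # column principal coordinates
    return rows[:, :n_dims], cols[:, :n_dims], sigma ** 2


# Hypothetical data: spam counts by category (rows) and weekday (columns).
# These numbers are made up for the example; they are NOT from the paper.
categories = ["adult", "phishing", "finance", "other"]
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
counts = np.array([
    [30, 28, 32, 29, 31, 60, 65],
    [12, 14, 11, 13, 12, 25, 28],
    [40, 42, 39, 41, 38, 20, 18],
    [50, 48, 51, 49, 52, 45, 44],
])

row_coords, col_coords, inertia = correspondence_analysis(counts)

# Share of total inertia explained by each axis.
print("inertia share per axis:", np.round(inertia / inertia.sum(), 3))
for name, xy in zip(categories, row_coords):
    print(f"row {name:9s}", np.round(xy, 3))
for name, xy in zip(days, col_coords):
    print(f"col {name:9s}", np.round(xy, 3))
```

Plotting `row_coords` and `col_coords` on the same axes gives the kind of display the abstract refers to: categories and days that lie in the same direction from the origin occur together more often than independence would predict. Principal coordinates for both rows and columns are one common scaling choice; other scalings exist and change how distances between row and column points may be read.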
