A Three-Step Preprocessing Algorithm for Enhanced Classification of E-Mail Recommendation System

Jeong Ok-Ran; Cho Dong-Sub

The Transactions of the Korean Institute of Electrical Engineers D (대한전기학회논문지:시스템및제어부문D)

Volume 54, Issue 4 / Pages 251-258 / 2005 / pISSN 1229-6287

The Korean Institute of Electrical Engineers (대한전기학회)

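Cho Dong-Sub (Department of Computer Science, College of Engineering, Ewha Womans University);
Jeong Ok-Ran (Department of Computer Science, College of Engineering, Ewha Womans University)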

Published : 2005.04.01

Abstract

The results of automatic document classification can differ significantly depending on the characteristics of the documents being classified, as well as on the classifier's performance. This research identifies the characteristics of e-mail documents and applies a three-step preprocessing algorithm that minimizes their atypical characteristics. In the first stage, an uncertainty-based sampling algorithm using Mean Absolute Deviation (MAD) addresses the problem of selecting training documents for rule generation at classification time. In the second stage, a per-attribute weight assignment method is applied to increase the discriminating power of terms that appear in the title, reflecting the characteristics of e-mail documents. In the third and last stage, classification accuracy for each category is increased by using the dynamic threshold of the Naive Bayesian algorithm. Finally, we implemented an E-Mail Recommendation System using the three-step preprocessing algorithm, which recommends the applicable category when a mail arrives so that users can classify it directly and optimally.
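The abstract gives no formulas, so the following is only a minimal Python sketch of how the three stages might fit together, not the authors' implementation. Every name here (`select_uncertain`, `weighted_term_counts`, `recommend`, the `title_weight` value, and the default threshold of 0.5) is hypothetical. It assumes that a low MAD across class posteriors signals an uncertain document, that title terms receive a fixed extra weight, and that per-category thresholds are supplied by the caller.

```python
from collections import Counter


def mean_absolute_deviation(probs):
    """Mean absolute deviation of class probabilities around their mean."""
    mean = sum(probs) / len(probs)
    return sum(abs(p - mean) for p in probs) / len(probs)


def select_uncertain(posteriors, k):
    """Stage 1 (assumed reading): pick the k documents with the lowest MAD.

    `posteriors` maps a document id to its list of class probabilities.
    A flat distribution (low MAD) is treated as high uncertainty.
    """
    ranked = sorted(
        posteriors,
        key=lambda doc_id: mean_absolute_deviation(posteriors[doc_id]),
    )
    return ranked[:k]


def weighted_term_counts(title, body, title_weight=2.0):
    """Stage 2 (assumed reading): count terms, weighting title terms more."""
    counts = Counter()
    for term in body.lower().split():
        counts[term] += 1.0
    for term in title.lower().split():
        counts[term] += title_weight  # hypothetical weight for title terms
    return counts


def recommend(posterior_by_category, thresholds, default_threshold=0.5):
    """Stage 3 (assumed reading): recommend the best category only if its
    posterior reaches that category's threshold; otherwise return None."""
    category, score = max(posterior_by_category.items(), key=lambda kv: kv[1])
    if score >= thresholds.get(category, default_threshold):
        return category
    return None
```

In this sketch, the documents returned by `select_uncertain` would be labeled and added to the training set, the weighted counts would feed a Naive Bayes classifier, and `recommend` would decide whether a category is suggested for an incoming mail.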
