Evaluation of Attribute Selection Methods and Prior Discretization in Supervised Learning
Cha, Woon Ock; Huh, Moon Yul
 Abstract
We evaluated the effect of attribute selection and prior discretization on supervised learning, using C4.5 and Naive Bayes as the classifiers. Three databases were obtained from the UCI data archive; in each, all attributes are continuous except for a single decision attribute. Four attribute selection methods were used: MDI, ReliefF, Gain Ratio, and a consistency-based method. MDI and ReliefF can be applied to both continuous and discrete attributes, whereas the other two methods can be applied only to discrete attributes. Discretization was performed using the Fayyad and Irani method. To investigate the effect of noise, noise was introduced into the data sets at levels of 10% or 20%, and both the noisy and the noise-free data were then processed through the steps of attribute selection, discretization, and classification. The results indicate that classification based on the selected attributes yields higher accuracy than classification based on the full attribute set, and that prior discretization does not lower accuracy.
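As a rough illustration of two steps named in the abstract, the sketch below computes the Gain Ratio of a single discrete attribute (information gain divided by the entropy of the attribute itself) and injects noise at a given rate. It is not the authors' code. The function names are hypothetical, and the noise model (flipping class labels to a different class) is an assumption, since the abstract does not say how the noise was introduced.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(attribute, labels):
    """Gain ratio of one discrete attribute with respect to the class labels."""
    n = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for a, y in zip(attribute, labels):
        groups.setdefault(a, []).append(y)
    # Expected class entropy after splitting on the attribute.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(attribute)
    # An attribute with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


def add_label_noise(labels, rate, seed=0):
    """Flip a fraction `rate` of labels to a different class.

    Assumed noise model for illustration only; requires at least two classes.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    noisy = list(labels)
    k = round(rate * len(labels))
    for i in rng.sample(range(len(labels)), k):
        noisy[i] = rng.choice([c for c in classes if c != labels[i]])
    return noisy


if __name__ == "__main__":
    x = ["low", "low", "high", "high"]   # already-discretized attribute
    y = ["a", "a", "b", "b"]             # decision attribute
    print(gain_ratio(x, y))
    print(add_label_noise(["a"] * 5 + ["b"] * 5, rate=0.2))
```

Because Gain Ratio is defined over discrete values, a continuous attribute would have to be discretized first, which is consistent with the abstract's remark that this method applies only to discrete attributes.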
 Keywords
attribute selection; discretization; classification
 Language
Korean
 Cited by
1. Discretization Method Based on Quantiles for Variable Selection Using Mutual Information, Communications for Statistical Applications and Methods, 2005, vol. 12, no. 3, pp. 659-672.