Evaluation of Attribute Selection Methods and Prior Discretization in Supervised Learning Cha, Woon Ock; Huh, Moon Yul;
We evaluated the efficiencies of applying attribute selection methods and prior discretization to supervised learning, modelled by C4.5 and Naive Bayes. Three databases were obtained from UCI data archive, which consisted of continuous attributes except for one decision attribute. Four methods were used for attribute selection : MDI, ReliefF, Gain Ratio and Consistency-based method. MDI and ReliefF can be used for both continuous and discrete attributes, but the other two methods can be used only for discrete attributes. Discretization was performed using the Fayyad and Irani method. To investigate the effect of noise included in the database, noises were introduced into the data sets up to the extents of 10 or 20%, and then the data, including those either containing the noises or not, were processed through the steps of attribute selection, discretization and classification. The results of this study indicate that classification of the data based on selected attributes yields higher accuracy than in the case of classifying the full data set, and prior discretization does not lower the accuracy.