ModifiedFAST: A New Optimal Feature Subset Selection Algorithm
Nagpal, Arpita; Gaur, Deepti
Feature subset selection is a pre-processing step in learning algorithms. In this paper, we propose an efficient algorithm, ModifiedFAST, for feature subset selection. The algorithm is suitable for text datasets and uses the concept of information gain to remove irrelevant and redundant features. We find a new optimal value of the symmetric uncertainty threshold used to identify relevant features; the thresholds used by previous feature selection algorithms such as FAST, Relief, and CFS were not optimal. It has been proven that the threshold value greatly affects the percentage of selected features and the classification accuracy. A new unified performance metric that combines accuracy and the number of selected features is proposed and applied in the algorithm. Experiments show that the percentage of features selected by the proposed algorithm is lower than that of existing algorithms on most of the datasets. The effectiveness of our algorithm at the optimal threshold was statistically validated against other algorithms.
Keywords: Entropy; Feature selection; Filter model; Graph-based clustering; Mutual information
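The abstract relies on symmetric uncertainty (SU) with a threshold to decide which features are relevant. As a rough illustration only, the sketch below uses the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) for discrete variables. The function names, the toy data, and the threshold value of 0.1 are assumptions made for this example; they are not the ModifiedFAST implementation or the optimal threshold reported in the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so nothing can be shared
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # information gain I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def relevant_features(features, labels, threshold):
    """Keep features whose SU with the class label exceeds the threshold."""
    return [name for name, column in features.items()
            if symmetric_uncertainty(column, labels) > threshold]


# Toy data (hypothetical): one feature mirrors the label, one is independent of it.
labels = [0, 0, 1, 1]
features = {
    "copy_of_label": [0, 0, 1, 1],
    "independent": [0, 1, 0, 1],
}

# The threshold here is a placeholder, not the value found in the paper.
selected = relevant_features(features, labels, threshold=0.1)
print(selected)
```

On this toy data the feature that mirrors the label has SU equal to 1 and the independent feature has SU equal to 0, so only the former passes the filter. Raising or lowering the threshold changes how many features survive, which is the trade-off the abstract describes between the percentage of selected features and classification accuracy.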