Variable Selection Based on Mutual Information
Huh, Moon-Y.; Choi, Byong-Su
 Abstract
A best subset selection procedure is proposed, based on the mutual information (MI) between a set of explanatory variables and a dependent class variable. The multivariate MI is derived from normal mixtures, and several types of normal mixtures are proposed together with a best subset selection algorithm. Four real data sets are used to demonstrate the efficiency of the proposals.
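As a rough illustration of the idea in the abstract, the sketch below estimates the MI between a set of continuous explanatory variables and a class variable, then searches exhaustively for the size-k subset with the largest estimate. It is not the authors' algorithm: it assumes the simplest possible normal mixture (one full-covariance Gaussian per class), uses a plug-in estimate I(X;Y) = E[log p(x|y) - log p(x)] evaluated on the training sample, and the names `gaussian_logpdf`, `mixture_mi`, `best_subset` and the `ridge` parameter are hypothetical helpers introduced only for this example. The data are synthetic.

```python
import itertools
import numpy as np


def gaussian_logpdf(X, mean, cov):
    """Log density of N(mean, cov) evaluated at each row of X."""
    d = X.shape[1]
    chol = np.linalg.cholesky(cov)                 # cov = L L^T
    z = np.linalg.solve(chol, (X - mean).T)        # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z ** 2, axis=0))


def mixture_mi(X, y, ridge=1e-6):
    """Plug-in estimate of I(X; Y) in nats, assuming one Gaussian per class.

    `ridge` is an assumed small diagonal term that keeps covariances invertible.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / n
    log_cond = np.empty((n, len(classes)))         # log p(x_i | class j)
    for j, c in enumerate(classes):
        Xc = X[y == c]
        cov = np.atleast_2d(np.cov(Xc, rowvar=False)) + ridge * np.eye(d)
        log_cond[:, j] = gaussian_logpdf(X, Xc.mean(axis=0), cov)
    # log p(x) = log sum_c prior_c * p(x | c), computed stably
    log_joint = log_cond + np.log(priors)
    m = log_joint.max(axis=1, keepdims=True)
    log_marg = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))
    # average of log p(x_i | y_i) - log p(x_i) over the sample
    own = log_cond[np.arange(n), np.searchsorted(classes, y)]
    return float(np.mean(own - log_marg))


def best_subset(X, y, k):
    """Exhaustively score every size-k column subset and return the best one."""
    X = np.asarray(X, dtype=float)
    scored = ((mixture_mi(X[:, list(s)], y), s)
              for s in itertools.combinations(range(X.shape[1]), k))
    score, subset = max(scored)
    return subset, score


# Synthetic example: only column 2 depends on the class label.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=400)
X = rng.normal(size=(400, 4))
X[:, 2] += 3.0 * y
subset, score = best_subset(X, y, k=1)
print(subset, round(score, 3))
```

Exhaustive search is only practical for a small number of variables, since the number of candidate subsets grows combinatorially. With richer mixtures (several components per class) the same MI expression applies, but the component densities would have to be fitted, for example by EM.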
 Keywords
Best subset selection; feature selection; mutual information; normal mixture
 Language
English