Variable Selection Based on Mutual Information

Huh, Moon-Y.; Choi, Byong-Su

doi:10.5351/CKSS.2009.16.1.143

Communications for Statistical Applications and Methods

Volume 16, Issue 1, Pages 143-155, 2009

pISSN 2287-7843, eISSN 2383-4757

The Korean Statistical Society (한국통계학회)

Huh, Moon-Y. (Dept. of Statistics, Sungkyunkwan Univ.) ;
Choi, Byong-Su (Dept. of Multimedia Engineering, Hansung Univ.)

Published: 2009.01.31

https://doi.org/10.5351/CKSS.2009.16.1.143

Abstract

We propose a best subset selection procedure based on the mutual information (MI) between a set of explanatory variables and a dependent class variable. The multivariate MI is derived from normal mixture models, and several types of normal mixtures are proposed. A best subset selection algorithm is also proposed. Four real data sets are used to demonstrate the efficiency of the proposals.
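As a rough illustration of the idea in the abstract, the sketch below estimates the MI between a set of continuous explanatory variables and a class variable by modelling the explanatory variables as a normal mixture with one Gaussian component per class, then uses that estimate to pick variables. Several things here are assumptions and not taken from the paper: the one-component-per-class mixture, the resubstitution estimator, the small ridge term, the greedy forward search (a simple stand-in for a best subset search), and the function names `mixture_mi` and `forward_select`.

```python
import numpy as np

def gaussian_logpdf(X, mean, cov):
    """Log density of N(mean, cov) evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def mixture_mi(X, y, ridge=1e-6):
    """Resubstitution estimate of I(X; C) in nats.

    Assumes X | C = c is Gaussian, so the marginal of X is a normal
    mixture with the class proportions as mixing weights.
    """
    y = np.asarray(y)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / len(y)
    d = X.shape[1]
    log_joint = np.empty((len(y), len(classes)))
    for k, c in enumerate(classes):
        Xc = X[y == c]
        # ridge keeps the covariance invertible (illustrative choice)
        cov = np.atleast_2d(np.cov(Xc, rowvar=False)) + ridge * np.eye(d)
        log_joint[:, k] = np.log(priors[k]) + gaussian_logpdf(X, Xc.mean(axis=0), cov)
    # log f(x_i): log-sum-exp over the mixture components
    m = log_joint.max(axis=1, keepdims=True)
    log_marginal = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))
    # log f(x_i | c_i) for each observation's own class
    idx = np.searchsorted(classes, y)
    log_cond = log_joint[np.arange(len(y)), idx] - np.log(priors[idx])
    # I(X; C) = E[ log f(x | c) - log f(x) ], averaged over the sample
    return float(np.mean(log_cond - log_marginal))

def forward_select(X, y, k):
    """Greedy forward search: add the variable that most increases the MI estimate."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < k:
        scores = {j: mixture_mi(X[:, selected + [j]], y) for j in remaining}
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic demo: only column 0 depends on the class label.
rng = np.random.default_rng(0)
y_demo = np.repeat([0, 1], 200)
X_demo = rng.normal(size=(400, 3))
X_demo[:, 0] += 3.0 * y_demo
print("selected:", forward_select(X_demo, y_demo, 2))
print("MI of column 0:", mixture_mi(X_demo[:, [0]], y_demo))
```

A greedy search evaluates far fewer subsets than an exhaustive best subset search, so it may miss variables that are informative only jointly; it is used here only to keep the example short.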
