k-Nearest Neighbor-Based Approach for the Estimation of Mutual Information


  • Cha, Woon-Ock (Department of Multimedia Engineering, Hansung University)
  • Huh, Moon-Yul (Department of Statistics, Sungkyunkwan University)
  • Published: 2008.11.30


This study examines the k-nearest neighbor-based approach for estimating mutual information when the target variable is either categorical or continuous. The results of Monte Carlo simulations and experiments with real-world data show that k=1 is preferable. The study also shows that, in practical applications with real-world data, jittering and bootstrapping are needed.
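As a concrete illustration of the k-nearest neighbor approach, the sketch below implements a basic version of the Kraskov–Stögbauer–Grassberger (KSG) estimator for two continuous variables (reference 9 below), with k=1 as the default the abstract recommends. The function names and the brute-force O(n²) neighbor search are choices made for this sketch, not the authors' implementation; the paper's categorical-target variant and its jittering/bootstrap procedures are not reproduced here.

```python
import math

def digamma(x):
    """Digamma function via the recurrence psi(x) = psi(x+1) - 1/x
    followed by an asymptotic series (accurate enough for this sketch)."""
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return r + math.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))

def ksg_mi(xs, ys, k=1):
    """KSG estimate of I(X;Y) for 1-D continuous samples xs, ys.

    For each point, find the distance eps to its k-th nearest neighbor
    in the joint space under the max-norm, then count marginal neighbors
    strictly within eps.  Ties (e.g. from discretized data) should be
    broken by jittering the samples with small noise beforehand, as the
    abstract notes.
    """
    n = len(xs)
    est = digamma(k) + digamma(n)
    total = 0.0
    for i in range(n):
        # distance to the k-th nearest neighbor in the joint (x, y) space
        eps = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
            for j in range(n) if j != i
        )[k - 1]
        # marginal neighbor counts strictly inside eps
        nx = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma(nx + 1) + digamma(ny + 1)
    return est - total / n
```

For correlated Gaussian data the estimate should approach the true value −½·log(1−ρ²), and it should be near zero for independent samples; a tree-based neighbor search would replace the brute-force loop for large n.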


  1. Huh, M. Y. and Cha, W. O. (2008). Estimation of mutual information by the sample-spacing method, The Korean Journal of Applied Statistics, 21, 301-312
  2. Beirlant, J., Dudewicz, E. J., Györfi, L. and van der Meulen, E. (1997). Nonparametric entropy estimation: An overview, International Journal of Mathematical and Statistical Sciences, 6, 17-39
  3. Blake, C. and Merz, C. J. (1998). UCI machine learning repository,
  4. Brillinger, D. R. (2004). Some data analyses using mutual information, Brazilian Journal of Probability and Statistics, 18, 163-183
  5. Cha, W. O. and Huh, M. Y. (2005). Discretization method based on quantiles for variable selection using mutual information, Communications of the Korean Statistical Society, 12, 659-672
  6. Cover, T. M. and Thomas, J. A. (1991). Elements of Information Theory, John Wiley & Sons, New York
  7. Huh, M. Y. (2005). DAVIS(Data visualization system),
  8. Kononenko, I. (1994). Estimating attributes: Analysis and extensions of RELIEF, In Proceedings of European Conference on Machine Learning, 171-182
  9. Kraskov, A., Stögbauer, H. and Grassberger, P. (2004). Estimating mutual information, Physical Review E, 69, 066138
  10. Lazo, A. V. and Rathie, P. (1978). On the entropy of continuous probability distributions, IEEE Transactions on Information Theory, 24, 120-122
  11. Miller, E. G. L. and Fisher III, J. W. (2003). ICA using spacings estimation of entropy, The Journal of Machine Learning Research, 4, 1271-1295
  12. Stögbauer, H., Kraskov, A., Astakhov, S. A. and Grassberger, P. (2004). Least dependent component analysis based on mutual information, Physical Review E, 70, 066123