k-Nearest Neighbor-Based Approach for the Estimation of Mutual Information

상호정보 추정을 위한 k-최근접이웃 기반방법

  • Cha, Woon-Ock (Department of Multimedia Engineering, Hansung University) ;
  • Huh, Moon-Yul (Department of Statistics, Sungkyunkwan University)
  • Published : 2008.11.30

Abstract

This study investigates a k-nearest neighbor-based approach to estimating mutual information when the target variable is categorical or continuous. Results from Monte Carlo simulation and from experiments with real-world data show that k = 1 is preferable. Our study also shows that, in practical applications with real-world data, jittering and bootstrapping are needed.
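The estimator the abstract refers to can be illustrated for the case of a continuous variable and a categorical target. The sketch below follows the Kraskov et al. (2004) digamma-based construction adapted to a discrete target: for each point, find the distance to its k-th nearest neighbor within the same class, count how many points of the full sample fall inside that radius, and combine the counts through digamma terms. The function name `knn_mi` and the optional `jitter` argument are our own illustrative choices, not the authors' code; the jitter option reflects the abstract's recommendation to break ties in real-world data.

```python
import numpy as np
from scipy.special import digamma

def knn_mi(x, y, k=1, jitter=0.0, seed=0):
    """Sketch of a k-NN estimator of I(X; Y) for a continuous x and a
    categorical target y (Kraskov-style digamma construction).

    Assumes every class contains more than k points. `jitter` adds tiny
    noise to break ties, as recommended for real-world data with
    repeated values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    n = len(x)
    if jitter > 0:
        x = x + np.random.default_rng(seed).normal(0.0, jitter, n)
    total = digamma(n) + digamma(k)
    for i in range(n):
        same = np.abs(x[y == y[i]] - x[i])   # distances within the same class
        same.sort()
        eps = same[k]        # k-th nearest same-class neighbour (same[0] is the point itself)
        n_y = same.size      # class size for this point's label
        # neighbours within eps in the full sample, excluding the point itself
        m_i = np.count_nonzero(np.abs(x - x[i]) < eps) - 1
        total -= (digamma(n_y) + digamma(max(m_i, 1))) / n
    return max(total, 0.0)   # the raw estimate can dip slightly below zero
```

With two well-separated Gaussian classes the estimate is clearly positive, while for a variable independent of the class it is near zero; the paper's simulations indicate that k = 1 is the preferable choice.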

References

  1. Huh, M. Y. and Cha, W. O. (2008). Estimation of mutual information by the sample-spacing method, The Korean Journal of Applied Statistics, 21, 301-312 https://doi.org/10.5351/KJAS.2008.21.2.301
  2. Beirlant, J., Dudewicz, E. J., Györfi, L. and van der Meulen, E. C. (1997). Nonparametric entropy estimation: An overview, International Journal of Mathematical and Statistical Sciences, 6, 17-39
  3. Blake, C. and Merz, C. J. (1998). UCI machine learning repository, http://www.ics.uci.edu/mlearn/MLRepository
  4. Brillinger, D. R. (2004). Some data analyses using mutual information, Brazilian Journal of Probability and Statistics, 18, 163-183
  5. Cha, W. O. and Huh, M. Y. (2005). Discretization method based on quantiles for variable selection using mutual information, Communications of the Korean Statistical Society, 12, 659-672 https://doi.org/10.5351/CKSS.2005.12.3.659
  6. Cover, T. M. and Thomas, J. A. (1991). Elements of Information Theory, John Wiley & Sons, New York
  7. Huh, M. Y. (2005). DAVIS(Data visualization system), http://stat.skku.ac.kr/myhuh/DAVIS.html
  8. Kononenko, I. (1994). Estimating attributes: Analysis and extensions of RELIEF, In Proceedings of European Conference on Machine Learning, 171-182
  9. Kraskov, A., Stögbauer, H. and Grassberger, P. (2004). Estimating mutual information, Physical Review E, 69, 066138
  10. Lazo, A. V. and Rathie, P. (1978). On the entropy of continuous probability distributions, IEEE Transactions on Information Theory, 24, 120-122 https://doi.org/10.1109/TIT.1978.1055832
  11. Miller, E. G. L. and Fisher III, J. W. (2003). ICA using spacings estimation of entropy, The Journal of Machine Learning Research, 4, 1271-1295 https://doi.org/10.1162/jmlr.2003.4.7-8.1271
  12. Stögbauer, H., Kraskov, A., Astakhov, S. A. and Grassberger, P. (2004). Least dependent component analysis based on mutual information, Physical Review E, 70, 066123