Mutual Information and Redundancy for Categorical Data
Hong, Chong-Sun; Kim, Beom-Jun
Most methods for describing the relationship among random variables require specific probability distributions and assumptions about the random variables. The mutual information, which is based on entropy, measures the dependency among random variables without requiring such assumptions. The redundancy, an analogous version of the mutual information, has also been proposed. In this paper, the redundancy and the mutual information are extended to multi-dimensional categorical data. It is found that the redundancy for categorical data can be expressed as a function of the generalized likelihood ratio statistic under several kinds of independence log-linear models, so that the redundancy can also be used to analyze contingency tables. Whereas the generalized likelihood ratio statistic for testing the goodness of fit of a log-linear model is sensitive to the sample size, the redundancy for categorical data does not depend on the sample size but only on the cell probabilities themselves.
Keywords: Entropy; Goodness of fit; Joint independence; Log-linear model; Redundancy
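
The relationship described in the abstract can be illustrated numerically. The sketch below is a minimal Python/NumPy example using the standard information-theoretic definitions (entropy in nats, redundancy as the sum of the marginal entropies minus the joint entropy, which reduces to the mutual information for a two-way table); the paper's exact formulation may differ, and the 2x3 table of counts is purely hypothetical. Under the mutual-independence log-linear model, the generalized likelihood ratio statistic G^2 equals 2n times the estimated redundancy, so dividing by 2n removes the sample-size dependence and leaves a quantity that depends only on the cell proportions.

import numpy as np

def entropy(p):
    # Shannon entropy in nats; zero cells contribute nothing.
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def redundancy(table):
    # Redundancy of a k-way table: sum of the marginal entropies minus the
    # joint entropy. For a two-way table this is the ordinary mutual information.
    p = np.asarray(table, dtype=float)
    p = p / p.sum()  # accepts counts or probabilities
    axes = range(p.ndim)
    marginals = [p.sum(axis=tuple(j for j in axes if j != i)) for i in axes]
    return sum(entropy(m) for m in marginals) - entropy(p)

def g_squared_independence(counts):
    # Likelihood-ratio statistic G^2 for the mutual-independence log-linear
    # model fitted to a k-way table of observed counts.
    o = np.asarray(counts, dtype=float)
    n = o.sum()
    p = o / n
    expected = np.ones_like(p)
    for i in range(p.ndim):
        marg = p.sum(axis=tuple(j for j in range(p.ndim) if j != i))
        shape = [1] * p.ndim
        shape[i] = -1
        expected = expected * marg.reshape(shape)
    expected = n * expected           # expected counts under mutual independence
    mask = o > 0
    return 2.0 * np.sum(o[mask] * np.log(o[mask] / expected[mask]))

# Hypothetical 2x3 contingency table of observed counts.
counts = np.array([[30, 20, 10],
                   [10, 20, 30]])
n = counts.sum()
R = redundancy(counts)               # depends only on the cell proportions
G2 = g_squared_independence(counts)  # grows with the sample size n
print(R, G2 / (2 * n))               # the two values agree: G^2 = 2n * R

Doubling every cell count doubles G^2 but leaves the redundancy unchanged, which is the sample-size behaviour the abstract contrasts.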