Mutual Information and Redundancy for Categorical Data
 Title & Authors
Hong, Chong-Sun; Kim, Beom-Jun
 Abstract
Most methods for describing the relationship among random variables require specific probability distributions and assumptions about the variables. The mutual information, which is based on entropy, measures the dependency among random variables without any such assumptions; the redundancy, an analogue of the mutual information, has also been proposed. In this paper, the redundancy and the mutual information are extended to multi-dimensional categorical data. We find that the redundancy for categorical data can be expressed as a function of the generalized likelihood ratio statistic under several kinds of independence log-linear models, so that the redundancy can also be used to analyze contingency tables. Whereas the generalized likelihood ratio statistic for testing the goodness-of-fit of a log-linear model is sensitive to the sample size, the redundancy for categorical data does not depend on the sample size but only on the cell probabilities themselves.
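The link between the mutual information and the likelihood ratio statistic described above can be illustrated for the simplest case, a two-way contingency table: with maximum likelihood cell estimates under the independence model, G² = 2n·I(X;Y), so G²/(2n) is free of the sample size. A minimal sketch in Python/NumPy (the example table is hypothetical, not taken from the paper):

```python
import numpy as np

def mutual_information(table):
    """Empirical mutual information (in nats) of a two-way contingency table."""
    p = table / table.sum()                # joint cell probabilities p_ij
    px = p.sum(axis=1, keepdims=True)      # row marginals p_i.
    py = p.sum(axis=0, keepdims=True)      # column marginals p_.j
    mask = p > 0                           # skip empty cells (0 * log 0 = 0)
    return float((p[mask] * np.log(p[mask] / (px * py)[mask])).sum())

def g_squared(table):
    """Generalized likelihood ratio statistic G^2 for the independence model."""
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    mask = table > 0
    return float(2 * (table[mask] * np.log(table[mask] / expected[mask])).sum())

table = np.array([[30.0, 10.0],
                  [20.0, 40.0]])
n = table.sum()
mi = mutual_information(table)
g2 = g_squared(table)
# G^2 = 2 * n * MI: the statistic scales with n, the mutual information does not.
assert abs(g2 - 2 * n * mi) < 1e-9
```

Doubling every cell count doubles G² but leaves the mutual information unchanged, which is the sample-size invariance the abstract points to.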
 Keywords
Entropy; Goodness of Fit; Joint Independence; Log-linear Model; Redundancy
 Language
Korean