Rule-Based Classification Analysis Using Entropy Distribution

- Journal title : Communications for Statistical Applications and Methods
- Volume 17, Issue 4, 2010, pp.527-540
- Publisher : The Korean Statistical Society
- DOI : 10.5351/CKSS.2010.17.4.527

Title & Authors

Rule-Based Classification Analysis Using Entropy Distribution

Lee, Jung-Jin; Park, Hae-Ki

Abstract

Rule-based classification analysis is widely used in large-scale data mining because it is easy to interpret and its algorithms are simple. To combine rules, such analyses frequently use a majority vote of the rules or a weighted combination of the rules based on their supports. In this paper, we propose a method that combines rules by using the multinomial distribution. The iterative proportional fitting algorithm is used to estimate the multinomial distribution that maximizes entropy subject to constraints on the rules' supports. Simulation experiments show that this method can compete with other well-known classification models when the two populations are similar.
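The fitting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything here is an assumption made for illustration: the features and the class label are binary, each rule's support is treated as the joint probability of its antecedent and its class, a class-prior constraint is added, and the names `fit_max_entropy` and `classify` are hypothetical. Starting from the uniform distribution, iterative proportional fitting rescales the cells covered by each constraint (and their complement) until every target is met, which yields the maximum entropy distribution when the constraints are consistent.

```python
import itertools
import numpy as np

def fit_max_entropy(n_vars, constraints, n_iter=1000, tol=1e-10):
    """Iterative proportional fitting over all 2**n_vars binary cells.

    constraints: list of (predicate, target) pairs, where predicate(cell)
    marks the cells covered by a constraint and target is the probability
    those cells must sum to. Assumes 0 < target < 1 and that each predicate
    covers some, but not all, cells.
    """
    cells = list(itertools.product([0, 1], repeat=n_vars))
    p = np.full(len(cells), 1.0 / len(cells))  # uniform start = maximum entropy
    masks = [(np.array([bool(pred(c)) for c in cells]), s)
             for pred, s in constraints]
    for _ in range(n_iter):
        max_gap = 0.0
        for mask, s in masks:
            cur = p[mask].sum()
            max_gap = max(max_gap, abs(cur - s))
            p[mask] *= s / cur                    # match the constrained cells
            p[~mask] *= (1.0 - s) / (1.0 - cur)   # keep the total at 1
        if max_gap < tol:
            break
    return cells, p

def classify(cells, p, x):
    """Return the class y that maximizes the fitted joint p(x, y)."""
    scores = {c[-1]: prob for c, prob in zip(cells, p) if c[:-1] == tuple(x)}
    return max(scores, key=scores.get)

# Toy setup (hypothetical numbers): cells are (x1, x2, y).
constraints = [
    (lambda c: c[0] == 1 and c[2] == 1, 0.30),  # rule "x1=1 -> y=1", support 0.30
    (lambda c: c[1] == 1 and c[2] == 0, 0.35),  # rule "x2=1 -> y=0", support 0.35
    (lambda c: c[2] == 1, 0.50),                # assumed class prior P(y=1)
]
cells, p = fit_max_entropy(3, constraints)
print(classify(cells, p, (1, 1)), classify(cells, p, (1, 0)))
```

Unlike a majority vote, this approach scores a new observation with a single joint distribution, so rules that fire together are reconciled through the fitted probabilities rather than counted separately.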

Keywords

Rule-based classification analysis; maximum entropy distribution; iterative proportional fitting algorithm

Language

Korean
