Comprehensive comparison of normality tests: Empirical study using many different types of data

  • Lee, Chanmi (Department of Statistics, Chonnam National University) ;
  • Park, Suhwi (Department of Statistics, Chonnam National University) ;
  • Jeong, Jaesik (Department of Statistics, Chonnam National University)
  • Received : 2016.07.04
  • Accepted : 2016.08.05
  • Published : 2016.09.30

Abstract

We compare a range of normality tests, each built on a different kind of information extracted from the data: the Anderson-Darling test, Kolmogorov-Smirnov test, Cramer-von Mises test, Shapiro-Wilk test, Shapiro-Francia test, Lilliefors test, Jarque-Bera test, D'Agostino's D, Doornik-Hansen test, Energy test, and Martinez-Iglewicz test. For the comparison, the tests are applied to various types of data generated from skewed distributions, asymmetric distributions, and distributions with different lengths of support. We then summarize the results in terms of two criteria: type I error control and power. The best test depends on the shape of the distribution of the data, which implies that no single test is the most powerful for all distributions.
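
The two criteria in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' simulation design: it covers only the Jarque-Bera test, uses the asymptotic chi-square (2 degrees of freedom) p-value, and the sample size, replication count, significance level, and the three example distributions are assumptions chosen for illustration. The rejection rate under normal data estimates the type I error, and the rejection rate under a non-normal distribution estimates power.

```python
import math
import random


def jarque_bera_pvalue(x):
    """Asymptotic p-value of the Jarque-Bera statistic for sample x."""
    n = len(x)
    mean = sum(x) / n
    # Central sample moments (divisor n, as in the usual JB definition).
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of chi-square with 2 df is exp(-x / 2).
    return math.exp(-jb / 2.0)


def rejection_rate(sampler, n=100, reps=1000, alpha=0.05, seed=0):
    """Fraction of simulated samples for which normality is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [sampler(rng) for _ in range(n)]
        if jarque_bera_pvalue(sample) < alpha:
            rejections += 1
    return rejections / reps


# Hypothetical scenarios: one null case and two alternatives.
samplers = {
    "normal (type I error)": lambda rng: rng.gauss(0.0, 1.0),
    "exponential (skewed)": lambda rng: rng.expovariate(1.0),
    "uniform (bounded support)": lambda rng: rng.random(),
}

results = {name: rejection_rate(s) for name, s in samplers.items()}
for name, rate in results.items():
    print(f"{name}: rejection rate = {rate:.3f}")
```

Running the same loop with several tests and several alternatives would give the kind of table the abstract summarizes. Because the asymptotic approximation for Jarque-Bera can be inaccurate in small samples, simulated critical values may be preferable in a careful study.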
