A Comparison of Sample Size Requirements for Intraclass Correlation Coefficient (ICC)
Han, Soo-Yeon; Nam, Jung-Mo; Myoung, Sung-Min; Song, Ki-Jun
In medical practice and research, the problem of assessing reliability between two or more quantitative measurements is quite common. The intraclass correlation coefficient (ICC) is commonly used as a measure of reliability. Several methods have been developed to calculate the required number of subjects, raters, or replicates in one-way or two-way random ANOVA models. This paper studies and compares the performance of four such methods: Walter et al. (1998), Giraudeau and Mary (2001), Saito et al. (2006), and Bonett (2002). To compare the efficiency of the methods, we compare the number of subjects, the number of replicates, and the width of the confidence interval of the ICC required for some specific ICC values. In terms of the number of subjects, Giraudeau's method was the best. In terms of the number of replicates, Saito's method was superior to the others. The confidence interval of the ICC was narrower for Giraudeau's method than for any of the others.
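To make the kind of calculation being compared concrete, the following is a minimal sketch of a confidence-interval-width approach to choosing the number of subjects. It assumes the large-sample approximation usually attributed to Bonett (2002), n = 8 z^2 (1 - rho)^2 (1 + (k - 1) rho)^2 / (k (k - 1) w^2) + 1, where rho is the planning value of the ICC, k the number of replicates (or raters) per subject, w the desired total width of the interval, and z the standard normal quantile. The abstract itself gives no formulas, so the formula, the function name `bonett_subjects`, and its parameters are illustrative assumptions and not the paper's own implementation.

```python
import math
from statistics import NormalDist


def bonett_subjects(rho, k, width, alpha=0.05):
    """Approximate number of subjects for an ICC confidence interval.

    Assumed formula (large-sample approximation, see lead-in):
        n = 8 z^2 (1 - rho)^2 (1 + (k - 1) rho)^2 / (k (k - 1) w^2) + 1

    rho   : planning value of the ICC, 0 <= rho < 1
    k     : replicates (or raters) per subject, k >= 2
    width : desired total width w of the (1 - alpha) interval
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    if k < 2:
        raise ValueError("k must be at least 2")
    if width <= 0.0:
        raise ValueError("width must be positive")

    # Two-sided standard normal quantile, about 1.96 for alpha = 0.05.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)

    numerator = 8.0 * z**2 * (1.0 - rho) ** 2 * (1.0 + (k - 1) * rho) ** 2
    denominator = k * (k - 1) * width**2

    # Round up so the requested width is not exceeded under the approximation.
    return math.ceil(numerator / denominator + 1.0)


if __name__ == "__main__":
    # Planning ICC of 0.8 with a 95% interval of total width 0.2.
    for k in (2, 3):
        print(k, bonett_subjects(0.8, k, 0.2))
```

Under these assumptions, a planning ICC of 0.8 with a target width of 0.2 gives 51 subjects for k = 2 and 36 subjects for k = 3, which illustrates the trade-off between the number of subjects and the number of replicates that the comparison examines.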
The Journal of Korean Physical Therapy, 2011. vol.23. 4, pp.59-66
Bonett, D. G. (2002). Sample size requirements for estimating intraclass correlations with desired precision, Statistics in Medicine, 21, 1331-1335.
Donner, A. and Eliasziw, M. (1987). Sample size requirements for reliability studies, Statistics in Medicine, 6, 441-448.
Donner, A. and Koval, J. J. (1983). A note on the accuracy of Fisher's approximation to the large sample variance of an intraclass correlation, Communications in Statistics-Simulation and Computation, 12, 443-449.
Fisher, R. A. (1954). Statistical Methods for Research Workers, 12th ed. Hafner, New York.
Giraudeau, B. and Mary, J. Y. (2001). Planning a reproducibility study, how many subjects and how many replicates per subject for an expected width of the 95 per cent confidence interval of the intraclass correlation coefficient, Statistics in Medicine, 20, 3205-3214.
Johnson, N. L. and Kotz, S. (1970). Distributions in Statistics, Continuous Univariate Distributions 2, John Wiley & Sons, Inc.
Landis, J. R. and Koch, G. G. (1977). The measurement of observer agreement for categorical data, Biometrics, 33, 159-174.
Saito, Y., Sozu, T., Hamada, C. and Yoshimura, I. (2006). Effective number of subjects and number of raters for inter-rater reliability studies, Statistics in Medicine, 25, 1547-1560.
Shrout, P. E. and Fleiss, J. L. (1979). Intraclass correlations, uses in assessing rater reliability, Psychological Bulletin, 86, 420-428.
Rosner, B. (2005). Fundamentals of Biostatistics, 6th ed. Thomson Brooks/Cole.
Walter, S. D., Eliasziw, M. and Donner, A. (1998). Sample size and optimal designs for reliability studies, Statistics in Medicine, 17, 101-110.
White, S. A. and Broek, N. R. (2004). Methods for assessing reliability and validity for a measurement tool, a case study and critique using the WHO Haemoglobin Colour Scale, Statistics in Medicine, 23, 1603-1619.