Permutation p-values for specific-category kappa measure of agreement
Um, Yonghwan;
Asymptotic tests are often unsuitable for the analysis of sparse ordered contingency tables, because asymptotic p-values may either overestimate or underestimate the true p-values. In this paper, we describe permutation procedures that compute exact or resampling p-values for weighted specific-category agreement in ordered contingency tables. We use the weighted specific-category kappa proposed by Kvalseth (2003) to measure the extent to which two independent raters agree on specific categories. We compare exact, resampling, and asymptotic p-values using real and artificial contingency data.
Contingency tables; permutation; p-values; weighted specific category agreement
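
The resampling idea described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the test statistic is the unweighted category-specific kappa (Cohen's kappa computed after collapsing the table to "category i" versus "all other categories"), used here only as a stand-in for the weighted specific-category kappa; the function names, the example ratings, and the convention of returning 0 when the statistic is undefined are all hypothetical. Shuffling one rater's ratings keeps both sets of marginal totals fixed, and the p-value is estimated as the share of shuffled data sets whose statistic is at least as large as the observed one.

```python
import random


def category_kappa(r1, r2, cat):
    """Cohen's kappa after collapsing ratings to `cat` vs. all other categories.

    Stand-in statistic (assumption): this is the unweighted category-specific
    kappa, not the weighted specific-category kappa used in the paper.
    """
    n = len(r1)
    a = [x == cat for x in r1]
    b = [x == cat for x in r2]
    p_both = sum(x and y for x, y in zip(a, b)) / n  # both raters chose `cat`
    p1 = sum(a) / n  # rater 1 marginal proportion for `cat`
    p2 = sum(b) / n  # rater 2 marginal proportion for `cat`
    denom = p1 + p2 - 2 * p1 * p2
    if denom == 0:
        return 0.0  # assumed convention when the statistic is undefined
    return 2 * (p_both - p1 * p2) / denom


def resampling_p_value(r1, r2, cat, n_resamples=10000, seed=0):
    """Monte Carlo permutation p-value for agreement on category `cat`."""
    rng = random.Random(seed)
    observed = category_kappa(r1, r2, cat)
    shuffled = list(r2)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)  # permuting one rater preserves both marginals
        # small tolerance so ties with the observed value are counted
        if category_kappa(r1, shuffled, cat) >= observed - 1e-12:
            hits += 1
    return observed, hits / n_resamples


# Hypothetical ratings of 10 subjects on a 3-category ordered scale (0, 1, 2).
rater_a = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
rater_b = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]
kappa_0, p_0 = resampling_p_value(rater_a, rater_b, cat=0)
print(f"kappa for category 0: {kappa_0:.4f}, resampling p-value: {p_0:.4f}")
```

An exact p-value would replace the random shuffles with a full enumeration of all tables sharing the observed marginals, which is feasible only for small or sparse tables; the resampling estimate becomes more precise as `n_resamples` grows.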
References
Agresti, A. (2002). Categorical data analysis, 2nd Ed., Wiley, New York.

Berry, K. J., Johnston, J. E. and Mielke, P. W. (2006). Exact and resampling probability values for measures associated with ordered R by C contingency tables. Psychological Reports, 99, 231-238.

Cicchetti, D. V. and Allison, T. (1971). A new procedure for assessing reliability of scoring EEG sleep recordings. The American Journal of EEG Technology, 11, 101-109.

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.

Cohen, J. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70, 213-220.

Feinstein, A. R. and Cicchetti, D. V. (1990). High agreement but low kappa: I. The problems of two paradoxes. Journal of Clinical Epidemiology, 43, 543-549.

Fisher, R. A. (1935). The design of experiments, Oliver & Boyd, Edinburgh.

Fleiss, J. L. (1981). Statistical methods for rates and proportions, 2nd Ed., Wiley, New York.

Fleiss, J. L. and Cohen, J. (1973). The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement, 2, 113-117.

Good, P. I. (2000). Permutation tests: A practical guide to resampling methods for testing hypotheses, 2nd Ed., Springer-Verlag, New York.

Good, P. I. (2001). Resampling methods: A practical guide to data analysis, 2nd Ed., Birkhäuser, Massachusetts.

Han, K. D. and Park, Y. G. (2012). A simulation study of rater agreement measures. Journal of the Korean Data & Information Science Society, 23, 25-37.

Holmes, C. B. (1979). Sample size in psychological research. Perceptual and Motor Skills, 49, 283-288.

Holmes, C. B. (1990). The honest truth about lying with statistics, Thomas, Springfield, Illinois.

Johnston, J. E., Berry, K. J. and Mielke, P. W. (2007). Permutation tests: Precision in estimating probability values. Perceptual and Motor Skills, 105, 915-920.

Johnston, J. E., Berry, K. J. and Mielke, P. W. (2008). Resampling permutation probability values for weighted kappa. Psychological Reports, 103, 467-475.

Kim, J. and Lee, J. D. (2014). Independence tests using coin package in R. Journal of the Korean Data & Information Science Society, 25, 1039-1055.

Kraemer, H. C. (1983). Kappa coefficient. In Encyclopedia of Statistical Sciences 4, Wiley, New York, 352-354.

Kvalseth, T. O. (1989). Note on Cohen's kappa. Psychological Reports, 65, 223-226.

Kvalseth, T. O. (2003). Weighted specific-category kappa measure of interobserver agreement. Psychological Reports, 93, 1283-1290.

Mielke, P. W. and Berry, K. J. (2001). Permutation methods: A distance function approach, Springer-Verlag, New York.

Oleckno, W. A. (2008). Epidemiology: Concepts and methods, Waveland Press, Inc., Illinois.

Patefield, W. M. (1981). Algorithm AS 159: An efficient method of generating random R $\times$ C tables with given row and column totals. Journal of the Royal Statistical Society C, 30, 91-97.

Shoukri, M. M. (2004). Measures of interobserver agreement, CRC Press, Florida.

Spitzer, R. L., Cohen, J., Fleiss, J. L. and Endicott, J. (1967). Quantification of agreement in psychiatric diagnosis. Archives of General Psychiatry, 17, 83-87.

Upton, G. and Cook, I. (2002). Oxford dictionary of statistics, Oxford University Press, United Kingdom.

Zhao, X. (2011). When to use Cohen's κ, if ever? International Communication Association 2011 Conference.