
Permutation p-values for specific-category kappa measure of agreement

  • Um, Yonghwan (Division of Industrial and Management Engineering, Sungkyul University)
  • Received : 2016.03.21
  • Accepted : 2016.07.13
  • Published : 2016.07.31

Abstract


Asymptotic tests are often not suitable for the analysis of sparse ordered contingency tables, as asymptotic p-values may either overestimate or underestimate the true p-values. In this paper, we describe permutation procedures in which we compute exact or resampling p-values for a weighted specific-category agreement measure in ordered $k \times k$ contingency tables. We use the weighted specific-category kappa proposed by Kvålseth to measure the extent to which two independent raters agree on specific categories. We carry out comparison studies among exact, resampling, and asymptotic p-values using $3 \times 3$ contingency data (real and artificial data sets) and $4 \times 4$ artificial contingency data.
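As a rough illustration of the resampling procedure described above, here is a minimal Python sketch. It uses the unweighted specific-category kappa of Spitzer et al. (1967; see also Fleiss, 1981) as a stand-in, since Kvålseth's (2003) weighting scheme is not reproduced here; the function names, the example table, and the choice of an upper-tail test are illustrative assumptions, not the paper's implementation. The exact permutation p-value would instead enumerate every table sharing the observed marginal totals.

```python
import numpy as np

def specific_category_kappa(table, i):
    """Specific-category kappa for category i (Spitzer et al., 1967;
    Fleiss, 1981), used here as a stand-in for Kvalseth's weighted
    statistic, whose weighting scheme this sketch does not reproduce."""
    p = np.asarray(table, dtype=float)
    p = p / p.sum()                          # joint proportions
    row, col = p.sum(axis=1)[i], p.sum(axis=0)[i]
    chance = row * col                       # chance agreement on category i
    return (p[i, i] - chance) / (0.5 * (row + col) - chance)

def resampling_pvalue(table, i, n_perm=10000, seed=None):
    """Upper-tail resampling permutation p-value. Shuffling rater B's
    labels across subjects fixes both marginal totals -- the same null
    that Patefield's (1981) algorithm samples for random R x C tables."""
    rng = np.random.default_rng(seed)
    table = np.asarray(table)
    k = table.shape[0]
    # Recover subject-level rating pairs (rater A, rater B) from the table.
    pairs = [(r, c) for r in range(k) for c in range(k)
             for _ in range(table[r, c])]
    a = np.array([r for r, _ in pairs])
    b = np.array([c for _, c in pairs])
    observed = specific_category_kappa(table, i)
    hits = 0
    for _ in range(n_perm):
        t = np.zeros((k, k), dtype=int)
        np.add.at(t, (a, rng.permutation(b)), 1)  # rebuild permuted table
        hits += specific_category_kappa(t, i) >= observed
    return (hits + 1) / (n_perm + 1)         # add-one Monte Carlo correction

# Illustrative 3x3 cross-classification of two raters (not from the paper).
tab = [[20, 5, 1],
       [6, 15, 4],
       [2, 3, 10]]
print(specific_category_kappa(tab, 0))       # observed statistic
print(resampling_pvalue(tab, 0, seed=2016))  # resampling p-value
```

For larger tables, fixed-margin tables can also be drawn directly; recent SciPy versions provide scipy.stats.random_table, an implementation of Patefield's (1981) algorithm (reference 23 below).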


References

  1. Agresti, A. (2002). Categorical data analysis, 2nd Ed., Wiley, New York.
  2. Berry, K. J., Johnston, J. E. and Mielke, P. W. (2006). Exact and resampling probability values for measures associated with ordered R by C contingency tables. Psychological Reports, 99, 231-238. https://doi.org/10.2466/pr0.99.1.231-238
  3. Cicchetti, D. V. and Allison, T. (1971). A new procedure for assessing reliability of scoring EEG sleep recordings. The American Journal of EEG Technology, 11, 101-109. https://doi.org/10.1080/00029238.1971.11080840
  4. Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46. https://doi.org/10.1177/001316446002000104
  5. Cohen, J. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70, 213-220. https://doi.org/10.1037/h0026256
  6. Feinstein, A. R. and Cicchetti, D. V. (1990). High agreement but low kappa: I. The problems of two paradoxes. Journal of Clinical Epidemiology, 43, 543-549. https://doi.org/10.1016/0895-4356(90)90158-L
  7. Fisher, R. A. (1935). The design of experiments, Oliver & Boyd, Edinburgh.
  8. Fleiss, J. L. (1981). Statistical methods for rates and proportions, 2nd Ed., Wiley, New York.
  9. Fleiss, J. L. and Cohen, J. (1973). The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement, 33, 613-619.
  10. Good, P. I. (2000). Permutation tests : A practical guide to resampling methods for testing hypotheses, 2nd Ed., Springer-Verlag, New York.
  11. Good, P. I. (2001). Resampling methods : A practical guide to data analysis, 2nd Ed., Birkhauser, Massachusetts.
  12. Han, K. D. and Park, Y. G. (2012). A simulation study of rater agreement measures. Journal of the Korean Data & Information Science Society, 23, 25-37. https://doi.org/10.7465/jkdi.2012.23.1.025
  13. Holmes, C. B. (1979). Sample size in psychological research. Perceptual and Motor Skills, 49, 283-288. https://doi.org/10.2466/pms.1979.49.1.283
  14. Holmes, C. B. (1990). The honest truth about lying with statistics, Charles C. Thomas, Springfield, Illinois.
  15. Johnston, J. E., Berry, K. J. and Mielke, P. W. (2007). Permutation tests: Precision in estimating probability values. Perceptual and Motor Skills, 105, 915-920. https://doi.org/10.2466/pms.105.3.915-920
  16. Johnston, J. E., Berry, K. J. and Mielke, P. W. (2008). Resampling permutation probability values for weighted kappa. Psychological Reports, 103, 467-475. https://doi.org/10.2466/pr0.103.2.467-475
  17. Kim, J. and Lee, J. D. (2014). Independence tests using coin package in R. Journal of the Korean Data & Information Science Society, 25, 1039-1055. https://doi.org/10.7465/jkdi.2014.25.5.1039
  18. Kraemer, H. C. (1983). Kappa coefficient. In Encyclopedia of Statistical Sciences 4, Wiley, New York, 352-354.
  19. Kvålseth, T. O. (1989). Note on Cohen's kappa. Psychological Reports, 65, 223-226. https://doi.org/10.2466/pr0.1989.65.1.223
  20. Kvålseth, T. O. (2003). Weighted specific-category kappa measure of interobserver agreement. Psychological Reports, 93, 1283-1290. https://doi.org/10.2466/PR0.93.8.1283-1290
  21. Mielke, P. W. and Berry, K. J. (2001). Permutation methods : A distance function approach, Springer-Verlag, New York.
  22. Oleckno, W. A. (2008). Epidemiology : Concepts and methods, Waveland Press, Inc., Illinois.
  23. Patefield, W. M. (1981). Algorithm AS 159: An efficient method of generating random R ${\times}$ C tables with given row and column totals. Journal of the Royal Statistical Society C, 30, 91-97.
  24. Shoukri, M. M. (2004). Measures of interobserver agreement, CRC Press, Florida.
  25. Spitzer, R. L., Cohen, J., Fleiss, J. L. and Endicott, J. (1967). Quantification of agreement in psychiatric diagnosis. Archives of General Psychiatry, 17, 83-87. https://doi.org/10.1001/archpsyc.1967.01730250085012
  26. Upton, G. and Cook, I. (2002). Oxford dictionary of statistics, Oxford University Press, United Kingdom.
  27. Zhao, X. (2011). When to use Cohen's kappa, if ever? International Communication Association 2011 Conference.