Diagnostics for the Cox model

  • Xue, Yishu (Department of Statistics, University of Connecticut)
  • Schifano, Elizabeth D. (Department of Statistics, University of Connecticut)
  • Received: 2017.11.09
  • Accepted: 2017.11.13
  • Published: 2017.11.30

Abstract

The Cox proportional hazards model is the most popular regression model for the analysis of time-to-event data. The model specifies a parametric relationship between the hazard function and the predictor variables but leaves the form of the baseline hazard function unspecified. A critical assumption of the Cox model, however, is the proportional hazards assumption: when the predictor variables do not vary over time, the hazard ratio comparing any two observations is constant with respect to time. Therefore, to perform credible estimation and inference, one must first assess whether the proportional hazards assumption is reasonable. As with other regression techniques, it is also essential to examine whether appropriate functional forms of the predictor variables have been used and whether there are any outlying or influential observations. This article reviews diagnostic methods for assessing goodness-of-fit for the Cox proportional hazards model. We illustrate these methods with a case study using available R functions and provide complete R code for a simulated example as a supplement.
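
The proportional hazards assumption described above can be sketched in standard notation. The symbols here are assumptions introduced for illustration and do not appear in the abstract: $h_0(t)$ is the unspecified baseline hazard, $\beta$ the regression coefficient vector, and $x_i$, $x_j$ the time-fixed predictor vectors of two observations.

```latex
% Cox model: parametric in the predictors, unspecified baseline hazard h_0(t)
h(t \mid x) = h_0(t)\,\exp\!\left(x^{\top}\beta\right)

% Hazard ratio for observations i and j with time-fixed predictors:
% the baseline hazard cancels, so the ratio does not depend on t
\frac{h(t \mid x_i)}{h(t \mid x_j)}
  = \frac{h_0(t)\,\exp\!\left(x_i^{\top}\beta\right)}{h_0(t)\,\exp\!\left(x_j^{\top}\beta\right)}
  = \exp\!\left\{(x_i - x_j)^{\top}\beta\right\}
```

Because $h_0(t)$ cancels, the ratio is the same at every $t$. If a coefficient instead varied with time, say $\beta(t)$, the ratio would depend on $t$ and the assumption would fail.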
