Influence diagnostics for skew-t censored linear regression models

  • Marcos S Oliveira (Department of Mathematics and Statistics, Federal University of Sao Joao del-Rei)
  • Daniela CR Oliveira (Department of Mathematics and Statistics, Federal University of Sao Joao del-Rei)
  • Victor H Lachos (Department of Statistics, University of Connecticut)
  • Received : 2022.12.26
  • Accepted : 2023.10.03
  • Published : 2023.11.30

Abstract

This paper proposes diagnostic procedures for the skew-t linear regression model with a censored response. The skew-t distribution is an attractive family of asymmetric heavy-tailed densities that includes the normal, skew-normal and Student's t distributions as special cases. Inspired by the power and wide applicability of EM-type algorithms, local and global influence analyses based on the conditional expectation of the complete-data log-likelihood function are developed, following Zhu and Lee's approach. For the local influence analysis, four specific perturbation schemes are discussed. Two real data sets, one from education with right censoring and one from economics with left censoring, are analyzed to illustrate the usefulness of the proposed methodology.
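
The special cases named in the abstract can be checked numerically. The sketch below is not taken from the paper: it assumes the Azzalini-type skew-t density f(y) = (2/sigma) t(z; nu) T(lam z sqrt((nu + 1)/(nu + z^2)); nu + 1) with z = (y - mu)/sigma, where t and T are the standard Student's t density and distribution function. The paper's own parametrization may differ in form. The function names (`t_pdf`, `t_cdf`, `skew_t_pdf`) and the Simpson-rule approximation of T are illustrative choices, used only to keep the example free of external libraries.

```python
import math


def t_pdf(z, nu):
    """Standard Student's t density with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(z * z / nu))


def t_cdf(x, nu, steps=2000):
    """Student's t CDF: 0.5 plus a composite Simpson integral over [0, x].

    steps must be even; a negative x gives a negative step and hence
    a negative integral, as required.
    """
    if x == 0.0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3


def skew_t_pdf(y, mu=0.0, sigma=1.0, lam=0.0, nu=4.0):
    """Skew-t density under the parametrization assumed in the lead-in."""
    z = (y - mu) / sigma
    arg = lam * z * math.sqrt((nu + 1) / (nu + z * z))
    return 2.0 / sigma * t_pdf(z, nu) * t_cdf(arg, nu + 1)


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


if __name__ == "__main__":
    y = 0.7
    # lam = 0 removes the skewness: Student's t
    print(skew_t_pdf(y, lam=0.0, nu=5.0), t_pdf(y, 5.0))
    # very large nu removes the heavy tails: skew-normal
    print(skew_t_pdf(y, lam=3.0, nu=1e6), 2 * norm_pdf(y) * norm_cdf(3.0 * y))
    # both at once: normal
    print(skew_t_pdf(y, lam=0.0, nu=1e6), norm_pdf(y))
```

Each printed pair should agree closely: setting the skewness parameter to zero recovers the Student's t density, and letting the degrees of freedom grow recovers the skew-normal (or, with zero skewness, the normal) density.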

Acknowledgement

We thank the associate editor and four anonymous referees for their important comments and suggestions, which led to an improvement of this paper. The research of Marcos S. Oliveira was supported by Grant no. 401418/2022-7 from Conselho Nacional de Desenvolvimento Cientifico e Tecnologico (CNPq) - Brazil. The research of Daniela C. R. Oliveira was supported by Grant no. 401373/2022-3 from CNPq - Brazil. Victor Lachos acknowledges the partial financial support from UConn - CLAS's Summer Research Funding Initiative 2023.

References

  1. Azzalini A (1985). A class of distributions which includes the normal ones, Scandinavian Journal of Statistics, 12, 171-178.
  2. Barros M, Galea M, Gonzalez M, and Leiva V (2010). Influence diagnostics in the Tobit censored response model, Statistical Methods & Applications, 19, 716-723.
  3. Barros M, Galea M, Leiva V, and Santos-Neto M (2018). Generalized Tobit models: Diagnostics and application in econometrics, Journal of Applied Statistics, 45, 145-167. https://doi.org/10.1080/02664763.2016.1268572
  4. Berkane M, Kano Y, and Bentler PM (1994). Pseudo maximum likelihood estimation in elliptical theory: Effects of misspecification, Computational Statistics and Data Analysis, 18, 255-267. https://doi.org/10.1016/0167-9473(94)90175-9
  5. Cameron AC and Trivedi PK (2005). Microeconometrics: Methods and Applications, Cambridge University Press, Cambridge.
  6. Cook RD (1986). Assessment of local influence, Journal of the Royal Statistical Society, Series B, 48, 133-169. https://doi.org/10.1111/j.2517-6161.1986.tb01398.x
  7. Cook RD and Weisberg S (1982). Residuals and Influence in Regression, Chapman & Hall/CRC, Boca Raton, FL.
  8. Kleiber C and Zeileis A (2008). Applied Econometrics with R, Springer-Verlag, New York. ISBN 978-0-387-77316-2
  9. Lachos VH, Garay A, and Cabral CR (2020). Moments of truncated skew-normal/independent distributions, Brazilian Journal of Probability and Statistics, 34, 478-494. https://doi.org/10.1214/19-BJPS438
  10. Lachos VH, Prates MO, and Dey DK (2021). Heckman selection-t model: Parameter estimation via the EM-algorithm, Journal of Multivariate Analysis, 184, 104737.
  11. Lachos VH, Bazan JL, Castro LM, and Park J (2022). The skew-t censored regression model: Parameter estimation via an EM-type algorithm, Communications for Statistical Applications and Methods, 29, 333-351. https://doi.org/10.29220/CSAM.2022.29.3.333
  12. Lange KL, Little R, and Taylor J (1989). Robust statistical modeling using t distribution, Journal of the American Statistical Association, 84, 881-896. https://doi.org/10.1080/01621459.1989.10478852
  13. Lee SY and Xu L (2004). Influence analysis of nonlinear mixed-effects models, Computational Statistics and Data Analysis, 45, 321-341. https://doi.org/10.1016/S0167-9473(02)00303-1
  14. Lucas A (1997). Robustness of the student-t based M-estimator, Communications in Statistics-Theory and Methods, 26, 1165-1182. https://doi.org/10.1080/03610929708831974
  15. Massuia MB, Cabral CRB, Matos LA, and Lachos VH (2015). Influence diagnostics for student-t censored linear regression models, Statistics, 49, 1074-1094. https://doi.org/10.1080/02331888.2014.958489
  16. Massuia MB, Garay AM, Lachos VH, and Cabral CRB (2017). Bayesian analysis of censored linear regression models with scale mixtures of skew-normal distributions, Statistics and Its Interface, 10, 425-439. https://doi.org/10.4310/SII.2017.v10.n3.a7
  17. Matos LA, Prates MO, Chen MH, and Lachos VH (2013). Likelihood based inference for linear and nonlinear mixed-effects models with censored response using the multivariate-t distribution, Statistica Sinica, 23, 1323-1345. https://doi.org/10.5705/ss.2012.043
  18. Matos LA, Bandyopadhyay D, Castro LM, and Lachos VH (2015). Influence assessment in censored mixed-effects models using the multivariate student-t distribution, Journal of Multivariate Analysis, 141, 104-117. https://doi.org/10.1016/j.jmva.2015.06.014
  19. Mattos TdB, Garay AM, and Lachos VH (2018). Likelihood-based inference for censored linear regression models with scale mixtures of skew-normal distributions, Journal of Applied Statistics, 45, 2039-2066. https://doi.org/10.1080/02664763.2017.1408788
  20. Mroz TA (1987). The sensitivity of an empirical model of married women's hours of work to economic and statistical assumptions, Econometrica, 55, 765-799. https://doi.org/10.2307/1911029
  21. Nunez LM, Lachos VH, Galarza CE, and Matos LA (2021). Estimation and diagnostics for partially linear censored regression models based on heavy-tailed distributions, Statistics and Its Interface, 14, 165-182. https://doi.org/10.4310/20-SII624
  22. Ortega EMM, Bolfarine H, and Paula GA (2003). Influence diagnostics in generalized log-gamma regression models, Computational Statistics and Data Analysis, 42, 165-186. https://doi.org/10.1016/S0167-9473(02)00104-4
  23. Osorio F, Paula GA, and Galea M (2007). Assessment of local influence in elliptical linear models with longitudinal structure, Computational Statistics and Data Analysis, 51, 4354-4368. https://doi.org/10.1016/j.csda.2006.06.004
  24. Poon WY and Poon YS (1999). Conformal normal curvature and assessment of local influence, Journal of the Royal Statistical Society, Series B, 61, 51-61. https://doi.org/10.1111/1467-9868.00162
  25. R Core Team (2022). R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna.
  26. RTI-FDA (2008). Snapshot of School Management Effectiveness: Peru Pilot Study (Technical report), USAID.
  27. Therneau TM, Grambsch PM, and Fleming TR (1990). Martingale-based residuals for survival models, Biometrika, 77, 147-160. https://doi.org/10.1093/biomet/77.1.147
  28. Valeriano KA, Galarza CE, Matos LA, and Lachos VH (2023). Likelihood-based inference for the multivariate skew-t regression with censored or missing responses, Journal of Multivariate Analysis, 196, 105174.
  29. Zeller CB, Cabral CRB, Lachos VH, and Benites L (2019). Finite mixture of regression models for censored data based on scale mixtures of normal distributions, Advances in Data Analysis and Classification, 13, 89-116. https://doi.org/10.1007/s11634-018-0337-y
  30. Zhu H and Lee S (2001). Local influence for incomplete-data models, Journal of the Royal Statistical Society, Series B, 63, 111-126. https://doi.org/10.1111/1467-9868.00279
  31. Zhu H, Lee S, Wei B, and Zhou J (2001). Case-deletion measures for models with incomplete data, Biometrika, 88, 727-737. https://doi.org/10.1093/biomet/88.3.727
  32. Zhu H, Ibrahim JG, and Shi X (2009). Diagnostic measures for generalized linear models with missing covariates, Scandinavian Journal of Statistics, 36, 686-712. https://doi.org/10.1111/j.1467-9469.2009.00644.x