Case influence diagnostics for the significance of the linear regression model

  • Bae, Whasoo (Department of Statistics, Inje University) ;
  • Noh, Soyoung (Department of Statistics, Pusan National University) ;
  • Kim, Choongrak (Department of Statistics, Pusan National University)
  • Received : 2016.12.16
  • Accepted : 2017.01.20
  • Published : 2017.03.31

Abstract

In this paper we propose influence measures, based on the deletion method, for two basic goodness-of-fit statistics in the linear regression model: the coefficient of determination $R^2$ and the test statistic $F$. Some useful lemmas are provided. We also express the influence measures in terms of basic building blocks such as the residual, leverage, and deviation, which shows that they are increasing functions of the residual and decreasing functions of the deviation. Further, the proposed measures reduce the computational burden from O(n) to O(1). As illustrative examples, we applied the proposed measures to the stackloss data sets. We verified that deleting one or a few influential observations may result in a large change in $R^2$ and the F-statistic.
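To make the deletion idea concrete, the following is a minimal sketch, not the paper's own formulas, of how case-deleted $R^2$ and $F$ can be obtained without refitting. It assumes an ordinary least squares fit with an intercept and relies on two standard identities: deleting case $i$ reduces the residual sum of squares by $e_i^2/(1-h_{ii})$ and the total sum of squares by $n d_i^2/(n-1)$, where $e_i$ is the residual, $h_{ii}$ the leverage, and $d_i = y_i - \bar{y}$ the deviation. The function names and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def r2_and_f(X, y):
    """R^2 and overall F for an OLS fit with intercept (X holds predictors only)."""
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    sse = resid @ resid
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - sse / sst
    f = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    return r2, f

def deletion_r2_and_f(X, y):
    """Case-deleted R^2 and F for every observation from a single full-data fit."""
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])
    Q, _ = np.linalg.qr(Z)                  # reduced QR of the design matrix
    h = np.sum(Q ** 2, axis=1)              # leverages h_ii
    e = y - Q @ (Q.T @ y)                   # full-data residuals e_i
    d = y - y.mean()                        # deviations d_i = y_i - ybar
    sse_i = e @ e - e ** 2 / (1.0 - h)      # SSE after deleting case i
    sst_i = d @ d - n / (n - 1.0) * d ** 2  # SST after deleting case i
    r2_i = 1.0 - sse_i / sst_i
    # n - 1 cases remain, so the residual degrees of freedom are n - p - 2.
    f_i = (r2_i / p) / ((1.0 - r2_i) / (n - p - 2))
    return r2_i, f_i

# Synthetic data for illustration only (not the stackloss data).
rng = np.random.default_rng(0)
X = rng.normal(size=(25, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=25)

r2_full, f_full = r2_and_f(X, y)
r2_del, f_del = deletion_r2_and_f(X, y)
change_in_r2 = r2_del - r2_full             # one entry per deleted case
most_influential = int(np.argmax(np.abs(change_in_r2)))
```

In this sketch a large residual raises the case-deleted $R^2$ while a large deviation lowers it, which matches the direction described above. After the one-time full fit, each case costs a constant number of arithmetic operations.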
