Outlier Identification in Regression Analysis using Projection Pursuit

  • Kim, Hyojung (Department of Statistics, Sungkyunkwan University)
  • Park, Chongsun (Department of Statistics, Sungkyunkwan University)
  • Published: 2000.12.01

Abstract

In this paper, we propose a method for identifying multiple outliers in regression analysis that assumes only that the regression function is smooth. The method combines a single-linkage clustering algorithm with Projection Pursuit Regression (PPR). We compared it with existing methods on several simulated and real examples, and it proved very useful in regression problems where the regression function is far from linear.
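
The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea: fit a smooth curve, then apply single-linkage clustering to the residuals and treat everything outside the largest cluster as a candidate outlier. It is not the authors' algorithm. A running median stands in for PPR, and the function names, the `cut` threshold, and the `half_window` parameter are all hypothetical choices made for illustration.

```python
from statistics import median


def running_median(y, half_window=2):
    """Crude smoother used as a stand-in for PPR (an assumption, not the paper's fit)."""
    n = len(y)
    return [
        median(y[max(0, i - half_window): min(n, i + half_window + 1)])
        for i in range(n)
    ]


def single_linkage_1d(values, cut):
    """Single-linkage clusters of 1-D values, cut at distance `cut`.

    In one dimension, cutting the single-linkage tree at height `cut` is the
    same as splitting the sorted values wherever a neighbouring gap exceeds `cut`.
    Returns a list of clusters, each a list of original indices.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    clusters = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] > cut:
            clusters.append([cur])      # gap too large: start a new cluster
        else:
            clusters[-1].append(cur)    # close enough: same cluster
    return clusters


def flag_outliers(y, cut, half_window=2):
    """Indices whose residuals fall outside the largest residual cluster."""
    fitted = running_median(y, half_window)
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    clusters = single_linkage_1d(residuals, cut)
    main = max(clusters, key=len)       # largest cluster is taken as the clean data
    return sorted(i for c in clusters if c is not main for i in c)


# Toy data: a smooth, clearly nonlinear curve with two shifted observations.
y = [(i * i) / 25 for i in range(20)]
y[7] += 10
y[13] += 10
print(flag_outliers(y, cut=3.0))
```

On this toy series the two shifted observations end up in their own small residual cluster, well separated from the bulk. In practice the cut height would have to be chosen from the data rather than fixed in advance, and a proper PPR fit would replace the running median.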
