Least quantile squares method for the detection of outliers

  • Seo, Han Son (Department of Applied Statistics, Konkuk University)
  • Yoon, Min (Department of Applied Mathematics, Pukyong National University)
  • Received : 2020.10.07
  • Accepted : 2020.11.09
  • Published : 2021.01.31

Abstract

k-least quantile of squares (k-LQS) estimates are a generalization of least median of squares (LMS) estimates. They have not been used as much as LMS because their breakdown points become small as k increases. But if the size of outliers is assumed to be fixed, LQS estimates yield a good fit to the majority of the data, and residuals calculated from LQS estimates can be a reliable tool for detecting outliers. We propose using LQS estimates to separate a clean set from the data according to the outlyingness of the cases. Three procedures are suggested for the identification of outliers using LQS estimates. Examples are provided to illustrate the methods. A Monte Carlo study shows that the proposed methods are effective.
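
To make the idea concrete, the following is a minimal sketch of how a k-LQS fit and a residual-based outlier flag might look. It assumes the usual k-LQS criterion (minimize the k-th smallest squared residual) and approximates it by a random elemental-subset search; it is not one of the three procedures proposed in the paper. The function names `lqs_fit` and `flag_outliers`, the cutoff of 2.5, the choice of scale, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def lqs_fit(X, y, k, n_trials=2000, seed=0):
    """Approximate k-LQS fit (assumed heuristic, not an exact algorithm):
    among exact fits to random p-point subsets, keep the one whose
    k-th smallest squared residual is smallest."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_beta, best_crit = None, np.inf
    for _ in range(n_trials):
        idx = rng.choice(n, size=p, replace=False)
        try:
            beta = np.linalg.solve(X[idx], y[idx])
        except np.linalg.LinAlgError:
            continue  # skip singular elemental subsets
        r2 = (y - X @ beta) ** 2
        crit = np.partition(r2, k - 1)[k - 1]  # k-th smallest squared residual
        if crit < best_crit:
            best_beta, best_crit = beta, crit
    return best_beta, best_crit

def flag_outliers(X, y, beta, k, cutoff=2.5):
    """Flag cases whose absolute residual exceeds cutoff * scale.
    The scale (k-th smallest absolute residual) is a crude, hypothetical choice."""
    r = np.abs(y - X @ beta)
    scale = np.partition(r, k - 1)[k - 1]
    return r > cutoff * scale

# Toy data (made up for illustration): 40 cases on the line y = 1 + 2x
# with small deterministic noise; the last 8 cases are shifted upward by 20.
n = 40
x = np.linspace(0.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + 0.1 * np.sin(np.arange(n))
outlier_idx = np.arange(32, 40)
y[outlier_idx] += 20.0

k = 30  # the fit must cover 30 of the 40 cases
beta, crit = lqs_fit(X, y, k)
flags = flag_outliers(X, y, beta, k)
```

In this sketch k is chosen so that at least k cases are clean; choosing k larger than the number of clean cases would force the fit to accommodate outliers, which reflects the breakdown-point trade-off described in the abstract.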
