Clustering Observations for Detecting Multiple Outliers in Regression Models
Seo, Han-Son; Yoon, Min;
Detecting outliers in a linear regression model can fail when similar observations are classified differently in a sequential procedure. In such circumstances, identifying clusters of observations and applying a detection method to the clustered data can prevent this failure and also reduces computation, because the method operates on fewer units. In this paper, we propose a clustering procedure for this purpose and provide examples that illustrate it applied to the Hadi-Simonoff (1993) method, the reverse Hadi-Simonoff method, and the Gentleman-Wilk (1975) method.
Keywords: clustering; linear regression model; outliers; regression diagnostics
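The idea of deciding on whole clusters rather than on single observations can be illustrated with a minimal sketch. It is not the Hadi-Simonoff or Gentleman-Wilk procedure and not the authors' algorithm. It assumes a single predictor, variables on comparable scales, single-linkage grouping with a fixed distance threshold, and a clean bulk that forms the largest cluster. All function names, the threshold, the cutoff, and the data are hypothetical.

```python
import math
from statistics import median


def single_linkage_clusters(points, threshold):
    """Group points connected by a chain of pairwise distances <= threshold."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def flag_outlying_clusters(xs, ys, threshold, cutoff=3.0):
    """Return sorted indices of observations in clusters judged outlying."""
    clusters = single_linkage_clusters(list(zip(xs, ys)), threshold)
    # Assumption: the largest cluster is the clean bulk (needs >= 3 points).
    base = max(clusters, key=len)
    a, b = fit_line([xs[i] for i in base], [ys[i] for i in base])
    rss = sum((ys[i] - a - b * xs[i]) ** 2 for i in base)
    sigma = max(math.sqrt(rss / (len(base) - 2)), 1e-12)

    flagged = []
    for cluster in clusters:
        if cluster is base:
            continue
        # One decision per cluster, so similar points always share a label.
        score = median(abs(ys[i] - a - b * xs[i]) / sigma for i in cluster)
        if score > cutoff:
            flagged.extend(cluster)
    return sorted(flagged)


# Hypothetical data: ten points near y = 1 + 2x, three clustered outliers,
# and one isolated point that still lies close to the line.
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
xs = [float(x) for x in range(10)] + [4.0, 4.2, 4.4] + [15.0]
ys = [1 + 2 * x + e for x, e in zip(xs, noise)] + [30.0, 30.5, 29.8] + [31.0]

flagged = flag_outlying_clusters(xs, ys, threshold=3.0)
print(flagged)  # [10, 11, 12]
```

The three clustered outliers are flagged together, while the isolated point at x = 15 forms its own cluster but is retained because it agrees with the fitted line. A sequential method would replace the simple cutoff rule used here.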
 References
Ahn, B. J. and Seo, H. S. (2011). Outlier detection using dynamic plots, The Korean Journal of Applied Statistics, 24, 979-986.

Atkinson, A. C. (1994). Fast very robust methods for the detection of multiple outliers, Journal of the American Statistical Association, 89, 1329-1339.

Atkinson, A. C., Riani, M. and Cerioli, A. (2004). Exploring Multivariate Data with the Forward Search, Springer, New York.

Cormack, R. M. (1971). A review of classification, Journal of the Royal Statistical Society, Series A, 134, 321-367.

Gentleman, J. F. and Wilk, M. B. (1975). Detecting outliers. II. Supplementing the direct analysis of residuals, Biometrics, 31, 387-410.

Gray, J. B. and Ling, R. F. (1984). K-clustering as a detection tool for influential subsets in regression, Technometrics, 26, 305-318.

Hadi, A. S. and Simonoff, J. S. (1993). Procedures for the identification of multiple outliers in linear models, Journal of the American Statistical Association, 88, 1264-1272.

Jajo, N. K. (2005). A review of robust regression and diagnostic procedures in linear regression, Acta Mathematicae Applicatae Sinica, 21, 209-224.

Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, New York.

Kianifard, F. and Swallow, W. H. (1989). Using recursive residuals, calculated on adaptively-ordered observations, to identify outliers in linear regression, Biometrics, 45, 571-585.

Kianifard, F. and Swallow, W. H. (1990). A Monte Carlo comparison of five procedures for identifying outliers in linear regression, Communications in Statistics, 19, 1913-1938.

Ling, R. F. (1972). On the theory and construction of k-clusters, Computer Journal, 15, 326-332.

Marasinghe, M. G. (1985). A multistage procedure for detecting several outliers in linear regression, Technometrics, 27, 395-399.

Paul, S. R. and Fung, K. Y. (1991). A generalized extreme studentized residual multiple-outlier-detection procedure in linear regression, Technometrics, 33, 339-348.

Pena, D. and Yohai, V. J. (1999). A fast procedure for outlier diagnostics in linear regression problems, Journal of the American Statistical Association, 94, 434-445.

Rousseeuw, P. J. (1984). Least median of squares regression, Journal of the American Statistical Association, 79, 871-880.