A sequential outlier detecting method using a clustering algorithm

Korean title: 군집 알고리즘을 이용한 순차적 이상치 탐지법 (A sequential outlier detection method using a clustering algorithm)

Seo, Han Son; Yoon, Min

  • Received : 2016.03.04
  • Accepted : 2016.04.16
  • Published : 2016.06.30


Outlier detection methods that do not perform a test often fail to detect multiple outliers because they are structurally vulnerable to masking or swamping effects. This paper considers testing procedures that supplement a clustering-based method, which identifies the group containing a minority of the observations as outliers. A common step in such procedures is to perform a variety of t-tests on individual outlier candidates. This paper proposes a sequential procedure that searches for outliers by changing the cutoff value on a cluster tree and performing a test on a set of outlier candidates. The proposed method is illustrated and compared with existing methods through an example and Monte Carlo studies.
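
To make the idea concrete, the following Python sketch shows one possible reading of the sequential procedure for simple linear regression. It is not the authors' implementation. The function names, the single-linkage grouping of standardized (fitted value, residual) pairs, the mean-shift F statistic used as the joint test, and the caller-supplied critical-value function `crit` are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev

def fit_line(x, y):
    """Least-squares intercept and slope for simple linear regression."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def rss(x, y, keep):
    """Residual sum of squares of the line fitted to the points in `keep`."""
    xs, ys = [x[i] for i in keep], [y[i] for i in keep]
    b0, b1 = fit_line(xs, ys)
    return sum((b - b0 - b1 * a) ** 2 for a, b in zip(xs, ys))

def standardize(values):
    """Center and scale; a constant input maps to zeros."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s if s > 0 else 0.0 for v in values]

def single_linkage_groups(points, cutoff):
    """Groups obtained by cutting a single-linkage tree at height `cutoff`:
    points share a group if a chain of hops no longer than `cutoff` joins them."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= cutoff:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

def subset_f_stat(x, y, candidates):
    """Mean-shift F statistic for testing a whole candidate set at once
    (assumed test; p = 2 regression coefficients)."""
    n, k = len(x), len(candidates)
    excluded = set(candidates)
    keep = [i for i in range(n) if i not in excluded]
    df = n - k - 2
    rss_full, rss_clean = rss(x, y, range(n)), rss(x, y, keep)
    return ((rss_full - rss_clean) / k) / (rss_clean / df), df

def sequential_outliers(x, y, cutoffs, crit):
    """Lower the cutoff step by step; return the first candidate set
    (everything outside the majority group) that the joint test rejects."""
    n = len(x)
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * a for a in x]
    resid = [b - f for b, f in zip(y, fitted)]
    points = list(zip(standardize(fitted), standardize(resid)))
    for cutoff in sorted(cutoffs, reverse=True):
        groups = single_linkage_groups(points, cutoff)
        main = max(groups, key=len)
        if 2 * len(main) <= n:  # no majority group at this cutoff
            continue
        candidates = sorted(i for g in groups if g is not main for i in g)
        if not candidates or n - len(candidates) <= 2:
            continue
        f_stat, df = subset_f_stat(x, y, candidates)
        if f_stat > crit(len(candidates), df):
            return candidates
    return []
```

Here `crit(k, df)` is expected to return the critical value of an F distribution with `k` and `df` degrees of freedom at the chosen level, for example from tables. Testing the candidate set jointly, rather than point by point, is what distinguishes this sketch from procedures based on individual t-tests. Because the candidates are selected from the same data, the nominal level of the test is only approximate in this simplified version.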


clustering; linear regression model; outlier test; sequential procedure



Supported by: Konkuk University