A Criterion for the Selection of Principal Components in the Robust Principal Component Regression
Kim, Bu-Yong
Robust principal component regression is proposed to deal with multicollinearity and outliers simultaneously. A key step in robust principal component regression is the selection of an optimal set of principal components. Rather than relying on the eigenvalues of the sample covariance matrix, the selection criterion developed here is based on the condition index of the minimum volume ellipsoid estimator, which is highly robust against leverage points. In addition, least trimmed squares estimation is employed to cope with regression outliers. Monte Carlo simulation results indicate that the proposed criterion is superior to existing ones.
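As a rough illustration of the two ingredients named in the abstract, the sketch below computes condition indices from a set of eigenvalues and evaluates a least trimmed squares objective. It is not the paper's procedure: the abstract does not state the exact selection rule, so the cutoff of 30, the function names, and the example eigenvalues and residuals are all assumptions made for illustration. The eigenvalues are taken as given, standing in for those of a robust scatter estimate such as a minimum volume ellipsoid fit.

```python
import math


def condition_indices(eigenvalues):
    """Condition index of each component: sqrt(largest eigenvalue / own eigenvalue).

    Assumes all eigenvalues are strictly positive.
    """
    lam_max = max(eigenvalues)
    return [math.sqrt(lam_max / lam) for lam in eigenvalues]


def select_components(eigenvalues, cutoff=30.0):
    """Return indices of components whose condition index is below the cutoff.

    The cutoff of 30 is an assumed, conventional threshold, not the paper's.
    """
    indices = condition_indices(eigenvalues)
    return [j for j, ci in enumerate(indices) if ci < cutoff]


def lts_objective(residuals, h):
    """Least trimmed squares objective: sum of the h smallest squared residuals."""
    squared = sorted(r * r for r in residuals)
    return sum(squared[:h])


# Hypothetical eigenvalues of a robust scatter matrix (illustrative values only).
eigenvalues = [4.0, 1.0, 0.04, 0.0001]
kept = select_components(eigenvalues, cutoff=30.0)

# Hypothetical residuals; the large one is trimmed away when h = 3.
trimmed_loss = lts_objective([0.1, -0.2, 5.0, 0.3], h=3)
```

Here the last component has a condition index of about 200 and is dropped, while the trimmed objective ignores the single large residual. In practice the eigenvalues would come from a robust covariance fit and the residuals from a regression on the retained components.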