Procedure for the Selection of Principal Components in Principal Components Regression

- Journal title : Korean Journal of Applied Statistics
- Volume 23, Issue 5, 2010, pp.967-975
- Publisher : The Korean Statistical Society
- DOI : 10.5351/KJAS.2010.23.5.967

Title & Authors

Procedure for the Selection of Principal Components in Principal Components Regression

Kim, Bu-Yong; Shin, Myung-Hee;

Abstract

Least squares estimation is unreliable when multicollinearity exists among the regressors of a linear regression model, so principal components regression is often used to deal with the multicollinearity problem. This article proposes a new procedure for selecting suitable principal components. The procedure is based on the condition index rather than the eigenvalue. Principal components whose condition indices exceed the upper limit of the cutoff value are removed from the model, whereas those whose condition indices fall below the lower limit are retained. For principal components whose condition indices lie between the lower and upper limits, the forward inclusion method is used to select the appropriate ones. The limits are obtained from a linear model constructed on the basis of conjoint analysis. The procedure is evaluated by Monte Carlo simulation in terms of the mean squared error of the estimator. The simulation results indicate that the proposed procedure is superior to existing methods under this criterion.
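
The three-way split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the condition index of the j-th component is sqrt(lambda_max / lambda_j), computed from the eigenvalues of the regressor correlation matrix (the definition of Belsley, Kuh and Welsch; the abstract itself does not state one). The limits 10 and 30 are placeholders, not the values the paper derives from conjoint analysis, and the function names are hypothetical. The forward inclusion step is left out because the abstract does not give its entry criterion.

```python
import numpy as np

def condition_indices(X):
    """Return eigenvalues, eigenvectors and condition indices of the
    correlation matrix of X, ordered by descending eigenvalue."""
    # Standardize columns so that Z'Z / (n - 1) is the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = (Z.T @ Z) / (X.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Guard against tiny negative eigenvalues caused by rounding.
    eigvals = np.clip(eigvals, 1e-12, None)
    # Assumed definition: sqrt(largest eigenvalue / eigenvalue j).
    ci = np.sqrt(eigvals[0] / eigvals)
    return eigvals, eigvecs, ci

def partition_components(ci, lower, upper):
    """Split component positions into retained, undecided and removed sets."""
    ci = np.asarray(ci)
    include = np.flatnonzero(ci < lower)       # below the lower limit: keep
    remove = np.flatnonzero(ci > upper)        # above the upper limit: drop
    candidates = np.flatnonzero((ci >= lower) & (ci <= upper))  # undecided
    return include, candidates, remove

# Synthetic regressors: x3 is almost an exact linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + 0.01 * rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

eigvals, eigvecs, ci = condition_indices(X)
# Placeholder limits only; the paper's limits are not given in the abstract.
include, candidates, remove = partition_components(ci, lower=10.0, upper=30.0)
print("condition indices:", ci)
print("include:", include, "candidates:", candidates, "remove:", remove)
```

In this sketch, the components in `candidates` are the ones that would be passed on to a forward inclusion step, while those in `include` enter the regression directly.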

Keywords

Data mining; multicollinearity; principal components regression; condition index; selection of principal components

Language

Korean
