Variable Selection in PLS Regression with Penalty Function

- Journal title : Communications for Statistical Applications and Methods
- Volume 15, Issue 4, 2008, pp.633-642
- Publisher : The Korean Statistical Society
- DOI : 10.5351/CKSS.2008.15.4.633

Title & Authors

Variable Selection in PLS Regression with Penalty Function

Park, Chong-Sun; Moon, Guy-Jong;

Abstract

We propose a variable selection algorithm for partial least squares (PLS) regression based on penalty functions. The usual PLS regression problem can be expressed as a maximization problem with appropriate constraints, and we add a penalty function to this maximization problem. A simulated annealing algorithm is then used to search for optimal solutions of the penalized maximization problem. Among the penalty functions considered, the HARD penalty function is suggested as the best in several respects. Illustrations with real and simulated examples are provided.
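The abstract gives no formulas, so the following Python sketch only illustrates the general idea under stated assumptions and is not the authors' implementation. It assumes the first PLS weight vector w maximizes the squared sample covariance between Xw and y subject to ||w|| = 1, takes the hard-thresholding penalty in the form given by Fan and Li (2001), and uses a simple single-coordinate proposal with geometric cooling. The function names, the proposal scheme, and all tuning values (lam, t0, cooling, step) are hypothetical choices made for illustration.

```python
import numpy as np


def hard_penalty(w, lam):
    """Hard-thresholding penalty (form of Fan and Li, 2001), summed over coordinates.

    Each coordinate contributes 0 when it is exactly zero and lam**2 when |w_j| >= lam.
    """
    a = np.abs(w)
    return np.sum(lam**2 - (a - lam) ** 2 * (a < lam))


def objective(w, X, y, lam):
    """Squared sample covariance between the score Xw and y, minus the penalty."""
    cov = (X @ w) @ y / len(y)
    return cov**2 - hard_penalty(w, lam)


def anneal_pls_weight(X, y, lam, n_iter=5000, t0=1.0, cooling=0.998, step=0.2, seed=0):
    """Search for a sparse unit-norm weight vector by simulated annealing.

    Assumption: columns of X and y are standardized here so that lam has a
    comparable meaning across variables (no column may be constant).
    """
    rng = np.random.default_rng(seed)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    y = (y - y.mean()) / y.std()

    # Start from the unpenalized first PLS direction, proportional to X'y.
    w = X.T @ y
    w = w / np.linalg.norm(w)
    f = objective(w, X, y, lam)
    best_w, best_f = w.copy(), f

    t = t0
    for _ in range(n_iter):
        # Hypothetical proposal: change one coordinate, either zeroing it
        # (so exact sparsity is reachable) or perturbing it with Gaussian noise.
        cand = w.copy()
        j = rng.integers(w.size)
        if rng.random() < 0.5:
            cand[j] = 0.0
        else:
            cand[j] += step * rng.standard_normal()
        norm = np.linalg.norm(cand)
        if norm > 0:
            cand = cand / norm  # project back onto the unit sphere
            f_cand = objective(cand, X, y, lam)
            delta = f_cand - f
            # Metropolis rule for a maximization problem.
            if delta >= 0 or rng.random() < np.exp(delta / t):
                w, f = cand, f_cand
                if f > best_f:
                    best_w, best_f = w.copy(), f
        t *= cooling  # geometric cooling schedule
    return best_w, best_f


if __name__ == "__main__":
    # Simulated data: only the first two of six predictors drive the response.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 6))
    y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.standard_normal(100)
    w, f = anneal_pls_weight(X, y, lam=0.1)
    print("selected variables:", np.flatnonzero(w != 0))
    print("penalized objective:", f)
```

Variables whose weights end up exactly zero are treated as dropped. Because the search is stochastic, the selected set depends on the seed, the cooling schedule, and lam; in practice lam would need to be tuned, for example by cross-validation.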

Keywords

PLS regression; penalty function; variable selection; simulated annealing

Language

Korean
