Logistic Regression Classification by Principal Component Selection

  • Kim, Kiho (Department of Statistics, Hankuk University of Foreign Studies)
  • Lee, Seokho (Department of Statistics, Hankuk University of Foreign Studies)
  • Received : 2013.10.28
  • Accepted : 2013.12.31
  • Published : 2014.01.31

Abstract

We propose binary classification methods that modify logistic regression classification. Instead of selecting among the original variables, we apply variable selection procedures to the principal components. We describe the resulting classifiers and discuss their properties. The performance of our proposals is illustrated numerically and compared with existing classification methods using synthetic and real datasets.
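
The idea in the abstract can be sketched as follows: compute principal component scores, then fit a logistic regression on those scores with a selection penalty so that only some components are retained. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say which selection procedures are used, so the L1 (lasso) penalty, the proximal gradient solver, the helper names `pca_scores` and `fit_l1_logistic`, the tuning value `lam`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def pca_scores(X, n_components):
    """Center X and project it onto its leading principal directions."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def fit_l1_logistic(Z, y, lam=0.1, n_iter=2000):
    """L1-penalized logistic regression on Z by proximal gradient descent."""
    n, k = Z.shape
    beta, intercept = np.zeros(k), 0.0
    # Step 1/L, with L an upper bound on the curvature of the mean log-loss.
    step = 1.0 / (0.25 * (np.linalg.norm(Z, 2) ** 2 / n + 1.0))
    for _ in range(n_iter):
        logits = np.clip(intercept + Z @ beta, -30.0, 30.0)
        resid = 1.0 / (1.0 + np.exp(-logits)) - y
        intercept -= step * resid.mean()
        b = beta - step * (Z.T @ resid) / n
        # Soft-thresholding sets weakly supported coefficients exactly to zero.
        beta = np.sign(b) * np.maximum(np.abs(b) - step * lam, 0.0)
    return intercept, beta


# Synthetic data (hypothetical): the class shift lives in feature 2, which
# carries less variance than features 0 and 1.
rng = np.random.default_rng(0)
n = 200
y = np.arange(n) % 2
X = rng.normal(size=(n, 6)) * np.array([5.0, 3.0, 1.0, 1.0, 1.0, 1.0])
X[:, 2] += 4.0 * y

Z = pca_scores(X, n_components=5)
intercept, beta = fit_l1_logistic(Z, y)
selected = np.flatnonzero(beta)  # indices of the retained components
accuracy = np.mean((intercept + Z @ beta > 0) == (y == 1))
print("selected components:", selected, "training accuracy:", accuracy)
```

In this constructed example the class difference is placed in a direction that is not among the top two by variance, so keeping only the highest-variance components would discard most of the signal. Selecting components by their relevance to the response, as the abstract proposes, is meant to avoid that.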

Cited by

  1. Principal Component Regression by Principal Component Selection, vol. 22, no. 2, 2015, https://doi.org/10.5351/CSAM.2015.22.2.173