Comparison of Lasso Type Estimators for High-Dimensional Data

  • Kim, Jaehee (Department of Statistics, Duksung Women's University)
  • Received : 2014.06.06
  • Accepted : 2014.07.14
  • Published : 2014.07.31

Abstract

This paper compares lasso-type estimators in various high-dimensional data situations with sparse parameters. The lasso, adaptive lasso, fused lasso and elastic net, together with the ridge estimator, are compared via simulation in linear models with correlated and uncorrelated covariates and in binary regression models with correlated and discrete covariates. Each method is shown to have advantages under different penalty conditions, depending on the sparsity pattern of the regression parameters. We apply the lasso-type methods to Arabidopsis microarray gene expression data to identify strongly significant genes that distinguish two groups.
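
For context, all of the estimators compared above minimize a penalized least-squares (or penalized likelihood) criterion; only the penalty differs. A standard formulation, not quoted from the paper, is:

```latex
\hat{\beta} = \arg\min_{\beta} \; \tfrac{1}{2}\|y - X\beta\|_2^2 + P_{\lambda}(\beta), \qquad
P_{\lambda}(\beta) =
\begin{cases}
\lambda \sum_j |\beta_j| & \text{(lasso)} \\
\lambda \sum_j \beta_j^2 & \text{(ridge)} \\
\lambda \sum_j w_j |\beta_j| & \text{(adaptive lasso, data-driven weights } w_j\text{)} \\
\lambda_1 \sum_j |\beta_j| + \lambda_2 \sum_j \beta_j^2 & \text{(elastic net)} \\
\lambda_1 \sum_j |\beta_j| + \lambda_2 \sum_{j \ge 2} |\beta_j - \beta_{j-1}| & \text{(fused lasso)}
\end{cases}
```

The sketch below illustrates the kind of comparison the abstract describes: simulate a high-dimensional linear model with correlated covariates and a sparse coefficient vector, then fit ridge, lasso, and elastic net. It is a minimal illustration assuming a Python/scikit-learn workflow, not the simulation design or software used in the paper; the sample size, correlation structure, and tuning parameters are placeholders.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

rng = np.random.default_rng(0)
n, p, rho = 100, 200, 0.5                      # n < p: high-dimensional setting

# Correlated covariates: AR(1)-type covariance, Corr(X_j, X_k) = rho^|j-k|
cov = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = rng.multivariate_normal(np.zeros(p), cov, size=n)

# Sparse truth: only the first 10 of 200 coefficients are nonzero
beta = np.zeros(p)
beta[:10] = 2.0
y = X @ beta + rng.normal(size=n)

models = {
    "ridge":       Ridge(alpha=1.0),
    "lasso":       Lasso(alpha=0.1),
    "elastic net": ElasticNet(alpha=0.1, l1_ratio=0.5),
}
for name, model in models.items():
    model.fit(X, y)
    est_error = np.sum((model.coef_ - beta) ** 2)       # squared estimation error
    n_selected = np.sum(np.abs(model.coef_) > 1e-8)     # number of nonzero estimates
    print(f"{name:12s}  error = {est_error:7.3f}  selected = {n_selected}")
```

The adaptive lasso can be approximated in the same framework by rescaling the columns of X with weights from an initial fit, and the fused lasso requires a solver for the difference penalty (for example, a generalized-lasso routine); both are omitted here to keep the sketch short.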
