
High-dimensional linear discriminant analysis with moderately clipped LASSO

  • Chang, Jaeho (Department of Applied Statistics, Konkuk University) ;
  • Moon, Haeseong (Department of Applied Statistics, Konkuk University) ;
  • Kwon, Sunghoon (Department of Applied Statistics, Konkuk University)
  • Received : 2020.08.18
  • Accepted : 2020.11.19
  • Published : 2021.01.31

Abstract

There is a direct connection between linear discriminant analysis (LDA) and linear regression, since the direction vector of the LDA can be obtained by least squares estimation. This connection motivates penalized LDA when the model is high-dimensional, that is, when the number of predictor variables exceeds the sample size. In this paper, we study penalized LDA for a class of penalties called the moderately clipped LASSO (MCL), which interpolates between the least absolute shrinkage and selection operator (LASSO) and the minimax concave penalty (MCP). We prove that the MCL-penalized LDA correctly identifies the sparsity of the Bayes direction vector with probability tending to one, and our numerical studies show that it attains better finite sample performance than the LASSO.
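
To make the two ingredients of the abstract concrete, two illustrative sketches follow; both use our own notation and are not taken from the paper. First, the regression connection: for two classes, the least squares slope vector obtained by regressing a class coding y on the predictors X is proportional to the classical LDA direction computed from the pooled within-class covariance (Fisher, 1936; Hastie et al., 2009). A minimal Python check, assuming nothing beyond NumPy:

import numpy as np

# Illustrative sketch (not the authors' code): verify that the least squares
# slope vector is proportional to the classical LDA direction.
rng = np.random.default_rng(0)
n, p = 200, 5
y = rng.choice([1.0, -1.0], size=n)                           # two-class coding
X = rng.normal(size=(n, p)) + 0.5 * np.outer(y, np.ones(p))   # shifted class means

# Least squares fit with an intercept; keep only the slope vector.
Z = np.column_stack([np.ones(n), X])
beta = np.linalg.lstsq(Z, y, rcond=None)[0][1:]

# Classical LDA direction: pooled within-class covariance inverse applied to
# the difference of the class mean vectors.
X1, X2 = X[y == 1.0], X[y == -1.0]
S = ((len(X1) - 1) * np.cov(X1, rowvar=False)
     + (len(X2) - 1) * np.cov(X2, rowvar=False)) / (n - 2)
lda_dir = np.linalg.solve(S, X1.mean(axis=0) - X2.mean(axis=0))

print(beta / lda_dir)  # the ratio is (numerically) constant across coordinates

Second, the interpolation property. In our reading of Kwon et al. (2015) (see reference 23 for the precise definition), the MCL penalty with regularization parameters λ ≥ γ ≥ 0 and concavity parameter a > 1 can be written as

J_{\lambda,\gamma}(|t|) =
\begin{cases}
\lambda |t| - t^2/(2a), & |t| \le a(\lambda - \gamma), \\
\gamma |t| + a(\lambda - \gamma)^2/2, & |t| > a(\lambda - \gamma),
\end{cases}

so that γ = λ recovers the LASSO penalty λ|t| and γ = 0 recovers the MCP; this is the sense in which the MCL interpolates between the two penalties.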

References

  1. Bickel PJ and Levina E (2004). Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations, Bernoulli, 10, 989-1010.
  2. Burczynski ME, Peterson RL, Twine NC, et al. (2006). Molecular classification of Crohn's disease and ulcerative colitis patients using transcriptional profiles in peripheral blood mononuclear cells, The Journal of Molecular Diagnostics, 8, 51-61. https://doi.org/10.2353/jmoldx.2006.050079
  3. Cai T and Liu W (2011). A direct estimation approach to sparse linear discriminant analysis, Journal of the American Statistical Association, 106, 1566-1577. https://doi.org/10.1198/jasa.2011.tm11199
  4. Casella G (1985). An introduction to empirical Bayes data analysis, The American Statistician, 39, 83-87. https://doi.org/10.2307/2682801
  5. Chin K, DeVries S, Fridlyand J, et al. (2006). Genomic and transcriptional aberrations linked to breast cancer pathophysiologies, Cancer Cell, 10, 529-541. https://doi.org/10.1016/j.ccr.2006.10.009
  6. Chowdary D, Lathrop J, Skelton J, et al. (2006). Prognostic gene expression signatures can be measured in tissues collected in RNAlater preservative, The Journal of Molecular Diagnostics, 8, 31-39. https://doi.org/10.2353/jmoldx.2006.050056
  7. Clemmensen L, Hastie T, Witten D, and Ersboll B (2011). Sparse discriminant analysis, Technometrics, 53, 406-413. https://doi.org/10.1198/TECH.2011.08118
  8. Efron B and Morris C (1975). Data analysis using Stein's estimator and its generalizations, Journal of the American Statistical Association, 70, 311-319. https://doi.org/10.1080/01621459.1975.10479864
  9. Fan J and Fan Y (2008). High dimensional classification using features annealed independence rules, The Annals of Statistics, 36, 2605-2637. https://doi.org/10.1214/07-AOS504
  10. Fan J and Li R (2001). Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association, 96, 1348-1360. https://doi.org/10.1198/016214501753382273
  11. Fan J and Song R (2010). Sure independence screening in generalized linear models with NP-dimensionality, The Annals of Statistics, 38, 3567-3604. https://doi.org/10.1214/10-AOS798
  12. Fan J, Xue L, and Zou H (2014). Strong oracle optimality of folded concave penalized estimation, The Annals of Statistics, 42, 819-849. https://doi.org/10.1214/13-AOS1198
  13. Fisher RA (1936). The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7, 179-188. https://doi.org/10.1111/j.1469-1809.1936.tb02137.x
  14. Gordon GJ, Jensen RV, Hsiao LL, et al. (2002). Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma, Cancer Research, 62, 4963-4967.
  15. Guo Y, Hastie T, and Tibshirani R (2006). Regularized linear discriminant analysis and its application in microarrays, Biostatistics, 8, 86-100. https://doi.org/10.1093/biostatistics/kxj035
  16. Hastie T, Tibshirani R, and Friedman J (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer Science & Business Media.
  17. Ramey JA (2016). datamicroarray: Collection of Data Sets for Classification. Available from: https://github.com/ramhiser/datamicroarray.
  18. Kim D, Lee S, and Kwon S (2020). A unified algorithm for the non-convex penalized estimation: The ncpen package, The R Journal, Accepted.
  19. Kim Y, Choi H, and Oh HS (2008). Smoothly clipped absolute deviation on high dimensions, Journal of the American Statistical Association, 103, 1665-1673. https://doi.org/10.1198/016214508000001066
  20. Kim Y, Jeon JJ, and Han S (2016). A necessary condition for the strong oracle property, Scandinavian Journal of Statistics, 43, 610-624. https://doi.org/10.1111/sjos.12195
  21. Kim Y and Kwon S (2012). Global optimality of nonconvex penalized estimators, Biometrika, 99, 315-325. https://doi.org/10.1093/biomet/asr084
  22. Krzanowski W, Jonathan P, McCarthy W, and Thomas M (1995). Discriminant analysis with singular covariance matrices: methods and applications to spectroscopic data, Journal of the Royal Statistical Society: Series C (Applied Statistics), 44, 101-115.
  23. Kwon S, Lee S, and Kim Y (2015). Moderately clipped lasso, Computational Statistics & Data Analysis, 92, 53-67. https://doi.org/10.1016/j.csda.2015.07.001
  24. Mai Q, Zou H, and Yuan M (2012). A direct approach to sparse discriminant analysis in ultra-high dimensions, Biometrika, 99, 29-42. https://doi.org/10.1093/biomet/asr066
  25. Mazumder R, Friedman JH, and Hastie T (2011). SparseNet: Coordinate descent with nonconvex penalties, Journal of the American Statistical Association, 106, 1125-1138. https://doi.org/10.1198/jasa.2011.tm09738
  26. Shao J, Wang Y, Deng X, and Wang S (2011). Sparse linear discriminant analysis by thresholding for high dimensional data, The Annals of Statistics, 39, 1241-1265. https://doi.org/10.1214/10-AOS870
  27. Tibshirani R (1996). Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B (Methodological), 58, 267-288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
  28. Witten DM and Tibshirani R (2011). Penalized classification using Fisher's linear discriminant, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73, 753-772. https://doi.org/10.1111/j.1467-9868.2011.00783.x
  29. Wu MC, Zhang L, Wang Z, Christiani DC, and Lin X (2009). Sparse linear discriminant analysis for simultaneous testing for the significance of a gene set/pathway and gene selection, Bioinformatics, 25, 1145-1151. https://doi.org/10.1093/bioinformatics/btp019
  30. Yuille AL and Rangarajan A (2002). The concave-convex procedure (CCCP), Advances in Neural Information Processing Systems, 1033-1040.
  31. Zhang CH (2010). Nearly unbiased variable selection under minimax concave penalty, The Annals of Statistics, 38, 894-942. https://doi.org/10.1214/09-AOS729
  32. Zhang CH and Huang J (2008). The sparsity and bias of the lasso selection in high-dimensional linear regression, The Annals of Statistics, 36, 1567-1594. https://doi.org/10.1214/07-AOS520
  33. Zhao P and Yu B (2006). On model selection consistency of lasso, Journal of Machine Learning Research, 7, 2541-2563.
  34. Zou H (2006). The adaptive lasso and its oracle properties, Journal of the American Statistical Association, 101, 1418-1429. https://doi.org/10.1198/016214506000000735