MP-Lasso chart: a multi-level polar chart for visualizing group Lasso analysis of genomic data

  • Min Song (Department of Statistics, Korea University)
  • Minhyuk Lee (Department of Statistics, Korea University)
  • Taesung Park (Department of Statistics, Seoul National University)
  • Mira Park (Department of Preventive Medicine, Eulji University)
  • Submitted : 2022.12.13
  • Reviewed : 2022.12.19
  • Published : 2022.12.31

Abstract

Penalized regression has been widely used in genome-wide association studies for joint analyses to find genetic associations. Among penalized regression models, the least absolute shrinkage and selection operator (Lasso) method effectively removes some coefficients from the model by shrinking them to zero. To handle group structures, such as genes and pathways, several modified Lasso penalties have been proposed, including group Lasso and sparse group Lasso. Group Lasso enforces sparsity at the level of pre-defined groups, eliminating unimportant groups. Sparse group Lasso performs group selection as in group Lasso and also selects individual variables within groups as in Lasso. While these sparse methods are useful in high-dimensional genetic studies, interpreting results that involve many groups and coefficients is not straightforward. Lasso results are often displayed as trace plots of regression coefficients. However, few studies have explored the systematic visualization of group information. In this study, we propose a multi-level polar Lasso (MP-Lasso) chart, which can effectively represent the results of group Lasso and sparse group Lasso analyses. An R package for drawing MP-Lasso charts was developed. Through an application to real-world genetic data, we demonstrate that the MP-Lasso chart package effectively visualizes the results of Lasso, group Lasso, and sparse group Lasso.
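
The difference between the three penalties described above can be illustrated with a minimal sketch. It is not taken from the MP-Lasso package. It applies the standard shrinkage (proximal) operators of Lasso, group Lasso, and sparse group Lasso to a toy coefficient vector, which corresponds to one shrinkage step rather than a full model fit. The coefficient values, the group names `geneA` and `geneB`, the tuning values `lam` and `alpha`, and all function names are assumptions chosen for illustration.

```python
import math


def soft_threshold(x, t):
    """Shrink x toward zero by t; values with |x| <= t become exactly 0."""
    if abs(x) <= t:
        return 0.0
    return math.copysign(abs(x) - t, x)


def prox_lasso(beta, lam):
    """Lasso shrinkage: each coefficient is thresholded on its own."""
    return [soft_threshold(b, lam) for b in beta]


def prox_group_lasso(beta, groups, lam):
    """Group Lasso shrinkage: each group is scaled as a block.

    A group whose Euclidean norm falls below lam * sqrt(group size)
    is set to zero entirely; otherwise all its members stay nonzero.
    """
    out = [0.0] * len(beta)
    for idx in groups.values():
        norm = math.sqrt(sum(beta[i] ** 2 for i in idx))
        thresh = lam * math.sqrt(len(idx))
        if norm > thresh:
            scale = 1.0 - thresh / norm
            for i in idx:
                out[i] = scale * beta[i]
    return out


def prox_sparse_group_lasso(beta, groups, lam, alpha):
    """Sparse group Lasso shrinkage: elementwise step, then group step.

    alpha in [0, 1] mixes the two penalties (alpha = 1 is plain Lasso,
    alpha = 0 is plain group Lasso).
    """
    shrunk = prox_lasso(beta, alpha * lam)
    return prox_group_lasso(shrunk, groups, (1.0 - alpha) * lam)


# Hypothetical coefficients for six variants in two gene-level groups.
beta = [3.0, 0.2, -2.5, 0.3, -0.2, 0.1]
groups = {"geneA": [0, 1, 2], "geneB": [3, 4, 5]}

lasso_fit = prox_lasso(beta, lam=0.5)
group_fit = prox_group_lasso(beta, groups, lam=0.5)
sparse_group_fit = prox_sparse_group_lasso(beta, groups, lam=0.5, alpha=0.5)

for name, fit in [("Lasso", lasso_fit),
                  ("group Lasso", group_fit),
                  ("sparse group Lasso", sparse_group_fit)]:
    print(name, [round(v, 3) for v in fit])
```

With these toy values, Lasso zeroes small coefficients regardless of group, group Lasso removes `geneB` as a whole while keeping every member of `geneA`, and sparse group Lasso removes `geneB` and also zeroes the weak coefficient inside `geneA`. This group-level and within-group structure is the kind of result a multi-level chart needs to convey.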

Funding

This research was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2021R1A2C1007788).
