Introduction to variational Bayes for high-dimensional linear and logistic regression models

  • Received : 2022.01.29
  • Accepted : 2022.03.09
  • Published : 2022.06.30

Abstract

In this paper, we introduce existing Bayesian methods for high-dimensional sparse regression models and compare their performance in various simulation scenarios. In particular, we focus on the variational Bayes approach proposed by Ray and Szabó (2021), which enables scalable and accurate Bayesian inference. Using data sets simulated from sparse high-dimensional linear regression models, we compare the variational Bayes approach with other Bayesian and frequentist methods. To assess the practical performance of variational Bayes in logistic regression models, we conduct a real data analysis using a leukemia gene expression data set.
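
For orientation, the setting described above can be written out as follows. This is a minimal sketch in notation assumed here, not taken from the abstract: the symbols $\theta$, $w$, $G$, $\gamma_i$, $\mu_i$, $\tau_i^2$ and $\tilde{Q}$ are illustrative, and the slab distribution $G$ is left unspecified (Laplace and Gaussian slabs are both common in this literature). The block assumes the `amsmath` package.

```latex
% Sparse high-dimensional linear regression (notation assumed for this sketch)
\[
  Y = X\theta + \varepsilon, \qquad
  X \in \mathbb{R}^{n \times p}, \quad
  \varepsilon \sim N_n(0, \sigma^2 I_n), \quad
  \#\{i : \theta_i \neq 0\} = s \ll p .
\]
% Spike-and-slab prior: each coefficient is exactly zero with probability 1 - w
\[
  \theta_i \overset{\text{iid}}{\sim} w\, G + (1 - w)\, \delta_0,
  \qquad i = 1, \dots, p .
\]
% Mean-field spike-and-slab variational family and the variational Bayes posterior
\[
  \mathcal{Q} = \Bigl\{ \textstyle\bigotimes_{i=1}^{p}
    \bigl[ \gamma_i\, N(\mu_i, \tau_i^2) + (1 - \gamma_i)\, \delta_0 \bigr]
    : \gamma_i \in [0, 1],\ \mu_i \in \mathbb{R},\ \tau_i^2 > 0 \Bigr\},
  \qquad
  \tilde{Q} = \operatorname*{arg\,min}_{Q \in \mathcal{Q}}
    \mathrm{KL}\bigl( Q \,\|\, \Pi(\cdot \mid Y) \bigr).
\]
```

Under this sketch, $\gamma_i$ plays the role of an approximate posterior inclusion probability for the $i$-th covariate. Fitting the approximation means optimizing over the $3p$ parameters $(\gamma_i, \mu_i, \tau_i^2)$ rather than sampling over the $2^p$ possible sparsity patterns, which is the usual source of the scalability attributed to variational Bayes.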

References

  1. Carvalho C, Polson N, and Scott J (2010). The horseshoe estimator for sparse signals, Biometrika, 97, 465-480. https://doi.org/10.1093/biomet/asq017
  2. George E and McCulloch R (1993). Variable selection via Gibbs sampling, Journal of the American Statistical Association, 88, 881-889. https://doi.org/10.1080/01621459.1993.10476353
  3. Golub T, Slonim D, Tamayo P, Huard C, Gaasenbeek M, Mesirov J, Coller H, Loh M, Downing J, Caligiuri M, Bloomfield C, and Lander E (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science, 286, 531-537. https://doi.org/10.1126/science.286.5439.531
  4. Ishwaran H and Rao J (2005). Spike and slab variable selection: frequentist and Bayesian strategies, The Annals of Statistics, 33, 730-773. https://doi.org/10.1214/009053604000001147
  5. Li L and Yao W (2018). Fully Bayesian logistic regression with hyper-LASSO priors for high-dimensional feature selection, Journal of Statistical Computation and Simulation, 88, 2827-2851. https://doi.org/10.1080/00949655.2018.1490418
  6. Makalic E and Schmidt D (2016). A simple sampler for the horseshoe estimator, IEEE Signal Processing Letters, 23, 179-182. https://doi.org/10.1109/LSP.2015.2503725
  7. McDermott P, Snyder J, and Willison R (2016). Methods for Bayesian variable selection with binary response data using the EM algorithm, arXiv:1605.05429.
  8. Ray K, Szabó B, and Clara G (2020). Spike and slab variational Bayes for high dimensional logistic regression, Advances in Neural Information Processing Systems, 33, 14423-14434.
  9. Ray K and Szabó B (2021). Variational Bayes for high-dimensional linear regression with sparse priors, Journal of the American Statistical Association, To appear, 1-12.