BCDR algorithm for network estimation based on pseudo-likelihood with parallelization using GPU

Kim, Byungsoo; Yu, Donghyeon

doi:10.7465/jkdi.2016.27.2.381

Journal of the Korean Data and Information Science Society

Volume 27 Issue 2 / Pages 381-394 / 2016 / 1598-9402 (pISSN)

The Korean Data and Information Science Society (한국데이터정보과학회)


유사가능도 기반의 네트워크 추정 모형에 대한 GPU 병렬화 BCDR 알고리즘

Kim, Byungsoo (Department of Statistics, Yeungnam University) ;
Yu, Donghyeon (Department of Statistics, Keimyung University)

김병수 (영남대학교 통계학과) ;
유동현 (계명대학교 통계학과)

Received : 2016.02.25
Accepted : 2016.03.23
Published : 2016.03.31

https://doi.org/10.7465/jkdi.2016.27.2.381


Abstract

A graphical model represents conditional dependencies between variables as a graph with nodes and edges. Graphical models are widely used in fields including physics, economics, and biology to describe complex associations. Conditional dependencies can be estimated from an inverse covariance matrix, in which a zero off-diagonal element denotes conditional independence of the corresponding pair of variables. This paper proposes an efficient BCDR (block coordinate descent with random permutation) algorithm for CONCORD (convex correlation selection method), which estimates an inverse covariance matrix based on pseudo-likelihood. BCDR builds on the existing BCD (block coordinate descent) algorithm by adding random permutation and parallel computation on graphics processing units. We conduct numerical studies on two network structures to demonstrate the efficiency of the proposed algorithm for CONCORD in terms of computation time.
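The link between the inverse covariance matrix and conditional independence can be checked numerically. The sketch below is only an illustration and is not taken from the paper: the 3 x 3 matrix and the helper names `invert` and `partial_correlations` are assumptions chosen for this example. It uses the standard identity that the partial correlation between variables i and j given the rest equals -w_ij / sqrt(w_ii * w_jj), where w_ij are the entries of the inverse covariance matrix.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]


def partial_correlations(precision):
    """Partial correlation of each pair given all other variables."""
    p = len(precision)
    return [[1.0 if i == j
             else -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
             for j in range(p)] for i in range(p)]


# Hypothetical chain-structured inverse covariance: variables 0 and 2 are
# linked only through variable 1, so the (0, 2) entry is zero.
precision = [[2.0, -1.0, 0.0],
             [-1.0, 2.0, -1.0],
             [0.0, -1.0, 2.0]]

covariance = invert(precision)
partial = partial_correlations(invert(covariance))

print("cov(0, 2)          =", covariance[0][2])  # 0.25 up to rounding
print("partial corr(0, 2) =", partial[0][2])     # 0 up to rounding
```

Variables 0 and 2 are marginally correlated (nonzero covariance), yet their partial correlation vanishes because the corresponding entry of the inverse covariance matrix is zero. This is why sparsity is sought in the inverse covariance matrix rather than in the covariance matrix itself.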

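A graphical model represents the conditional dependencies among variables as a graph of nodes and edges. Graphical models are applied in many fields, including physics, economics, and biology, to express complex associations among variables. Conditional dependencies can be estimated from the inverse covariance matrix, because a zero off-diagonal element of the inverse covariance matrix is equivalent to conditional independence of the corresponding two variables. For the pseudo-likelihood-based CONCORD (convex correlation selection method), which produces a sparse estimate of the inverse covariance matrix, this paper proposes the BCDR (block coordinate descent with random permutation) algorithm. BCDR extends the existing BCD (block coordinate descent) algorithm with an update rule based on random permutation and with parallel computation on graphics processing units, which makes it more efficient for high-dimensional data. In simulations with two types of network structure, the efficiency of the proposed algorithm was confirmed by comparing computation times until convergence.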
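The abstract does not give the update formulas, the block structure, or the GPU kernels, so the following is only a minimal CPU sketch of the general idea of coordinate descent with a randomly permuted update order. It assumes the CONCORD criterion takes the form -sum_i log(w_ii) + (1/2) tr(W S W) + lam * sum_{i<j} |w_ij| for a sample covariance matrix S, and the coordinate-wise minimizers are derived from that assumed form. It updates single coordinates rather than blocks, and the names `soft_threshold`, `objective`, `sweep`, and `concord_cd_random` are hypothetical.

```python
import math
import random


def soft_threshold(x, lam):
    """Soft-thresholding operator sign(x) * max(|x| - lam, 0)."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def objective(omega, S, lam):
    """Assumed criterion: -sum log w_ii + 0.5 tr(W S W) + lam * sum_{i<j} |w_ij|."""
    p = len(S)
    value = -sum(math.log(omega[i][i]) for i in range(p))
    for i in range(p):
        # 0.5 * (row i of W)^T S (row i of W)
        value += 0.5 * sum(omega[i][k] * S[k][l] * omega[i][l]
                           for k in range(p) for l in range(p))
    value += lam * sum(abs(omega[i][j]) for i in range(p) for j in range(i + 1, p))
    return value


def sweep(omega, S, lam, rng):
    """Update every coordinate once, in a freshly shuffled order (in place)."""
    p = len(S)
    coords = [(i, j) for i in range(p) for j in range(i, p)]
    rng.shuffle(coords)  # random permutation of the update order
    for i, j in coords:
        if i == j:
            # Positive root of s_ii * w^2 + b * w - 1 = 0.
            b = sum(omega[i][k] * S[i][k] for k in range(p) if k != i)
            omega[i][i] = (-b + math.sqrt(b * b + 4.0 * S[i][i])) / (2.0 * S[i][i])
        else:
            a = (sum(omega[i][k] * S[j][k] for k in range(p) if k != j)
                 + sum(omega[j][k] * S[i][k] for k in range(p) if k != i))
            new = soft_threshold(-a, lam) / (S[i][i] + S[j][j])
            omega[i][j] = omega[j][i] = new  # keep W symmetric


def concord_cd_random(S, lam, n_sweeps=50, seed=0):
    """Run n_sweeps randomized sweeps starting from the identity matrix."""
    p = len(S)
    rng = random.Random(seed)
    omega = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    history = [objective(omega, S, lam)]
    for _ in range(n_sweeps):
        sweep(omega, S, lam, rng)
        history.append(objective(omega, S, lam))
    return omega, history


# Hypothetical 3 x 3 covariance matrix used only for illustration.
S = [[1.0, 0.5, 0.25],
     [0.5, 1.0, 0.5],
     [0.25, 0.5, 1.0]]
omega_hat, history = concord_cd_random(S, lam=0.1)
print("objective before and after:", history[0], history[-1])
```

Because each coordinate update minimizes the assumed criterion exactly along that coordinate, the objective cannot increase from one sweep to the next, whatever order the shuffle produces. The inner sums are dot products over rows of the matrices, which is the kind of work a GPU implementation would plausibly parallelize; how the paper actually does this is not stated in the abstract.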

Keywords

