ppcor: An R Package for a Fast Calculation to Semi-partial Correlation Coefficients

- Journal title : Communications for Statistical Applications and Methods
- Volume 22, Issue 6, 2015, pp.665-674
- Publisher : The Korean Statistical Society
- DOI : 10.5351/CSAM.2015.22.6.665

Author

Kim, Seongho

Abstract

The lack of a general matrix formula hampers the computation of higher-order semi-partial correlation (also known as part correlation) coefficients, because the recursive formula requires an enormous number of recursive calculations to obtain them. To resolve this difficulty, we derive a general matrix formula for the semi-partial correlation that allows fast computation. The semi-partial correlation is implemented in the R package ppcor, along with the partial correlation. Owing to the general matrix formulas, users can readily calculate both partial and semi-partial correlation coefficients without computational burden. The package ppcor also reports the statistical significance of each coefficient, along with its test statistic.
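To illustrate why a matrix formula avoids the recursion, the following is a minimal sketch in Python (not R, and not the package's own code). It assumes the standard identities that express both coefficients through the inverse of the correlation matrix; the exact form derived in the paper may be written differently. The function names `invert`, `partial_corr`, and `semipartial_corr` are hypothetical, and the convention that entry [i][j] residualizes variable j on the remaining variables is an assumption of this sketch.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity on the right-hand side.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_corr(corr):
    """Partial correlation of each pair given all remaining variables.

    Uses pr_ij = -p_ij / sqrt(p_ii * p_jj), where p is the inverse of corr.
    """
    prec = invert(corr)
    n = len(corr)
    return [[1.0 if i == j
             else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]


def semipartial_corr(corr):
    """Semi-partial correlation matrix (asymmetric).

    Assumed convention: entry [i][j] correlates variable i with the part of
    variable j that is not explained by the remaining variables.
    """
    prec = invert(corr)
    pr = partial_corr(corr)
    n = len(corr)
    out = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # 1 / (p_ii - p_ij^2 / p_jj) is the variance of variable i
                # left unexplained by the variables other than i and j.
                resid = prec[i][i] - prec[i][j] ** 2 / prec[j][j]
                out[i][j] = pr[i][j] / math.sqrt(corr[i][i]) / math.sqrt(resid)
    return out


if __name__ == "__main__":
    # Illustrative correlation matrix for three variables (made-up values).
    R = [[1.0, 0.5, 0.3],
         [0.5, 1.0, 0.4],
         [0.3, 0.4, 1.0]]
    # Matrix-based value versus the textbook first-order formula.
    print(semipartial_corr(R)[0][1])
    print((0.5 - 0.3 * 0.4) / math.sqrt(1 - 0.4 ** 2))
```

A single matrix inversion yields every coefficient of the highest order at once, whereas the recursive route has to build each coefficient from lower-order ones. For three variables the two printed values should agree up to rounding, since the first-order formula is the special case with one controlling variable.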

Keywords

correlation; partial correlation; part correlation; ppcor; semi-partial correlation

Language

English
