A Report on the Inter-Gene Correlations in cDNA Microarray Data Sets

Title & Authors
Kim, Byung-Soo; Jang, Jee-Sun; Kim, Sang-Cheol; Lim, Jo-Han

Abstract
A series of recent papers reported that inter-gene correlations in Affymetrix microarray data sets are strong and long-ranged, and that the assumption of independence or weak dependence among gene expression signals, often adopted without justification, conflicts with actual data. Qiu et al. (2005b) showed that the nonparametric empirical Bayes method, which pools test statistics across genes for statistical inference, yields a large variance in the number of genes declared differentially expressed, and they attributed this effect to the strong and long-ranged inter-gene correlations. Klebanov and Yakovlev (2007) demonstrated that inter-gene correlations are a rich source of information rather than a nuisance in statistical analysis, and, by transforming the original gene expression sequence, they constructed a sequence of independent random variables, which they called a $\delta$-sequence. Using two cDNA microarray data sets generated in Korea, we show in this report that strong and long-ranged inter-gene correlations are also present in cDNA microarray data and that an independent $\delta$-sequence can likewise be derived from such data. We therefore suggest that inter-gene correlations be taken into account in future analyses of cDNA microarray data sets.
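The abstract does not spell out how a $\delta$-sequence is built, so the following is only a minimal sketch on simulated data, not the authors' procedure. It assumes that the sequence is formed from differences of log-expression levels of non-overlapping gene pairs, with genes ordered by sample variance before pairing; the ordering rule, the function names, and the simulation settings (a shared array-level effect added to every gene) are all illustrative assumptions.

```python
import numpy as np


def mean_abs_pairwise_corr(x):
    """Mean absolute Pearson correlation over all distinct pairs of rows of x."""
    c = np.corrcoef(x)
    upper = np.triu_indices_from(c, k=1)
    return float(np.mean(np.abs(c[upper])))


def delta_sequence(log_expr):
    """Differences of non-overlapping gene pairs (rows = genes, columns = arrays).

    Assumption: genes are ordered by sample variance before pairing.
    """
    order = np.argsort(log_expr.var(axis=1))
    x = log_expr[order]
    m = (x.shape[0] // 2) * 2          # drop the last gene if the count is odd
    return x[0:m:2] - x[1:m:2]


# Simulated log-expression matrix (hypothetical, not real microarray data):
# one array-level effect shared by all genes induces strong correlation.
rng = np.random.default_rng(0)
n_genes, n_arrays = 200, 500
array_effect = rng.normal(0.0, 1.0, size=n_arrays)
gene_noise = rng.normal(0.0, 0.5, size=(n_genes, n_arrays))
log_expr = array_effect + gene_noise   # broadcasts the shared effect to each gene

delta = delta_sequence(log_expr)
before = mean_abs_pairwise_corr(log_expr)
after = mean_abs_pairwise_corr(delta)
print(f"mean |corr| between genes:      {before:.3f}")
print(f"mean |corr| between delta terms: {after:.3f}")
```

In this toy setting the shared effect cancels within each pair, so the pairwise correlations among the differences drop to the level of sampling noise. Near-zero correlation is necessary but not sufficient for independence, so on real data a check of this kind would only be a first diagnostic.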
Keywords
cDNA microarray; nonparametric empirical Bayes method; correlation; independence; differential expression
Language
Korean
Cited by
1.
Effects of Danggui (當歸, Angelica root) on gene expression in ovarian tissue of rats with induced polycystic ovary syndrome, Ryu, Ki-Jun (류기준); Cho, Seong-Hee (조성희)
대한한방부인과학회지 (The Journal of Oriental Obstetrics and Gynecology), 2011, vol. 24, no. 3, pp. 28-47
2.
Identifying statistically significant gene sets based on differential expression and differential coexpression, Lee, Sunho (이선호)
응용통계연구 (Korean Journal of Applied Statistics), 2016, vol. 29, no. 3, pp. 437-448
References
1.
Efron, B. (2003). Robbins, empirical Bayes and microarrays, The Annals of Statistics, 31, 366-378

2.
Efron, B. (2004). Large-scale simultaneous hypothesis testing: The choice of a null hypothesis, Journal of the American Statistical Association, 99, 96-104

3.
Efron, B. (2007). Correlation and large-scale simultaneous significance testing, Journal of the American Statistical Association, 102, 93-103

4.
Efron, B., Tibshirani, R., Storey, J. D. and Tusher, V. (2001). Empirical Bayes analysis of a microarray experiment, Journal of the American Statistical Association, 96, 1151-1160

5.
Frantz, S. (2005). An array of problems, Nature Reviews Drug Discovery, 4, 302-303

6.
Kim, B. S., Kim, I., Lee, S., Kim, S., Rha, S. Y. and Chung, H. C. (2005). Statistical methods of translating microarray data into clinically relevant diagnostic information in colorectal cancer, Bioinformatics, 21, 517-528

7.
Klebanov, L., Jordan, C. and Yakovlev, A. (2006). A new type of stochastic dependence revealed in gene expression data, Statistical Applications in Genetics and Molecular Biology, 5, Article 7

8.
Klebanov, L. and Yakovlev, A. (2006). Treating expression levels of different genes as a sample in microarray data analysis: Is it worth a risk?, Statistical Applications in Genetics and Molecular Biology, 5, Article 9

9.
Klebanov, L. and Yakovlev, A. (2007). Diverse correlation structures in gene expression data and their utility in improving statistical inference, The Annals of Applied Statistics, 1, 538-559

10.
Marshall, E. (2004). Getting the noise out of gene arrays, Science, 306, 630-631

11.
Qiu, X., Brooks, A. I., Klebanov, L. and Yakovlev, A. (2005a). The effects of normalization on the correlation structure of microarray data, BMC Bioinformatics, 6, 120

12.
Qiu, X., Klebanov, L. and Yakovlev, A. (2005b). Correlation between gene expression levels and limitations of the empirical Bayes methodology for finding differentially expressed genes, Statistical Applications in Genetics and Molecular Biology, 4, Article 34

13.
Qiu, X., Xiao, Y., Gordon, A. and Yakovlev, A. (2006). Assessing stability of gene selection in microarray data analysis, BMC Bioinformatics, 7, 50

14.
Qiu, X. and Yakovlev, A. (2006). Some comments on instability of false discovery rate estimation, Journal of Bioinformatics and Computational Biology, 4, 1057-1068

15.
Stolovitzky, G. (2003). Gene selection in microarray data: The elephant, the blind men and our algorithm, Current Opinion in Structural Biology, 13, 370-376

16.
Yang, S., Jeung, H. C., Jeong, H. J., Choi, Y. H., Kim, J. E., Jung, J. J., Rha, S. Y., Yang, W. I. and Chung, H. C. (2007a). Identification of genes with correlated patterns of variations in DNA copy number and gene expression level in gastric cancer, Genomics, 89, 451-459

17.
Yang, S., Shin, J., Park, K. H., Jeung, H-C., Rha, S. Y., Noh, S. H., Yang, W. I. and Chung, H. C. (2007b). Molecular basis of the difference between normal and tumor tissues of gastric cancer, Biochimica et Biophysica Acta, 1772, 1033-1040