Statistical Method for Implementing the Experimenter Effect in the Analysis of Gene Expression Data

  • Kim, In-Young (Department of Epidemiology and Public Health, School of Medicine, Yale University) ;
  • Rha, Sun-Young (Brain Korea 21 Project for Medical Science, College of Medicine, Yonsei University) ;
  • Kim, Byung-Soo (Department of Applied Statistics, Yonsei University)
  • Published : 2006.12.31


In cancer microarray experiments, the experimenter, or the patient nested within each experimenter, often shows quite heterogeneous error variability, and this variability should be estimated in order to identify the sources of variation. Our study describes a Bayesian method that uses clinical information to identify a set of differentially expressed (DE) genes for the class of subtypes, and that also assesses the experimenter effect and the patient effect nested within each experimenter as sources of variation. We propose a Bayesian multilevel mixed effect model based on analysis of covariance (ANACOVA). The model combines the multilevel mixed effect model with the Bayesian hierarchical model, which provides a flexible way of defining a suitable correlation structure among genes.
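To make the nesting structure concrete, the following is a minimal sketch for a single gene. It is not the Bayesian multilevel model proposed in the paper: it substitutes the classical method-of-moments (ANOVA) estimator for a balanced design in which patients are nested within experimenters and each patient has replicate measurements. The function names, the balanced layout, and the variance values are all assumptions made for illustration.

```python
import numpy as np


def simulate_nested(n_exp, n_pat, n_rep, var_exp, var_pat, var_err, seed=0):
    """Simulate y[i, j, k] = a_i + b_{j(i)} + e_{ijk} for one gene.

    i indexes experimenters, j patients nested within experimenter i,
    k replicate measurements. All effects are independent zero-mean normals.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(0.0, np.sqrt(var_exp), size=(n_exp, 1, 1))
    b = rng.normal(0.0, np.sqrt(var_pat), size=(n_exp, n_pat, 1))
    e = rng.normal(0.0, np.sqrt(var_err), size=(n_exp, n_pat, n_rep))
    return a + b + e  # broadcasts to shape (n_exp, n_pat, n_rep)


def variance_components(y):
    """Method-of-moments variance components for a balanced nested design."""
    n_exp, n_pat, n_rep = y.shape
    pat_mean = y.mean(axis=2)          # mean per patient
    exp_mean = pat_mean.mean(axis=1)   # mean per experimenter
    grand_mean = exp_mean.mean()

    # Mean squares for error, patient-within-experimenter, and experimenter.
    ms_err = ((y - pat_mean[:, :, None]) ** 2).sum() / (n_exp * n_pat * (n_rep - 1))
    ms_pat = n_rep * ((pat_mean - exp_mean[:, None]) ** 2).sum() / (n_exp * (n_pat - 1))
    ms_exp = n_pat * n_rep * ((exp_mean - grand_mean) ** 2).sum() / (n_exp - 1)

    # Solve the expected-mean-square equations for the variance components.
    return {
        "experimenter": (ms_exp - ms_pat) / (n_pat * n_rep),
        "patient": (ms_pat - ms_err) / n_rep,
        "error": ms_err,
    }


if __name__ == "__main__":
    y = simulate_nested(200, 10, 4, var_exp=1.0, var_pat=0.5, var_err=0.25)
    print(variance_components(y))
```

The simulated design uses many more experimenters than a real microarray study would have, purely so that the moment estimates are stable. With only a handful of experimenters the experimenter component is poorly determined, which is one reason a hierarchical Bayesian treatment that pools information across genes can be attractive.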

