A note on Box-Cox transformation and application in microarray data

  • Rahman, Mezbahur (Department of Mathematics and Statistics, Minnesota State University)
  • Lee, Nam-Yong (Department of Mathematics and Statistics, Minnesota State University)
  • Received : 2011.04.24
  • Accepted : 2011.08.21
  • Published : 2011.10.01

Abstract

The Box-Cox transformation is a well-known family of power transformations that brings a set of data into agreement with the normality assumption on the residuals, and hence on the response variable, of a postulated regression model. Normalization (studentization) of the regressors is common practice in the analysis of microarray data. Here, we apply the Box-Cox transformation to normalize the regressors in microarray data. The predictability of the model can be improved by using the data transformation rather than studentization.
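To make the two normalizations concrete, the following is a minimal sketch in Python. It is not the authors' procedure. The abstract does not say how the transformation parameter is estimated, so the sketch assumes a grid search over the normal profile log-likelihood. The function names (`box_cox`, `profile_log_likelihood`, `estimate_lambda`, `studentize`), the grid range, and the sample intensities are all hypothetical and serve only as illustration.

```python
import math
import statistics


def box_cox(values, lam):
    """Box-Cox power transform of strictly positive values.

    Returns (y**lam - 1) / lam for lam != 0 and log(y) for lam == 0.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def profile_log_likelihood(values, lam):
    """Normal profile log-likelihood of lam, up to an additive constant."""
    n = len(values)
    transformed = box_cox(values, lam)
    variance = statistics.pvariance(transformed)  # maximum likelihood variance
    log_jacobian = (lam - 1.0) * sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + log_jacobian


def estimate_lambda(values, low=-2.0, high=2.0, steps=401):
    """Assumed estimator: grid search maximizing the profile log-likelihood."""
    grid = [low + i * (high - low) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))


def studentize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical right-skewed intensities for a single regressor (gene).
z_scores = (-1.6, -1.0, -0.6, -0.3, 0.0, 0.2, 0.5, 0.9, 1.3, 2.1)
intensities = [100.0 * math.exp(0.5 * z) for z in z_scores]

lam_hat = estimate_lambda(intensities)          # estimated power parameter
transformed = box_cox(intensities, lam_hat)     # Box-Cox normalized regressor
studentized = studentize(intensities)           # studentized regressor
```

In this sketch, `studentized` changes only the location and scale of the regressor, whereas `transformed` also changes its shape, which is the distinction the abstract draws between the two approaches.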
