A Comparative Study of Microarray Data with Survival Times Based on Several Missing Mechanism

Kim Jee-Yun; Hwang Jin-Soo; Kim Seong-Sun

  • Published: 2006.04.01

Abstract

One of the most widely used methods for handling missingness in microarray data is the kNN (k-nearest neighbor) method. Recently, Li and Gui (2004) proposed the so-called PCR (Partial Cox Regression) method, which deals efficiently with censored survival times and microarray data via the kNN imputation method. In this article, we try to show that the way missingness is treated eventually affects the subsequent statistical analysis.

Keywords

microarray; missingness; PCR; imputation

References

  1. Bishop, C.M. (1999). Variational principal components. In IEE Conference Publication on Artificial Neural Networks, 509-514
  2. Efron, B., Johnstone, I., Hastie, T. and Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, Vol. 32, 407-499 https://doi.org/10.1214/009053604000000067
  3. Gui, J. and Li, H. (2004). Penalized Cox Regression Analysis in the High-Dimensional and Low-sample Size Settings with Applications to Microarray Gene Expression Data. Center for Bioinformatics & Molecular Biostatistics
  4. Kim, H., Golub, G.H. and Park, H. (2005). Missing value estimation for DNA microarray gene expression data: local least squares imputation. Bioinformatics, Vol. 21, 187-198 https://doi.org/10.1093/bioinformatics/bth499
  5. Kim, K.Y., Kim, B.J. and Yi, G.S. (2004). Reuse of imputed data in microarray analysis increases imputation efficiency. BMC Bioinformatics, Vol. 5, 160 https://doi.org/10.1186/1471-2105-5-160
  6. Li, H. and Gui, J. (2004). Partial Cox regression analysis for high-dimensional microarray gene expression data. Bioinformatics, Vol. 20, i208-i215 https://doi.org/10.1093/bioinformatics/bth900
  7. Li, H. and Luan, Y. (2003). Kernel Cox regression models for linking gene expression profiles to censored survival data. Pacific Symposium on Biocomputing, 65-76
  8. Oba, S., Sato, M., Takemasa, I., Monden, M., Matsubara, K. and Ishii, S. (2003). A Bayesian missing value estimation method for gene expression profile data. Bioinformatics, Vol. 19, 2088-2096 https://doi.org/10.1093/bioinformatics/btg287
  9. Rosenwald, A., Wright, G., Chan, W.C., Connors, J.M., Campo, E., Fisher, R.I., Gascoyne, R.D., Muller-Hermelink, H.K., Smeland, E.B. and Staudt, L.M. (2002). The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. The New England Journal of Medicine, Vol. 346, 1937-1947 https://doi.org/10.1056/NEJMoa012914
  10. Rubin, D.B. (1977). Formalizing subjective notions about the effect of nonrespondents in sample surveys. Journal of the American Statistical Association, Vol. 72, 538-543 https://doi.org/10.2307/2286214
  11. Segal, M.R. (2005). Microarray gene expression data with linked survival phenotypes: Diffuse large-B-cell lymphoma revisited. Center for Bioinformatics & Molecular Biostatistics
  12. Tibshirani, R. (1997). The Lasso method for variable selection in the Cox model. Statistics in Medicine, Vol. 16, 385-395 https://doi.org/10.1002/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3
  13. Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, Vol. 67, 301-320 https://doi.org/10.1111/j.1467-9868.2005.00503.x
  14. Bo, T.H., Dysvik, B. and Jonassen, I. (2004). LSimpute: accurate estimation of missing values in microarray data with least squares methods. Nucleic Acids Research, Vol. 32, No. 3, e34 https://doi.org/10.1093/nar/gnh026
  15. Troyanskaya, O., Cantor, M., Sherlock, G., Brown, P., Hastie, T., Tibshirani, R., Botstein, D. and Altman, R.B. (2001). Missing value estimation methods for DNA microarrays. Bioinformatics, Vol. 17, 520-525 https://doi.org/10.1093/bioinformatics/17.6.520
  16. Hastie, T., Alter, O., Sherlock, G., Eisen, M., Tibshirani, R., Botstein, D. and Brown, P. (1999). Imputation of missing values in DNA microarrays. Technical report, Stanford University Statistics Department
  17. Park, P.J., Tian, L. and Kohane, I.S. (2002). Linking gene expression data with patient survival times using partial least squares. Bioinformatics, Vol. 18, S120-S127 https://doi.org/10.1093/bioinformatics/18.suppl_1.S120