Predicting Survival of DLBCL Patients in Pathway-Based Microarray Analysis

Survival Prediction of DLBCL Patients Using Pathway Information

Lee, Kwang-Hyun; Lee, Sun-Ho

  • Received : 2010.04
  • Accepted : 2010.05
  • Published : 2010.08.31


Predicting survival from microarray data is difficult because the data are high-dimensional and some observations are censored. In addition, the limitations of single-gene analysis have shifted attention to gene sets, that is, groups of functionally related genes. To develop a survival prediction model based on pathway information, we discuss methods for selecting a supergene for each pathway using principal component analysis and for testing its significance. The performance of gene filtering is also compared.
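As a rough illustration of the supergene idea, the sketch below summarizes the genes of one pathway by their first principal component, giving one score per patient. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not say how the supergene is selected, so taking the first component is an assumption, the function name `first_principal_component` is hypothetical, and the toy expression matrix is made up.

```python
import math


def first_principal_component(expr, n_iter=500, tol=1e-12):
    """Return (loadings, scores) for the first principal component.

    expr: list of rows, one row per patient and one column per gene
          in the pathway (hypothetical input layout).
    """
    n, p = len(expr), len(expr[0])

    # Center each gene (column) at its sample mean.
    means = [sum(row[j] for row in expr) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in expr]

    # Sample covariance matrix of the genes (p x p).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]

    # Power iteration for the leading eigenvector. The start vector is
    # deliberately non-uniform; a start that is exactly orthogonal to the
    # leading eigenvector would not converge to it.
    v = [1.0 + a / p for a in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all genes constant: no variance to summarize
            break
        w = [x / norm for x in w]
        delta = max(abs(w[a] - v[a]) for a in range(p))
        v = w
        if delta < tol:
            break

    # Resolve the sign ambiguity: make the largest loading positive.
    pivot = max(range(p), key=lambda a: abs(v[a]))
    if v[pivot] < 0:
        v = [-x for x in v]

    # Supergene score of each patient: projection onto the loadings.
    scores = [sum(r[a] * v[a] for a in range(p)) for r in centered]
    return v, scores


# Made-up toy pathway: 5 patients, 3 genes.
toy_expr = [[1, 2, 5], [2, 4, 5], [3, 6, 5], [4, 8, 5], [5, 10, 5]]
loadings, scores = first_principal_component(toy_expr)
print(loadings)
print(scores)
```

The resulting scores could then serve as the single covariate of a proportional hazards model for that pathway. That fitting and testing step is omitted here because the abstract does not specify it.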


Microarray experiment; survival analysis; pathway; principal component analysis; proportional hazards model




Supported by : Korea Research Foundation