Databases and tools for constructing signal transduction networks in cancer

  • Received : 2016.08.08
  • Published : 2017.01.31

Abstract

Traditionally, biologists have devoted their careers to studying individual biological entities of particular interest to them, partly because few data were available on those entities. Large, high-throughput datasets that are too complex for conventional processing methods (i.e., "big data") have now accumulated in cancer biology and are freely available in public data repositories. This abundance challenges biologists to examine their entities of interest with novel approaches, beginning with the retrieval of repository data. Essentially, these revolutionary changes demand new interpretations of huge datasets at the systems level, through so-called "systems biology". One representative application of systems biology is to generate a biological network from high-throughput big data, providing a global map of the molecular events associated with specific phenotype changes. In this review, we introduce repositories of cancer big data and cutting-edge systems biology tools for network generation and for improved identification of therapeutic targets.
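As a rough illustration of what "generating a biological network from high-throughput data" can mean in practice, the sketch below builds a simple co-expression network by linking genes whose expression profiles are strongly correlated across samples. It is a minimal sketch under stated assumptions, not a method taken from this review: the gene names, the expression values, the `coexpression_network` helper, and the 0.8 correlation cutoff are all hypothetical, and real analyses would typically start from repository data and use dedicated tools.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_network(expression, threshold=0.8):
    """Return edges (gene_a, gene_b, r) for gene pairs with |r| >= threshold.

    `expression` maps a gene name to its expression values across samples.
    The threshold is an arbitrary illustrative cutoff (an assumption).
    """
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Toy expression profiles across five samples (made-up numbers, hypothetical genes).
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.8],
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],
    "GENE_D": [3.0, 1.0, 4.0, 1.0, 5.0],
}

for edge in coexpression_network(toy_expression):
    print(edge)
```

In this toy input, the three genes with near-linear profiles end up connected to one another, while the fourth gene stays isolated because its correlations fall below the cutoff. A hard correlation threshold is only one of several possible design choices; it is used here because it is the simplest to read.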
