Application of Data Mining for Biomedical Data Processing
Shon, Ho-Sun; Kim, Kyoung-Ok; Cha, Eun-Jong; Kim, Kyung-Ah
Cancer is highly frequent in Korea, and its pathogenesis and progression are known to involve various causes and stages. Recently, research on chromosomal and genetic disorders, and on prognostic factors that predict their occurrence, recurrence, and progression, has been actively pursued. In this paper, we analyzed DNA methylation data downloaded from TCGA (The Cancer Genome Atlas), an open database, to study bladder cancer, which is the most frequent of the urinary system cancers. Using level 3 methylation data, the most heavily preprocessed level, 59 candidate CpG islands were extracted from 480,000 CpG islands, and the extracted CpG islands were then analyzed with data mining techniques. As a result, the CpG island cg12840719 was found to be significant, and in Cox's regression this CpG island showed a high relative risk compared with the other CpG islands. In the classification analysis, the CpG islands highly correlated with bladder cancer were cg03146993, cg07323648, cg12840719, and cg14676825, and the classification accuracy was about 76%. The positive predictive value, that is, the probability that a sample predicted as cancer is actually cancer, was 72.4%. After verification of the candidate CpG islands identified here, this method could be used for diagnosing and treating cancer.
Bladder cancer; TCGA (The Cancer Genome Atlas); CpG island; ROC; Cox's regression
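The classification accuracy and positive predictive value quoted in the abstract are both read off a confusion matrix. The sketch below is not the authors' code. It only illustrates how the two quantities are defined and how the positive predictive value differs from sensitivity. The class name `ConfusionCounts` and the counts are hypothetical, chosen for easy arithmetic, and do not reproduce the reported 76% or 72.4%.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary cancer / non-cancer classifier (hypothetical)."""
    tp: int  # predicted cancer, actually cancer
    fp: int  # predicted cancer, actually non-cancer
    tn: int  # predicted non-cancer, actually non-cancer
    fn: int  # predicted non-cancer, actually cancer


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all samples classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of samples predicted as cancer that actually are cancer."""
    return c.tp / (c.tp + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of actual cancer samples that were predicted as cancer."""
    return c.tp / (c.tp + c.fn)


# Illustrative counts only; these are assumptions, not data from the paper.
counts = ConfusionCounts(tp=40, fp=10, tn=30, fn=20)

if __name__ == "__main__":
    print(f"accuracy    = {accuracy(counts):.3f}")
    print(f"PPV         = {positive_predictive_value(counts):.3f}")
    print(f"sensitivity = {sensitivity(counts):.3f}")
```

With the same classifier, the positive predictive value and sensitivity generally differ, because the first conditions on the prediction and the second on the true status.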