A Study on Gene Search Using Test for Interval Data

구간형 데이터 검정법을 이용한 유전자 탐색에 관한 연구

  • 이성건 (Department of Statistics, Sungshin Women's University)
  • Received : 2018.11.20
  • Accepted : 2018.12.20
  • Published : 2018.12.31


The methylation score, expressed as a proportion of the methylation status data obtained from repeated sequencing, takes a value between 0 and 1. Simply applying the t-test to examine differences in methylation scores between populations therefore violates the normality assumption. In addition, because the result may vary with the number of sequencing repetitions used to generate the methylation score, a method that can account for this uncertainty is also needed. In this paper, we introduce symbolic data analysis and the interval K-S test, which convert each observation into interval data that incorporate uncertainty rather than a single numerical value. Moreover, the conversion to interval data uses the Beta distribution instead of the normal distribution, so that the characteristics of the methylation score can be reflected in the analysis. The properties of the proposed method were examined using sequencing data from actual patients and normal individuals. Whereas the t-test can only test for a difference in location, the interval K-S statistic was found to test not only the location parameter but also the heterogeneity of the distribution function.
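The two ideas in the abstract can be illustrated with a minimal sketch. It is not the procedure used in the paper, whose details are not given here. The function names `beta_interval` and `ks_statistic` are hypothetical. The Beta(methylated + 1, depth - methylated + 1) model and the mean ± 1.96 sd interval are assumptions chosen for illustration, and the K-S statistic shown is the classical two-sample version applied to point values, not the interval K-S statistic.

```python
import bisect
import math


def beta_interval(methylated, depth, z=1.96):
    """Turn read counts into an interval for the methylation score.

    Assumption (not from the paper): the score follows a
    Beta(methylated + 1, depth - methylated + 1) distribution, and the
    interval is mean +/- z * sd, clipped to [0, 1].
    """
    a = methylated + 1
    b = depth - methylated + 1
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return max(0.0, mean - z * sd), min(1.0, mean + z * sd)


def ks_statistic(x, y):
    """Classical two-sample K-S statistic: largest gap between the two ECDFs."""
    xs, ys = sorted(x), sorted(y)

    def ecdf(sample, t):
        # Fraction of the sample that is <= t.
        return bisect.bisect_right(sample, t) / len(sample)

    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in xs + ys)


# The same raw score (0.7) at two sequencing depths: the interval is
# narrower when more reads support the score.
print(beta_interval(7, 10))
print(beta_interval(70, 100))

# Two made-up groups with equal means but different spread: a location test
# has nothing to detect, while the K-S statistic is clearly non-zero.
group_a = [0.45, 0.48, 0.50, 0.52, 0.55]
group_b = [0.10, 0.30, 0.50, 0.70, 0.90]
print(sum(group_a) / len(group_a), sum(group_b) / len(group_b))
print(ks_statistic(group_a, group_b))
```

The moment-based interval keeps the sketch within the standard library. A quantile-based interval from the same Beta distribution would be a natural alternative if a statistics package is available.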


