Statistical methods for testing tumor heterogeneity

  • Lee, Dong Neuck (Department of Applied Statistics, Chung-Ang University)
  • Lim, Changwon (Department of Applied Statistics, Chung-Ang University)
  • Received: 2018.10.25
  • Accepted: 2019.01.03
  • Published: 2019.06.30

Abstract

Understanding tumor heterogeneity, which arises from differences in the growth patterns and rates of change of metastatic tumors, is important for understanding the sensitivity of tumor cells to drugs and for finding appropriate therapies. When N samples fall into known, distinct groups, differences in population means can be tested with a t-test or ANOVA. These methods cannot be used, however, when group membership is unknown, as is the case for the data considered in this paper. Statistical methods for testing heterogeneity among such samples have been studied, and the minimum combination t-test is one of them. In this paper, we propose a maximum combination t-test that takes into account combinations that bisect the data at different ratios. We also propose a method based on the idea that testing the heterogeneity of a sample is equivalent to testing whether the optimal number of clusters in a cluster analysis is one. A simulation study showed that the proposed methods, the maximum combination t-test and the gap statistic, have better type I error and power than the previously proposed method. We also applied the proposed methods to a real data set.
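
The clustering idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the reference distribution, the number of reference data sets, or the exact decision rule the authors use, so everything below is an assumption: the data are one-dimensional, reference sets are drawn uniformly over the observed range, and a single cluster is retained when Gap(1) >= Gap(2) - s_2, which is the usual "one standard error" rule for the gap statistic. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math
import random


def within_ss(values):
    """Sum of squared deviations from the mean of one cluster."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def pooled_dispersion(values, k):
    """Pooled within-cluster sum of squares W_k for 1-D data, k in {1, 2}.

    For k = 2 every cut point of the sorted data is tried. This is exact in
    one dimension because optimal 2-means clusters are contiguous intervals.
    """
    xs = sorted(values)
    if k == 1:
        return within_ss(xs)
    if k == 2:
        return min(within_ss(xs[:i]) + within_ss(xs[i:]) for i in range(1, len(xs)))
    raise ValueError("this sketch only supports k = 1 or k = 2")


def log_dispersion(values, k):
    """log(W_k); fails loudly when W_k is zero (ties or too few points)."""
    w = pooled_dispersion(values, k)
    if w <= 0:
        raise ValueError("zero dispersion: log(W_k) is undefined")
    return math.log(w)


def gap_one_cluster_test(values, n_ref=200, seed=0):
    """Return (gap, s, one_cluster) for k = 1, 2.

    Assumptions (not from the paper): uniform reference sets over the
    observed range, n_ref reference sets, one-standard-error decision rule.
    """
    if len(values) < 3:
        raise ValueError("need at least three observations")
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    ref_logs = {1: [], 2: []}
    for _ in range(n_ref):
        ref = [rng.uniform(lo, hi) for _ in values]
        for k in (1, 2):
            ref_logs[k].append(log_dispersion(ref, k))
    gap, s = {}, {}
    for k in (1, 2):
        mean_log = sum(ref_logs[k]) / n_ref
        sd = math.sqrt(sum((v - mean_log) ** 2 for v in ref_logs[k]) / n_ref)
        gap[k] = mean_log - log_dispersion(values, k)   # Gap(k)
        s[k] = sd * math.sqrt(1 + 1 / n_ref)            # simulation error s_k
    one_cluster = gap[1] >= gap[2] - s[2]
    return gap, s, one_cluster


if __name__ == "__main__":
    # Made-up measurements with two well-separated groups.
    sample = [0.0, 0.05, 0.1, 0.15, 0.2, 10.0, 10.05, 10.1, 10.15, 10.2]
    gap, s, one_cluster = gap_one_cluster_test(sample)
    print("Gap(1) =", round(gap[1], 3), "Gap(2) =", round(gap[2], 3))
    print("consistent with a single cluster:", one_cluster)
```

Under this rule, a sample is treated as homogeneous when the returned flag is true and as heterogeneous otherwise. The sketch compares only k = 1 with k = 2, which is enough to decide whether the smallest k satisfying the rule is one; it is not a substitute for the simulation-calibrated procedure evaluated in the paper.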

Figure 3.1. Power across differences between means (k = 2). Red: gap statistic; green: minimum combination t-test; blue: maximum combination t-test.

Figure 3.2. Power across differences between means (k = 3). Red: gap statistic; green: minimum combination t-test; blue: maximum combination t-test.

Figure 4.1. Gap statistic across the number of clusters (k), patients 1–4.

Figure 4.2. Gap statistic across the number of clusters (k), patients 5–10.

Figure 4.3. Results for the real data set using the gap statistic and the maximum combination t-test, patients 1 and 2.

Figure 4.4. Results for the real data set using the gap statistic and the maximum combination t-test, patients 3 and 5.

Figure 4.5. Results for the real data set using the gap statistic and the maximum combination t-test, patients 7–9.

Table 3.1. Probability of making a type I error

Table 4.1. Results for the real data set using the gap statistic and the minimum combination t-test

Table 4.2. Results for the real data set using the maximum combination t-test
