Study on Effects of Population Stratification on Haplotype Trend Test in Case-Control Studies

환자-대조군 연구에서 인구집단 층화가 일배체형 경향성 검정에 미치는 영향

Kim, Jin-Heum;Kang, Dae-Ryong;Lim, Hyun-Sun;Nam, Chung-Mo

  • Published : 2009.10.31


Population stratification can cause spurious associations between genetic markers and a disease locus. To handle population stratification in haplotype-based case-control association studies, we added population indicators as covariates to the haplotype trend regression model proposed by Zaykin et al. (2002). Through simulations, we investigated how population stratification and measurement error in estimating each individual's true population affect the type I error probabilities of the association tests based on Zaykin et al.'s (2002) model and on the proposed model. When population stratification exists but each individual's population is classified without error, the proposed model maintains the nominal type I error probability, whereas Zaykin et al.'s (2002) model does not. However, as the measurement error increases, the type I error probability of the proposed model rises correspondingly above the nominal significance level. This implies that as long as uncertainty remains in estimating each individual's true population, it is nearly impossible to avoid false positives in haplotype-based case-control association studies.
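The idea in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under strong simplifying assumptions, not the authors' simulation design: phase is treated as known (so the haplotype dosage is 0, 1, or 2 rather than an estimated probability), a single target haplotype is tested (so the trend test reduces to one slope), there are two hypothetical populations, and the p-value uses a large-sample normal approximation. The frequencies, prevalences, sample sizes, and the function names `simulate`, `trend_pvalue`, and `type1_error` are all invented for illustration.

```python
import random
from statistics import NormalDist


def simulate(n_per_group, rng):
    """Simulate a stratified sample with NO true haplotype effect.

    Two hypothetical populations differ in both haplotype frequency and
    disease prevalence; that alone creates a spurious association.
    """
    freq = (0.2, 0.6)  # assumed target-haplotype frequency per population
    prev = (0.3, 0.6)  # assumed disease prevalence per population
    y, x, pop = [], [], []
    for k in (0, 1):
        for _ in range(n_per_group):
            # dosage = number of copies of the target haplotype (0, 1, or 2)
            x.append((rng.random() < freq[k]) + (rng.random() < freq[k]))
            y.append(1 if rng.random() < prev[k] else 0)
            pop.append(k)
    return y, x, pop


def center_within(values, groups):
    """Subtract each group's mean; equivalent to adjusting for group indicators."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def trend_pvalue(y, x, groups=None):
    """Two-sided p-value for the slope of y on x in a linear regression,
    optionally with population indicators as covariates."""
    n = len(y)
    if groups is None:
        groups = [0] * n  # intercept-only adjustment (unadjusted model)
    yc = center_within(y, groups)
    xc = center_within(x, groups)
    sxx = sum(a * a for a in xc)
    beta = sum(a * b for a, b in zip(xc, yc)) / sxx
    rss = sum((b - beta * a) ** 2 for a, b in zip(xc, yc))
    df = n - len(set(groups)) - 1
    z = beta / (rss / df / sxx) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))  # normal approximation


def type1_error(err, n_rep=500, n_per_group=200, alpha=0.05, seed=1):
    """Empirical rejection rates (unadjusted, adjusted) under the null when
    each population label is misclassified with probability err."""
    rng = random.Random(seed)
    rej_unadj = rej_adj = 0
    for _ in range(n_rep):
        y, x, pop = simulate(n_per_group, rng)
        noisy = [1 - k if rng.random() < err else k for k in pop]
        rej_unadj += trend_pvalue(y, x) < alpha
        rej_adj += trend_pvalue(y, x, noisy) < alpha
    return rej_unadj / n_rep, rej_adj / n_rep


if __name__ == "__main__":
    for err in (0.0, 0.1, 0.3):
        print(err, type1_error(err))
```

Under these assumed settings, the unadjusted test should reject far more often than the nominal level, the adjusted test should stay near the nominal level when `err` is 0, and the adjusted rejection rate should climb as `err` grows, which mirrors the pattern the abstract reports.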


Population stratification; spurious association; false positive; haplotype trend test; measurement error


  1. Armitage, P. (1955). Tests for linear trends in proportions and frequencies, Biometrics, 11, 375-386
  2. Devlin, B. and Roeder, K. (1999). Genomic control for association studies, Biometrics, 55, 997-1004
  3. Epstein, M. P. and Satten, G. A. (2003). Inference on haplotype effects in case-control studies using unphased genotype data, The American Journal of Human Genetics, 73, 1316-1329
  4. Excoffier, L. and Slatkin, M. (1995). Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population, Molecular Biology and Evolution, 12, 921-927
  5. Fallin, D., Cohen, A., Essioux, L., Chumakov, I., Blumenfeld, M., Cohen, D. and Schork, N. J. (2001). Genetic analysis of case/control data using estimated haplotype frequencies: Application to APOE locus variation and Alzheimer's disease, Genome Research, 11, 143-151
  6. Hoggart, C. J., Parra, E. J., Shriver, M. D., Bonilla, C., Kittles, R. A., Clayton, D. G. and McKeigue, P. M. (2003). Control of confounding of genetic associations in stratified populations, The American Journal of Human Genetics, 72, 1492-1504
  7. Jorde, L. B. (1995). Linkage disequilibrium as a gene-mapping tool, The American Journal of Human Genetics, 56, 11-14
  8. Keavney, B. (2002). Genetic epidemiological studies of coronary heart disease, International Journal of Epidemiology, 31, 730-736
  9. Kim, J., Kang, D. R., Lee, Y. K., Shin, S. M., Suh, I. and Nam, C. M. (2004). Statistical algorithm in genetic linkage based on haplotypes, Journal of Preventive Medicine and Public Health, 37, 366-372
  10. Long, J. C., Williams, R. C. and Urbanek, M. (1995). An E-M algorithm and testing strategy for multiple-locus haplotypes, The American Journal of Human Genetics, 56, 799-810
  11. Nielsen, D. M. and Weir, B. S. (1999). A classical setting for associations between markers and loci affecting quantitative traits, Genetical Research, 74, 271-277
  12. Pritchard, J. K., Stephens, M. and Donnelly, P. (2000a). Inference of population structure using multilocus genotype data, Genetics, 155, 945-959
  13. Pritchard, J. K., Stephens, M., Rosenberg, N. A. and Donnelly, P. (2000b). Association mapping in structured populations, The American Journal of Human Genetics, 67, 170-181
  14. SAS Institute. (2002). SAS/Genetics User's Guide, SAS Institute, Cary
  15. Sasieni, P. D. (1997). From genotypes to genes: Doubling the sample size, Biometrics, 53, 1253-1261
  16. Satten, G. A., Flanders, W. D. and Yang, Q. (2001). Accounting for unmeasured population substructure in case-control studies of genetic association using a novel latent-class model, The American Journal of Human Genetics, 68, 466-477
  17. Schaid, D. J., Rowland, C. M., Tines, D. E., Jacobson, R. M. and Poland, G. A. (2002). Score tests for association between traits and haplotypes when linkage phase is ambiguous, The American Journal of Human Genetics, 70, 425-434
  18. Setakis, E., Stirnadel, H. and Balding, D. J. (2006). Logistic regression protects against population structure in genetic association studies, Genome Research, 16, 290-296
  19. Tanck, M. W. T., Klerkx, A. H. E. M., Jukema, J. W., De Knijff, P., Kastelein, J. J. P. and Zwinderman, A. H. (2003). Estimation of multilocus haplotype effects using weighted penalised log-likelihood: Analysis of five sequence variations at the cholesteryl ester transfer protein gene locus, Annals of Human Genetics, 67, 175-184
  20. Terwilliger, J. and Ott, J. (1994). Handbook of Human Genetic Linkage, Johns Hopkins University Press, Baltimore
  21. Zaykin, D. V., Westfall, P. H., Young, S. S., Karnoub, M. A., Wagner, M. J. and Ehm, M. G. (2002). Testing association of statistically inferred haplotypes with discrete and continuous traits in samples of unrelated individuals, Human Heredity, 53, 79-91
  22. Zhao, J. H., Curtis, D. and Sham, P. C. (2000). Model-free analysis and permutation tests for allelic associations, Human Heredity, 50, 133-139
  23. Zhu, X., Zhang, S., Zhao, H. and Cooper, R. S. (2002). Association mapping, using a mixture model for complex traits, Genetic Epidemiology, 23, 181-196