A Prediction Model for the Development of Cataract Using Random Forests
Han, Eun-Jeong; Song, Ki-Jun; Kim, Dong-Geon
Cataract is the main cause of blindness and visual impairment; age-related cataract in particular accounts for about half of the 32 million cases of blindness worldwide. As life expectancy rises and the elderly population grows, the number of cataract cases is increasing as well, which causes a serious economic and social problem throughout the country. However, the incidence of cataract can be reduced dramatically through early diagnosis and prevention. In this study, we developed a prediction model for the early diagnosis of cataract using hospital data from 3,237 subjects who first received a screening test and later visited a medical center for cataract check-ups between 1994 and 2005. To develop the prediction model, we used random forests and compared its predictive performance with that of other common discriminant models, namely logistic regression, discriminant analysis, decision tree, and naive Bayes, and with two popular ensemble models, bagging and arcing. The random forest model achieved an accuracy of 67.16% and a sensitivity of 72.28%, and the main factors included in the model were age, diabetes, WBC, platelet, triglyceride, BMI, and so on. The results show that the model can predict about 70% of cataract cases from the screening test alone, without any information from a direct eye examination by an ophthalmologist. We expect that our model may contribute to the diagnosis of cataract and help prevent it in its early stages.
Keywords: Random forest; screening test; prediction model of cataracts; accuracy; sensitivity
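
The abstract reports the model's performance as accuracy and sensitivity. As a reminder of how these two figures relate to a set of predictions, the following is a minimal sketch in Python, assuming binary labels where 1 marks a subject with cataract. The function name `accuracy_and_sensitivity` and the ten example labels are hypothetical illustrations and are not taken from the study.

```python
def accuracy_and_sensitivity(y_true, y_pred):
    """Return (accuracy, sensitivity) for binary labels, where 1 = cataract."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cataract, predicted cataract
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # no cataract, predicted none
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cataract, missed by the model
    # Accuracy: share of all subjects classified correctly.
    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: share of true cataract cases the model detects.
    # It is undefined when there are no true cataract cases.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


# Hypothetical labels for ten subjects (illustration only, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(f"accuracy={acc:.2f}, sensitivity={sens:.2f}")
```

A screening model tuned for early diagnosis may reasonably favor sensitivity over overall accuracy, since a missed case costs more than a false alarm that leads to an eye examination.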