Analysis of Horse Races: Prediction of Winning Horses in Horse Races Using Statistical Models

A Study on Prediction Models for Winning Horses in Seoul Horse Races

Choe, Hyemin; Hwang, Nayoung; Hwang, Chankyoung; Song, Jongwoo
최혜민;황나영;황찬경;송종우

  • Received : 2015.09.14
  • Accepted : 2015.10.19
  • Published : 2015.12.31

Abstract

The horse racing industry accounts for the largest share of the domestic legal gambling industry. However, statistical analysis of horse races is limited compared with that of other sports. We propose prediction models for winning horses in horse races using data mining techniques such as logistic regression, linear regression, and random forest. The horse race data come from the Korea Racing Authority; we use horse racing reports and information on racehorses, jockeys, and horse trainers. We consider two models, one based on ranks and one based on time records. The results show that rank prediction is influenced by information on the racehorses and by the number of wins of the racehorses and jockeys. We place wagers on the last month of races according to our prediction models, and these wagers yield substantial profits.
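
As a rough illustration of how a rank-based model and a time-based model can each be turned into one pick per race and then scored by wagering, the following minimal Python sketch may help. It is not the authors' implementation: the rows, the field names (`p_win`, `t_pred`, `odds`, `won`), the decimal-odds convention, and the flat one-unit stake are all hypothetical assumptions made for the example, and the numbers are invented.

```python
from collections import defaultdict

# Hypothetical model outputs, one row per horse per race (all values invented).
# p_win  : win probability from a rank-based classifier (e.g. logistic regression)
# t_pred : predicted finish time in seconds from a time-based regression
# odds   : decimal win odds (total payout per unit staked, stake included)
# won    : whether the horse actually won the race
rows = [
    {"race": 1, "horse": "A", "p_win": 0.40, "t_pred": 75.2, "odds": 2.5, "won": True},
    {"race": 1, "horse": "B", "p_win": 0.35, "t_pred": 75.0, "odds": 3.0, "won": False},
    {"race": 1, "horse": "C", "p_win": 0.25, "t_pred": 76.1, "odds": 6.0, "won": False},
    {"race": 2, "horse": "D", "p_win": 0.20, "t_pred": 88.4, "odds": 4.0, "won": False},
    {"race": 2, "horse": "E", "p_win": 0.55, "t_pred": 87.9, "odds": 1.8, "won": True},
    {"race": 2, "horse": "F", "p_win": 0.25, "t_pred": 88.0, "odds": 5.0, "won": False},
]


def pick_winners(rows, key, highest):
    """Pick one horse per race: the largest (or smallest) value of `key`."""
    by_race = defaultdict(list)
    for row in rows:
        by_race[row["race"]].append(row)
    chooser = max if highest else min
    return {race: chooser(horses, key=lambda r: r[key])
            for race, horses in by_race.items()}


def flat_bet_profit(picks, stake=1.0):
    """Net profit from staking `stake` on each picked horse to win."""
    profit = 0.0
    for pick in picks.values():
        if pick["won"]:
            profit += stake * (pick["odds"] - 1.0)  # winnings net of the stake
        else:
            profit -= stake  # the stake is lost
    return profit


# Rank-based model: back the horse with the highest predicted win probability.
rank_picks = pick_winners(rows, "p_win", highest=True)
# Time-based model: back the horse with the lowest predicted finish time.
time_picks = pick_winners(rows, "t_pred", highest=False)

print("rank model profit:", flat_bet_profit(rank_picks))
print("time model profit:", flat_bet_profit(time_picks))
```

In this toy data the two models disagree in race 1, which is enough to show that the choice between a rank target and a time target can change both the picks and the wagering outcome.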

Keywords

horse race; linear regression; stepwise regression; random forest; logistic regression; important variables

Acknowledgement

Supported by: National Research Foundation of Korea (NRF)