- Volume 23 Issue 6
Knowing which factors determine a player's ability to affect the outcome of a game is crucial: it helps assess each team member's relative contribution and set appropriate annual salaries. This study uses statistical analysis to investigate how much the outcome of a professional baseball game is influenced by the records of individual players. We used data on 252 Lotte Giants games played in 2007 and 2008, comprising environmental data (home or away game, and opponent) as well as pitchers' and batters' records. Using SAS Enterprise Miner, we performed logistic regression analysis and decision tree analysis on the data. The results obtained through the two analytic methods are compared and discussed.
Decision tree analysis; logistic regression analysis; odds ratio; SAS Enterprise Miner
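As a rough illustration of the odds ratio named in the keywords, the sketch below computes it for a single binary factor (home versus away) from a 2x2 table of wins and losses. The counts are hypothetical and are not taken from the study, and the function names are invented for this example. With one binary predictor, the slope a logistic regression estimates equals the logarithm of this sample odds ratio, which is why the two quantities are reported together.

```python
import math


def odds_ratio(wins_a, losses_a, wins_b, losses_b):
    """Odds of winning under condition A divided by the odds under condition B."""
    return (wins_a / losses_a) / (wins_b / losses_b)


def logit_coefficient(wins_a, losses_a, wins_b, losses_b):
    """Log odds ratio; with a single binary predictor this matches the
    slope that a logistic regression fit would produce."""
    return math.log(odds_ratio(wins_a, losses_a, wins_b, losses_b))


# Hypothetical counts for illustration only (NOT the study's data):
# 126 home games and 126 away games, 252 in total.
home_wins, home_losses = 70, 56
away_wins, away_losses = 60, 66

ratio = odds_ratio(home_wins, home_losses, away_wins, away_losses)
beta = logit_coefficient(home_wins, home_losses, away_wins, away_losses)

print(f"odds ratio (home vs away): {ratio:.3f}")
print(f"logistic coefficient:      {beta:.3f}")
```

An odds ratio above 1 would indicate that the odds of winning are higher at home than away under these made-up counts; a value near 1 would indicate little difference.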