Predicting Gross Box Office Revenue for Domestic Films

  • Song, Jongwoo (Department of Statistics, Ewha Womans University)
  • Han, Suji (Department of Statistics, Ewha Womans University)
  • Received : 2013.04.16
  • Accepted : 2013.07.08
  • Published : 2013.07.31


This paper predicts gross box office revenue for domestic films using Korean film data from 2008 to 2011. We use three regression methods (linear regression, random forest, and gradient boosting) to predict revenue. We consider only domestic films with revenue of at least KRW 500 million, and we choose relevant explanatory variables using data visualization and variable selection techniques. The key idea in analyzing these data is to construct meaningful explanatory variables from publicly available data sources. Some variables must be categorized for the analysis to be effective, and we apply clustering methods to do so. We choose the best model based on its performance on the test set and discuss the important explanatory variables.
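The categorization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering method or which variables the authors used, so everything here is an assumption: the sketch uses one-dimensional k-means (Lloyd's algorithm), the function name `kmeans_1d` is hypothetical, and the input numbers are made up for illustration rather than taken from the paper's data.

```python
from statistics import mean


def kmeans_1d(values, k, n_iter=100):
    """One-dimensional k-means (Lloyd's algorithm).

    Returns (centers, labels), where labels[i] is the index of the
    center closest to values[i]. Centers are kept in ascending order,
    so label 0 is the lowest group and label k-1 the highest.
    """
    distinct = sorted(set(values))
    if not 1 <= k <= len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")

    # Deterministic start: spread initial centers across the sorted distinct values.
    if k == 1:
        centers = [mean(values)]
    else:
        step = len(distinct) - 1
        centers = [distinct[round(i * step / (k - 1))] for i in range(k)]

    labels = []
    for _ in range(n_iter):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, labels


# Made-up continuous variable (illustrative only, not from the paper).
past_revenue = [5, 7, 6, 40, 45, 38, 120, 130]
centers, labels = kmeans_1d(past_revenue, k=3)

# Turn cluster indices into an ordered categorical variable.
level_names = ["low", "mid", "high"]
categories = [level_names[lab] for lab in labels]
print(centers)
print(categories)
```

The resulting `categories` list could then enter a regression model as a factor in place of the raw continuous variable. The choice of three groups is arbitrary in this sketch.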


Supported by: National Research Foundation of Korea (NRF)


