Parameter estimation for the imbalanced credit scoring data using AUC maximization

Parameter estimation for low default rate data using AUC optimization (translated Korean title)

Hong, C. S.; Won, C. H.

  • Received : 2015.11.02
  • Accepted : 2016.01.05
  • Published : 2016.02.29


For binary classification models, we consider a risk score that is a function of linear scores and estimate the coefficients of those linear scores. There are two estimation methods: obtaining MLEs under a logistic model, and maximizing the AUC. In the general situation where the logistic assumptions do not hold, the AUC-based estimates are better than the logistic MLEs. This paper considers imbalanced data for credit assessment models, in which the default class contains fewer observations than the non-default class, and applies the AUC approach to such data. Various logit link functions are used to generate the imbalanced data. The coefficients estimated by the AUC approach are found to be equivalent to, or better than, those from logistic models for imbalanced data with a low default probability.
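The AUC approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes two covariates, a plain logit link for simulation, and a simple grid search in place of whatever optimizer the paper uses. The names `empirical_auc`, `fit_by_auc`, and `simulate` are hypothetical. Because the AUC is unchanged by any positive rescaling of the score, the first coefficient is fixed at 1 and only the second is estimated.

```python
import math
import random


def empirical_auc(scores_default, scores_nondefault):
    """Mann-Whitney form of the AUC: share of (default, non-default)
    pairs in which the default has the higher score; ties count 0.5."""
    if not scores_default or not scores_nondefault:
        raise ValueError("both classes must contain at least one observation")
    wins = 0.0
    for sd in scores_default:
        for sn in scores_nondefault:
            if sd > sn:
                wins += 1.0
            elif sd == sn:
                wins += 0.5
    return wins / (len(scores_default) * len(scores_nondefault))


def linear_score(x, beta2):
    # Assumption: first coefficient fixed at 1 for identifiability.
    return x[0] + beta2 * x[1]


def fit_by_auc(X, y, grid):
    """Return (beta2, auc) maximizing the empirical AUC over a grid.
    A grid search is an illustrative stand-in for a real optimizer."""
    best_beta2, best_auc = None, -1.0
    for b in grid:
        d = [linear_score(x, b) for x, yi in zip(X, y) if yi == 1]
        n = [linear_score(x, b) for x, yi in zip(X, y) if yi == 0]
        auc = empirical_auc(d, n)
        if auc > best_auc:
            best_beta2, best_auc = b, auc
    return best_beta2, best_auc


def simulate(n, intercept, beta2, seed):
    """Draw imbalanced data from a logit model (an assumed link);
    a strongly negative intercept gives a low default rate."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        eta = intercept + linear_score(x, beta2)
        p = 1.0 / (1.0 + math.exp(-eta))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y
```

A possible use is `X, y = simulate(400, -3.0, 2.0, seed=1)` followed by `fit_by_auc(X, y, [i / 10 for i in range(0, 41)])`, then comparing the returned coefficient with the value used in the simulation. The pairwise loop costs time proportional to the number of defaults times the number of non-defaults, which stays small when the default class is rare.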



