ROC Curve Fitting with Normal Mixtures

ROC Analysis Using Normal Mixture Distributions

Hong, Chong-Sun;Lee, Won-Yong

  • Received : 2010.12
  • Accepted : 2011.02
  • Published : 2011.04.30


Many studies have considered the distribution functions of scores and appropriate covariates in order to improve the accuracy of a diagnostic test, including studies of the ROC curve, which represents the relationship between sensitivity and specificity. ROC analysis has typically been carried out with regression models that include covariates, under the assumption that the score distribution function is known or estimable. In this work, we consider the more general situation in which both the distribution function and the effects of covariates are unknown. For the ROC analysis, mixtures of normal distributions are used to estimate the distribution function fitted to credit evaluation data consisting of a score random variable and two sub-populations with their own parameters. The AUC measure is used to compare the resulting ROC curve with the nonparametric and empirical ROC curves. We conclude that the method using normal mixtures fits the classical one better than the other methods.
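As a rough illustration of the idea in the abstract, the following Python sketch builds an ROC curve from two normal-mixture score distributions and computes its AUC. It is not the authors' procedure: the mixture parameters in `BAD` and `GOOD` are hypothetical values chosen for illustration, the estimation step (for example by EM) is omitted, and the convention that defaulted borrowers tend to have lower scores is an assumption. The closed-form AUC relies only on the standard result that P(X < Y) = Φ((μ_Y − μ_X)/√(σ_X² + σ_Y²)) for independent normal X and Y, applied to each pair of mixture components.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical two-component mixtures: (weights, means, standard deviations).
# These numbers are illustrative only and are NOT estimates from the paper.
BAD = ([0.6, 0.4], [40.0, 55.0], [8.0, 6.0])    # scores of defaulted borrowers
GOOD = ([0.5, 0.5], [60.0, 75.0], [7.0, 9.0])   # scores of non-defaulted borrowers


def mixture_cdf(x, mix):
    """CDF of a finite normal mixture evaluated at x."""
    weights, means, sds = mix
    return sum(w * NormalDist(m, s).cdf(x) for w, m, s in zip(weights, means, sds))


def roc_curve(bad, good, lo, hi, steps=2000):
    """Return (false alarm rate, hit rate) pairs for thresholds from lo to hi.

    Assumed rule: a score <= c is classified as a default, so the
    hit rate is F_bad(c) and the false alarm rate is F_good(c).
    """
    points = []
    for k in range(steps + 1):
        c = lo + (hi - lo) * k / steps
        points.append((mixture_cdf(c, good), mixture_cdf(c, bad)))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def auc_closed_form(bad, good):
    """AUC = P(bad score < good score), summed over all component pairs."""
    std = NormalDist()
    total = 0.0
    for wb, mb, sb in zip(*bad):
        for wg, mg, sg in zip(*good):
            total += wb * wg * std.cdf((mg - mb) / sqrt(sb ** 2 + sg ** 2))
    return total


points = roc_curve(BAD, GOOD, lo=0.0, hi=120.0)
auc_numeric = auc_trapezoid(points)
auc_exact = auc_closed_form(BAD, GOOD)
print(f"AUC (trapezoid) = {auc_numeric:.4f}, AUC (closed form) = {auc_exact:.4f}")
```

In practice the mixture parameters would be estimated separately for each sub-population, and the resulting AUC compared with the AUC of the empirical ROC curve computed from the raw scores.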


Classification model;credit evaluation;quasi-likelihood;threshold


