Parameter estimation of linear function using VUS and HUM maximization

- Journal title : Journal of the Korean Data and Information Science Society
- Volume 26, Issue 6, 2015, pp.1305-1315
- Publisher : Korean Data and Information Science Society
- DOI : 10.7465/jkdi.2015.26.6.1305

Title & Authors

Parameter estimation of linear function using VUS and HUM maximization

Hong, Chong Sun; Won, Chi Hwan; Jeong, Dong Gil;

Abstract

Consider a classification model whose risk score is a function of a linear score. The coefficients of the linear score can be estimated by maximizing the area under the ROC curve (AUC). Estimates obtained by this AUC approach have been shown to outperform maximum likelihood estimators from logistic models in general situations where the logistic assumptions do not hold. In this work, the AUC approach is extended to the volume under the ROC surface (VUS) and the hypervolume under the ROC manifold (HUM), which suit more realistic discrimination and prediction problems with more than two categories. Simulations are carried out under various threshold distributions and three link functions: logit, complementary log-log, and modified logit. The coefficient estimates obtained by the VUS and HUM approaches for multi-category classification are found to be equivalent to or better than those from logistic models with these link functions.
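The idea of estimating linear-score coefficients by maximizing a rank-based criterion can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the simulated Gaussian data, the unit-length normalization of the coefficient vector, and the plain grid search (used here in place of whatever optimizer the paper applies) are all assumptions made for illustration. The empirical AUC is taken as the share of correctly ordered pairs, the empirical VUS as the share of correctly ordered triples, and ties are counted as incorrect. Because both criteria depend only on the ordering of the scores, they are unchanged when the coefficients are multiplied by a positive constant, so only the direction of the coefficient vector is searched.

```python
import bisect
import math
import random


def linear_score(beta, x):
    """Linear score beta'x for one observation x."""
    return sum(b * xi for b, xi in zip(beta, x))


def empirical_auc(beta, X0, X1):
    """Share of (class 0, class 1) pairs with score0 < score1."""
    s0 = sorted(linear_score(beta, x) for x in X0)
    s1 = [linear_score(beta, x) for x in X1]
    hits = sum(bisect.bisect_left(s0, b) for b in s1)
    return hits / (len(s0) * len(s1))


def empirical_vus(beta, X0, X1, X2):
    """Share of (class 0, 1, 2) triples with score0 < score1 < score2."""
    s0 = sorted(linear_score(beta, x) for x in X0)
    s1 = [linear_score(beta, x) for x in X1]
    s2 = sorted(linear_score(beta, x) for x in X2)
    total = 0
    for b in s1:
        below = bisect.bisect_left(s0, b)             # class-0 scores < b
        above = len(s2) - bisect.bisect_right(s2, b)  # class-2 scores > b
        total += below * above
    return total / (len(s0) * len(s1) * len(s2))


def maximize_vus(X0, X1, X2, n_grid=360):
    """Grid search over unit-length directions beta = (cos t, sin t)."""
    best_beta, best_vus = None, -1.0
    for k in range(n_grid):
        t = 2.0 * math.pi * k / n_grid
        beta = (math.cos(t), math.sin(t))
        v = empirical_vus(beta, X0, X1, X2)
        if v > best_vus:
            best_beta, best_vus = beta, v
    return best_beta, best_vus


def simulate(n, shift, rng):
    """Hypothetical data: two Gaussian predictors shifted along (1, 2)."""
    return [(rng.gauss(shift, 1.0), rng.gauss(2.0 * shift, 1.0)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    X0, X1, X2 = (simulate(60, s, rng) for s in (0.0, 1.0, 2.0))
    beta_hat, vus_hat = maximize_vus(X0, X1, X2)
    print("estimated direction:", beta_hat, "empirical VUS:", vus_hat)
```

The same pattern would extend to HUM for four or more ordered categories by counting correctly ordered tuples, and the grid search only works for two predictors; a derivative-free optimizer would be the natural replacement in higher dimensions, since the empirical criterion is a step function of the coefficients.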

Keywords

Discrimination; link; manifold; risk; score; surface; threshold

Language

Korean
