• Title, Summary, Keyword: area under the curve (AUC)


Partial AUC using the sensitivity and specificity lines (민감도와 특이도 직선을 이용한 부분 AUC)

  • Hong, Chong Sun;Jang, Dong Hwan
    • The Korean Journal of Applied Statistics
    • /
    • v.33 no.5
    • /
    • pp.541-553
    • /
    • 2020
  • The receiver operating characteristic (ROC) curve is expressed in terms of both sensitivity and specificity, and several optimal thresholds derived from the ROC curve are likewise expressed with both. In addition to sensitivity and specificity, the expected usefulness function takes disease prevalence and usefulness into account. In particular, the partial area under the ROC curve (AUC) over a certain range should be compared when crossing ROC curves have similar AUCs. In this study, partial AUCs representing high sensitivity and high specificity are proposed by using sensitivity and specificity lines, respectively. We consider various distribution functions whose ROC curves cross and whose AUCs are equal. We propose a method to improve the discriminant power of classification models by comparing the partial AUCs obtained using the sensitivity and specificity lines.
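
To illustrate why a partial AUC can separate models that a full AUC cannot, the sketch below integrates a piecewise-linear ROC curve over a restricted false positive rate (FPR) range. This is the generic trapezoidal partial AUC, not the sensitivity and specificity line construction proposed in the paper; the function name `partial_auc` and both example curves are hypothetical.

```python
def partial_auc(points, fpr_lo, fpr_hi):
    """Trapezoidal area under a piecewise-linear ROC curve for FPR in [fpr_lo, fpr_hi].

    points: list of (fpr, tpr) pairs sorted by increasing FPR.
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        lo, hi = max(x0, fpr_lo), min(x1, fpr_hi)
        if x1 == x0 or hi <= lo:
            continue  # vertical segment, or segment outside the requested range
        slope = (y1 - y0) / (x1 - x0)
        y_lo = y0 + slope * (lo - x0)  # curve height at the clipped left edge
        y_hi = y0 + slope * (hi - x0)  # curve height at the clipped right edge
        area += 0.5 * (y_lo + y_hi) * (hi - lo)
    return area


# Two hypothetical ROC curves that cross and have the same full AUC (about 0.70).
roc_a = [(0.0, 0.0), (0.2, 0.6), (1.0, 1.0)]
roc_b = [(0.0, 0.0), (0.4, 0.8), (1.0, 1.0)]

full_a = partial_auc(roc_a, 0.0, 1.0)
full_b = partial_auc(roc_b, 0.0, 1.0)

# Restricting to the high-specificity region FPR <= 0.2 separates the two models.
high_spec_a = partial_auc(roc_a, 0.0, 0.2)  # about 0.06
high_spec_b = partial_auc(roc_b, 0.0, 0.2)  # about 0.04
```

Here the two curves are indistinguishable by full AUC, while the partial AUC over the low-FPR range favors the first curve.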

Bayesian hierarchical model for the estimation of proper receiver operating characteristic curves using stochastic ordering

  • Jang, Eun Jin;Kim, Dal Ho
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.2
    • /
    • pp.205-216
    • /
    • 2019
  • Diagnostic tests in medical fields detect or diagnose a disease with results measured as continuous or discrete ordinal data. The performance of a diagnostic test is summarized using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The diagnostic test is considered clinically useful if the outcomes in actually-positive cases are higher than those in actually-negative cases and the ROC curve is concave. In this study, we apply the stochastic ordering method in a Bayesian hierarchical model to estimate the proper ROC curve and AUC when the diagnostic test results are measured as discrete ordinal data. We compare the conventional binormal model and the binormal model under stochastic ordering. The simulation results and a real data analysis for breast cancer indicate that the binormal model under stochastic ordering can estimate the proper ROC curve with a small bias even when the sample sizes are small or the number of actually-negative cases differs from the number of actually-positive cases. Therefore, it is appropriate to consider the binormal model under stochastic ordering when the sample sizes of the actually-negative and actually-positive groups differ greatly.
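
For orientation, the conventional binormal model mentioned above can be sketched as follows. The parameterization is an assumption chosen for illustration (a is the standardized separation of the two groups, b is the ratio of the negative-group to the positive-group standard deviation); the Bayesian hierarchical estimation and the stochastic ordering constraint from the paper are not reproduced here.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def binormal_roc(fpr, a, b):
    """True positive rate of the binormal ROC curve at a given FPR in (0, 1)."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpr))


def binormal_auc(a, b):
    """Closed-form AUC of the binormal model."""
    return _STD_NORMAL.cdf(a / sqrt(1.0 + b * b))


# Equal variances (b = 1): the curve is concave, i.e. a proper ROC curve.
tpr_equal_var = binormal_roc(0.5, a=1.0, b=1.0)

# Unequal variances (b = 3): the curve dips below the chance diagonal at low
# FPR, which is the improper behavior that a proper-ROC model is meant to avoid.
tpr_hook = binormal_roc(0.01, a=0.5, b=3.0)
```

With unequal variances the binormal curve is no longer concave over the whole range, which is one motivation for imposing an ordering constraint of the kind the abstract describes.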

A Comparison of the Interval Estimations for the Difference in Paired Areas under the ROC Curves (대응표본에서 AUC차이에 대한 신뢰구간 추정에 관한 고찰)

  • Kim, Hee-Young
    • Communications for Statistical Applications and Methods
    • /
    • v.17 no.2
    • /
    • pp.275-292
    • /
    • 2010
  • Receiver operating characteristic (ROC) curves can be used to assess the accuracy of tests measured on ordinal or continuous scales. The most commonly used measure of the overall accuracy of diagnostic tests is the area under the ROC curve (AUC). When two ROC curves are constructed from two tests performed on the same individuals, statistical analysis of differences between AUCs must take into account the correlated nature of the data. This article focuses on confidence interval estimation of the difference between paired AUCs. We compare nonparametric, maximum likelihood, bootstrap and generalized pivotal quantity methods, and conduct a Monte Carlo simulation to investigate the coverage probability and expected length of the four methods.
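
Of the four approaches compared, the bootstrap is the simplest to sketch. The example below is a plain percentile bootstrap that resamples individuals, so that both test scores of a person stay together and the pairing is respected. It is an illustrative assumption, not the exact procedure evaluated in the article; the function names and data are hypothetical.

```python
import random


def auc(scores, labels):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly, ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_bootstrap_ci(scores_a, scores_b, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC(test A) - AUC(test B) on the same individuals."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample individuals, not scores
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # a resample needs both classes for the AUC to be defined
            a = auc([scores_a[i] for i in idx], y)
            b = auc([scores_b[i] for i in idx], y)
            diffs.append(a - b)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical paired data: two tests measured on the same ten individuals.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
test_a = [0.9, 0.8, 0.7, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1, 0.05]
test_b = [0.6, 0.4, 0.7, 0.3, 0.5, 0.55, 0.45, 0.2, 0.65, 0.1]
ci = paired_bootstrap_ci(test_a, test_b, labels, n_boot=500, seed=1)
```

Resampling each test independently would ignore the correlation between the two AUC estimates, which is exactly the issue the abstract raises.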

Optimization of Classifier Performance at Local Operating Range: A Case Study in Fraud Detection

  • Park Lae-Jeong;Moon Jung-Ho
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.5 no.3
    • /
    • pp.263-267
    • /
    • 2005
  • Building classifiers for real-world financial classification problems is often plagued by severely overlapping and highly skewed class distributions. Performance measures such as the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) have recently been introduced for evaluating and building classifiers for these kinds of problems. They are, however, ineffective for evaluating a classifier's discrimination performance in the class of problems where interest lies only in a local operating range of the classifier. In this paper, a new method is proposed that directly improves a classifier's discrimination performance at a desired local operating range by defining and optimizing a partial area under the ROC curve or a domain-specific curve, which is difficult to achieve with conventional learning methods based on classification accuracy. The effectiveness of the proposed approach is demonstrated in terms of fraud detection capability on a real-world fraud detection problem, in comparison with the MSE-based approach.

Learning Behavior Analysis of Bayesian Algorithm Under Class Imbalance Problems (클래스 불균형 문제에서 베이지안 알고리즘의 학습 행위 분석)

  • Hwang, Doo-Sung
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.45 no.6
    • /
    • pp.179-186
    • /
    • 2008
  • In this paper we analyse the effects of the Bayesian algorithm in learning class imbalance problems and compare performance evaluation methods. The learning performance of the Bayesian algorithm is evaluated over class imbalance problems generated by varying the prior data distribution, the imbalance rate, and the discrimination complexity. The experimental results are computed as AUC (Area Under the Curve) values of both the ROC (Receiver Operator Characteristic) and PR (Precision-Recall) evaluation measures and compared according to imbalance rate and discrimination complexity. The comparison and analysis show that the Bayesian algorithm suffers from the imbalance rate, consistent with previously reported research, and that the data overlap caused by discrimination complexity is another factor that hampers learning performance. As the discrimination complexity and class imbalance rate of the problems increase, the learning performance measured by the AUC of the PR measure varies much more than that measured by the AUC of the ROC measure. The two measures perform similarly, however, when the discrimination complexity and class imbalance rate are low. The experimental results show that the AUC of the PR measure is more suitable for evaluating learning on class imbalance problems and is furthermore beneficial in designing an optimal learning model that considers misclassification cost.
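
The different sensitivity of the two measures to class imbalance can be seen in a small sketch. Average precision is used here as a stand-in for the area under the PR curve, which is an assumption; the paper may compute the PR area differently. The scores are hypothetical, and imbalance is simulated simply by replicating the negative examples.

```python
def roc_auc(scores, labels):
    """Empirical ROC AUC over all (positive, negative) pairs, ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Sum of precision at each distinct threshold, weighted by the recall gained there."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    tp = 0
    ap = 0.0
    i = 0
    while i < len(pairs):
        j = i
        group_tp = 0
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:  # tied scores share a threshold
            group_tp += pairs[j][1]
            j += 1
        tp += group_tp
        ap += (group_tp / n_pos) * (tp / j)  # j examples are predicted positive so far
        i = j
    return ap


pos_scores = [0.9, 0.7, 0.5]
neg_scores = [0.8, 0.6, 0.4, 0.2]


def evaluate(neg_copies):
    """Return (ROC AUC, average precision) with the negatives replicated neg_copies times."""
    scores = pos_scores + neg_scores * neg_copies
    labels = [1] * len(pos_scores) + [0] * (len(neg_scores) * neg_copies)
    return roc_auc(scores, labels), average_precision(scores, labels)


roc_balanced, ap_balanced = evaluate(1)
roc_imbalanced, ap_imbalanced = evaluate(10)
```

Replicating the negatives leaves the ROC AUC unchanged while the average precision drops, which is in line with the abstract's observation that the PR-based AUC reacts more strongly to the imbalance rate.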

The Correlations of Parameters Using Contrast Enhanced Ultrasonography in the Evaluation of Prostate Cancer Angiogenesis (전립선암쥐모형의 신생혈관생성의 평가를 위해 시행된 역동적 조영 증강 초음파에서 얻은 변수간의 상관성연구)

  • Hwang, Sung Il;Lee, Hak Jong;Kim, Kil Joong;Chung, Jin-haeng;Jung, Hyun Sook;Jeon, Jong June
    • Ultrasonography
    • /
    • v.32 no.2
    • /
    • pp.132-142
    • /
    • 2013
  • Purpose: The purpose of this study is to investigate the correlations between histopathologic parameters and various kinetic parameters derived from the time intensity curve, obtained with an ultrasound contrast agent in a xenograft mouse model of prostate cancer (PC-3 and LNCaP). Materials and Methods: Twenty nude mice were injected with human prostate cancer cells (15 PC-3 and five LNCaP) on their hind limbs. A bolus of $500\,\mu\mathrm{L}$ ($1\times10^8$ microbubbles) of second-generation US contrast agent (SonoVue) was injected into the retroorbital vein. The region of interest was drawn over the entire tumor. The time intensity curve was acquired and then fitted to a gamma variate function. The maximal intensity (A), time to peak (Tp), maximal wash-in rate (washin), washout rate (washout), area under the curve up to 50 sec ($AUC_{50}$), area under the ascending slope ($AUC_{in}$), and area under the descending slope ($AUC_{out}$) were derived from the parameters of the gamma variate fit. Immunohistochemical staining for VEGF and CD31 was performed. Tumor volume, the area percentage of VEGF stained in a field, and the count of CD31 (microvessel density, MVD) positive vessels were evaluated for correlation with the parameters from the time intensity curve. Results: No significant differences in the kinetic and histopathological parameters were observed between the groups. MVD showed positive correlation with A (r=0.625, p=0.003), washin (r=0.462, p=0.040), $AUC_{50}$ (r=0.604, p=0.005), and $AUC_{out}$ (r=0.587, p=0.007). Positive correlations were also observed between tumor volume and $AUC_{50}$ (r=0.481, p=0.032), washin (r=0.662, p=0.001), and $AUC_{out}$ (r=0.547, p=0.012). Washout showed negative correlations with MVD (r=-0.454, p=0.044) and tumor volume (r=-0.464, p=0.039). The area percentage of VEGF did not show any correlation with calculated data from the curve. Conclusion: MVD showed correlations with several of the kinetic parameters. CEUS has the potential for prediction of tumor vascularity in a prostate cancer animal model.

The Unified Framework for AUC Maximizer

  • Jun, Jong-Jun;Kim, Yong-Dai;Han, Sang-Tae;Kang, Hyun-Cheol;Choi, Ho-Sik
    • Communications for Statistical Applications and Methods
    • /
    • v.16 no.6
    • /
    • pp.1005-1012
    • /
    • 2009
  • The area under the curve (AUC) is commonly used as a summary measure of the receiver operating characteristic (ROC) curve, which displays the performance of a set of binary classifiers for all feasible ratios of the costs associated with the true positive rate (TPR) and the false positive rate (FPR). In the bipartite ranking problem, where one has to compare two different observations and decide which one is "better", the AUC measures the probability that the ranking score of a randomly chosen sample in one class is larger than that of a randomly chosen sample in the other class. Hence, the function that maximizes the AUC of a bipartite ranking problem is different from the function that maximizes the accuracy (minimizes the misclassification error rate) of a binary classification problem. In this paper, we develop a way to construct a unified framework for AUC maximizers, including support vector machines based on margin maximization and logistic regression based on estimating the posterior probability. Moreover, we develop an efficient algorithm for the proposed unified framework. Numerical results show that the proposed unified framework can treat various methodologies successfully.
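
The distinction between ranking quality and classification accuracy can be made concrete with a few lines of code. The pairwise logistic loss below is one generic smooth surrogate for the pairwise ranking objective and is shown only as an assumption; it is not claimed to be the loss used in the paper's framework, and the scores are hypothetical.

```python
from math import exp, log


def empirical_auc(pos, neg):
    """Fraction of (positive, negative) pairs in which the positive scores higher, ties count 1/2."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(pos, neg, threshold):
    """Classification accuracy when a score above the threshold is predicted positive."""
    correct = sum(s > threshold for s in pos) + sum(s <= threshold for s in neg)
    return correct / (len(pos) + len(neg))


def pairwise_logistic_loss(pos, neg):
    """Smooth surrogate for 1 - AUC: penalizes pairs where the positive does not clearly outrank the negative."""
    total = sum(log(1.0 + exp(-(p - n))) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


# Every positive outranks every negative, yet all scores fall below 0.5.
pos_scores = [0.40, 0.45]
neg_scores = [0.10, 0.20, 0.30]

ranking_quality = empirical_auc(pos_scores, neg_scores)            # perfect ranking
accuracy_at_half = accuracy(pos_scores, neg_scores, 0.5)           # both positives are missed
loss_good = pairwise_logistic_loss(pos_scores, neg_scores)
loss_reversed = pairwise_logistic_loss(neg_scores, pos_scores)     # ranking flipped
```

The scores rank the two classes perfectly but classify poorly at a fixed threshold, so a score function tuned for one criterion need not be optimal for the other.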

Multivariate Outlier Removing for the Risk Prediction of Gas Leakage based Methane Gas (메탄 가스 기반 가스 누출 위험 예측을 위한 다변량 특이치 제거)

  • Dashdondov, Khongorzul;Kim, Mi-Hye
    • Journal of the Korea Convergence Society
    • /
    • v.11 no.12
    • /
    • pp.23-30
    • /
    • 2020
  • In this study, the relationship between natural gas (NG) data and gas-related environmental elements was analyzed using machine learning algorithms to predict the level of gas leakage risk without directly measuring gas leakage data. The study was based on open data provided by the server using the IoT-based remote control Picarro gas sensor specification. When natural gas leaks into the air, it poses a serious problem for air pollution, the environment, and health. The proposed method is a multivariate outlier removal method based on Random Forest (RF) classification for predicting the risk of an NG leak. After unsupervised k-means clustering, the experimental dataset is imbalanced. Therefore, we focus on how well the proposed models predict the medium and high risk levels. We compared the receiver operating characteristic (ROC) curve, accuracy, area under the ROC curve (AUC), and mean standard error (MSE) for each classification model. In our experiments, MOL_RF achieved an accuracy of 99.71%, an AUC of 99.57%, and an MSE of 0.0016.

Prevalence of Aspirin Resistance and Clinical Characteristics in Patients with Cerebral Infarction

  • Choi, Jong-Tae;Shin, Kyung-A;Kim, Young-Kwon
    • Biomedical Science Letters
    • /
    • v.19 no.3
    • /
    • pp.233-238
    • /
    • 2013
  • Aspirin is still the mainstay of antiplatelet therapy in cardiovascular and cerebrovascular disease. However, some patients are not responsive to the antithrombotic action of aspirin. The aim of this study was to assess the prevalence and clinical characteristics of aspirin resistance in patients with cerebral infarction. We tested platelet function in 557 patients who had been treated with aspirin in J general hospital. Platelet function was tested using multiple electrode platelet aggregometry (MEA). Platelet reactivity was expressed as the area under the aggregation curve (AUC, U), and an AUC above 30 was defined as aspirin resistance. Aspirin resistance was detected in 16.2% of patients. There were no significant differences in age or gender between aspirin-resistant and aspirin-sensitive patients. WBC was significantly higher in aspirin-resistant patients (P < .05). HDL-cholesterol was significantly higher in aspirin-sensitive patients (P < .05). Aspirin resistance was positively correlated with platelet count (r = .314, P = .003). The prevalence of aspirin resistance in cerebral infarction was 16.2%, and platelet count was related to aspirin resistance.

VUS and HUM Represented with Mann-Whitney Statistic

  • Hong, Chong Sun;Cho, Min Ho
    • Communications for Statistical Applications and Methods
    • /
    • v.22 no.3
    • /
    • pp.223-232
    • /
    • 2015
  • The area under the ROC curve (AUC), the volume under the ROC surface (VUS) and the hypervolume under the ROC manifold (HUM) are defined and interpreted as probabilities that measure the discriminant power of classification models. AUC, VUS and HUM are expressed with summation and integration notation for discrete and continuous random variables, respectively. The AUC for two discrete random samples is represented as the nonparametric Mann-Whitney statistic. In this work, we define conditional Mann-Whitney statistics to compare more than two discrete random samples, and propose that VUS and HUM be represented as functions of these conditional Mann-Whitney statistics. Three and four discrete random samples with some tied values are generated. Values of VUS and HUM are obtained using the proposed statistics. These values are identical to those obtained by definition; therefore, both VUS and HUM can be represented with the conditional Mann-Whitney statistics proposed in this paper.
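
As a point of reference for the "by definition" values, the sketch below computes the AUC as a normalized Mann-Whitney count and the three-class VUS by brute-force enumeration of triples. The tie weights (1/2 for one tie, 1/6 for a three-way tie) follow one common convention and are an assumption; the conditional Mann-Whitney statistics proposed in the paper are not reproduced here, and the function names are hypothetical.

```python
def mann_whitney_auc(x, y):
    """Estimate P(X < Y) + 0.5 * P(X = Y) from two samples (normalized Mann-Whitney count)."""
    count = sum((a < b) + 0.5 * (a == b) for a in x for b in y)
    return count / (len(x) * len(y))


def vus_three_class(x, y, z):
    """Estimate P(X < Y < Z) over all triples, with fractional credit for ties."""
    total = 0.0
    for a in x:
        for b in y:
            for c in z:
                if a < b < c:
                    total += 1.0        # correctly ordered triple
                elif a == b == c:
                    total += 1.0 / 6.0  # three-way tie: one of six equally likely orderings
                elif a == b < c or a < b == c:
                    total += 0.5        # one adjacent tie
    return total / (len(x) * len(y) * len(z))


auc_with_tie = mann_whitney_auc([1, 2], [2, 3])             # one tied pair out of four
vus_separated = vus_three_class([1, 2], [3, 4], [5, 6])     # perfectly ordered classes
vus_all_tied = vus_three_class([1], [1], [1])               # no discrimination at all
```

For three classes the chance level is 1/6 rather than 1/2, which the fully tied example reproduces.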