A Study on the Adjustment of Posterior Probability for Oversampling when the Target is Rare
Kim, U.N.; Lee, S.K.; Choi, J.H.
When the target event is rare, a widespread strategy is to build a model on an over-sampled data set, that is, a sample that disproportionately over-represents the events. Predicted values from a model fitted to over-sampled data are biased; however, they can easily be corrected to represent the population. In this study, we investigate the relationship between the proportion of rare events in a data mart and model performance, using real-world data from a Korean credit card company. We also apply two methods for adjusting the posterior probability for over-sampled data: the offset method and the weighted method. Finally, we compare the performance of the two methods on real data sets.
Over-sampling; adjustment of posterior probability; rare event; offset method; weighted method
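
The abstract does not state the adjustment formulas, so the following is only a minimal sketch of the prior-correction forms commonly used for over-sampled data, not necessarily the authors' exact procedure. It assumes the population event rate and the event rate in the over-sampled data are both known; the names `pop_rate`, `sample_rate`, and all function names are hypothetical and chosen for illustration.

```python
import math


def adjust_posterior(p_sample, pop_rate, sample_rate):
    """Rescale a posterior from an over-sampled model to the population scale.

    p_sample    -- event probability predicted by the over-sampled model
    pop_rate    -- event rate in the population (assumed known)
    sample_rate -- event rate in the over-sampled training data
    """
    event_part = p_sample * pop_rate / sample_rate
    nonevent_part = (1.0 - p_sample) * (1.0 - pop_rate) / (1.0 - sample_rate)
    return event_part / (event_part + nonevent_part)


def logit_offset(pop_rate, sample_rate):
    """Offset to subtract from the over-sampled model's logit (offset approach)."""
    return math.log(sample_rate * (1.0 - pop_rate) / (pop_rate * (1.0 - sample_rate)))


def adjust_via_offset(p_sample, pop_rate, sample_rate):
    """Same correction, expressed as a shift of the logit."""
    logit = math.log(p_sample / (1.0 - p_sample)) - logit_offset(pop_rate, sample_rate)
    return 1.0 / (1.0 + math.exp(-logit))


def sampling_weights(pop_rate, sample_rate):
    """Per-observation weights (event, non-event) for a weighted fit."""
    return pop_rate / sample_rate, (1.0 - pop_rate) / (1.0 - sample_rate)


if __name__ == "__main__":
    # Illustrative numbers only: 2% events in the population, 50% after over-sampling.
    print(adjust_posterior(0.5, pop_rate=0.02, sample_rate=0.5))
    print(adjust_via_offset(0.5, pop_rate=0.02, sample_rate=0.5))
    print(sampling_weights(pop_rate=0.02, sample_rate=0.5))
```

In this sketch, a score equal to the over-sampled event rate maps back to the population event rate, and the direct rescaling and the logit shift give the same adjusted probability. The weights are the ones that would be passed to a weighted model fit so that the over-sampled data mimic the population proportions.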