Heterogeneous Ensemble of Classifiers from Under-Sampled and Over-Sampled Data for Imbalanced Data

Kang, Dae-Ki;Han, Min-gyu

  • Received : 2019.01.16
  • Accepted : 2019.01.30
  • Published : 2019.03.31

Abstract

Data imbalance is a common problem that causes serious difficulties in the machine learning process. Sampling is one of the effective methods for addressing it. Over-sampling increases the number of instances, so on imbalanced data it is applied to the minority class. Under-sampling reduces the number of instances and is usually applied to the majority class. We apply both under-sampling and over-sampling to imbalanced data to generate sampled data sets. From these sampled data sets and the original data set, we construct a heterogeneous ensemble of classifiers, using five different learning algorithms. Experimental results on an intrusion detection data set, used as an example of imbalanced data, show that our approach is effective.
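The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The random sampling strategy, the majority vote, the two toy learners (`train_centroid` and `train_1nn`, standing in for the five algorithms the paper uses), and all function names are assumptions made for illustration only.

```python
import random
from collections import Counter

def split_by_class(data):
    """Group (features, label) pairs by label."""
    groups = {}
    for x, y in data:
        groups.setdefault(y, []).append((x, y))
    return groups

def under_sample(data, rng):
    """Randomly drop instances so every class matches the smallest class."""
    groups = split_by_class(data)
    n = min(len(g) for g in groups.values())
    return [item for g in groups.values() for item in rng.sample(g, n)]

def over_sample(data, rng):
    """Randomly duplicate instances so every class matches the largest class."""
    groups = split_by_class(data)
    n = max(len(g) for g in groups.values())
    out = []
    for g in groups.values():
        out.extend(g)
        out.extend(rng.choices(g, k=n - len(g)))
    return out

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def train_centroid(data):
    """Toy learner (assumption): predict the class with the nearest centroid."""
    cents = {
        y: [sum(col) / len(g) for col in zip(*(x for x, _ in g))]
        for y, g in split_by_class(data).items()
    }
    return lambda x: min(cents, key=lambda y: sq_dist(x, cents[y]))

def train_1nn(data):
    """Toy learner (assumption): predict the label of the nearest instance."""
    return lambda x: min(data, key=lambda item: sq_dist(x, item[0]))[1]

def build_ensemble(data, learners, seed=0):
    """Train every learner on the original, under-sampled, and over-sampled
    data, then combine the members by majority vote (assumed combiner)."""
    rng = random.Random(seed)
    views = [data, under_sample(data, rng), over_sample(data, rng)]
    members = [learn(view) for view in views for learn in learners]
    def predict(x):
        votes = Counter(member(x) for member in members)
        return votes.most_common(1)[0][0]
    return predict

# Toy imbalanced data: 20 "normal" instances versus 3 "attack" instances.
majority = [((i * 0.1, i * 0.05), "normal") for i in range(20)]
minority = [((5.0, 5.0), "attack"), ((5.5, 4.5), "attack"), ((4.5, 5.5), "attack")]
data = majority + minority

predict = build_ensemble(data, [train_centroid, train_1nn])
print(predict((5.2, 4.8)))
```

Each learner is trained three times, once per view of the data, so two learners yield six ensemble members. With an even number of members a tie is possible; this sketch resolves it by whichever label was counted first, which a real system would want to handle more deliberately.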

Keywords

Over-sampling;Under-sampling;Heterogeneous ensemble;Imbalanced data

Acknowledgement

Supported by: National Research Foundation of Korea (NRF)