Unsupervized data-driven partitioning of multiclass problems
ICANN'11 Proceedings of the 21th international conference on Artificial neural networks - Volume Part I
Evaluation of a new hybrid algorithm for highly imbalanced classification problems
International Journal of Hybrid Intelligent Systems
The class imbalance problem (when one class has far fewer samples than the others) is of great importance in machine learning because it arises in many critical applications. In this work we introduce the Recursive Partitioning of the Majority Class (REPMAC) algorithm, a new hybrid method for solving imbalanced problems. Using a clustering method, REPMAC recursively splits the majority class into several subsets, creating a decision tree, until the resulting sub-problems are balanced or easy to solve. At that point, a classifier is fitted to each sub-problem. We evaluate the new method on 7 datasets from the UCI repository, finding that REPMAC is more efficient than other methods usually applied to imbalanced datasets.
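The recursive scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: a plain 2-means clustering stands in for the unspecified clustering method, a nearest-centroid rule stands in for the unspecified classifier, each sub-problem pairs one majority cluster with the full minority class, the stopping rule uses only a balance ratio and a depth limit (the "easy to solve" test is omitted), and all names (`two_means`, `build`, `predict`, `max_ratio`) are hypothetical.

```python
import random

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(points, iters=20, seed=0):
    """Plain 2-means (assumed stand-in for the clustering method)."""
    rng = random.Random(seed)
    centers = rng.sample(points, 2)
    for _ in range(iters):
        clusters = ([], [])
        for p in points:
            idx = 0 if sqdist(p, centers[0]) <= sqdist(p, centers[1]) else 1
            clusters[idx].append(p)
        if not clusters[0] or not clusters[1]:
            break  # degenerate split; the caller falls back to a leaf
        centers = [centroid(c) for c in clusters]
    return centers, clusters

def make_leaf(majority, minority):
    """Leaf sub-problem: nearest-centroid classifier (assumed stand-in)."""
    return {"leaf": (centroid(majority), centroid(minority))}

def build(majority, minority, max_ratio=2.0, depth=3):
    """Recursively split the majority class until a sub-problem is balanced."""
    balanced = len(majority) <= max_ratio * len(minority)
    if balanced or depth == 0 or len(majority) < 2:
        return make_leaf(majority, minority)
    centers, clusters = two_means(majority)
    if not clusters[0] or not clusters[1]:
        return make_leaf(majority, minority)
    children = [build(c, minority, max_ratio, depth - 1) for c in clusters]
    return {"centers": centers, "children": children}

def predict(node, x):
    """Route x to the nearest majority cluster, then apply the leaf classifier.
    Returns 1 for the minority class and 0 for the majority class."""
    while "leaf" not in node:
        c0, c1 = node["centers"]
        node = node["children"][0 if sqdist(x, c0) <= sqdist(x, c1) else 1]
    maj_c, min_c = node["leaf"]
    return 1 if sqdist(x, min_c) < sqdist(x, maj_c) else 0

# Synthetic example: two majority blobs (100 samples) and one small minority blob (10).
rng = random.Random(1)
majority = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
majority += [(rng.gauss(10, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
minority = [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(10)]

tree = build(majority, minority)
labels = [predict(tree, p) for p in [(5.0, 5.0), (0.0, 0.0), (10.0, 0.0)]]
```

In this toy setting the internal nodes only route a sample to a majority cluster, and the leaves carry the fitted classifiers, which mirrors the tree-of-sub-problems structure the abstract describes. A real application would replace both stand-ins and add a test for sub-problems that are already easy to solve.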