Evaluation of a new hybrid algorithm for highly imbalanced classification problems

Authors:
Hernán Ahumada;Guillermo L. Grinblat;Lucas C. Uzal;Alejandro Ceccatto;Pablo M. Granitto
Affiliations:
CIFASIS, French Argentine International Center for Information and Systems Sciences, UPCAM France / UNR-CONICET Argentina, Bv 27 de Febrero 210 Bis, 2000 Rosario, Argentina (all authors)
Venue:
International Journal of Hybrid Intelligent Systems
Year:
2011

Abstract

In many classification problems, particularly in critical real-world applications, one of the classes has far fewer samples than the others, a situation usually known as the class imbalance problem. In this work we discuss and evaluate the use of the REPMAC algorithm to solve imbalanced problems. Using a clustering method, REPMAC recursively splits the majority class into several subsets, creating a decision tree, until the resulting sub-problems are balanced or easy to solve. We couple REPMAC with two different clustering methods and three different classifiers and evaluate the new method on several benchmark datasets spanning a wide range of feature counts, sample counts, and imbalance degrees. We also apply our method to a real-world problem, the identification of weed seeds. We find that the good performance of REPMAC is almost independent of the classifier or the clustering method coupled to it, which suggests that its success is mostly related to the use of an appropriate strategy to cope with imbalanced problems.
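
The recursive splitting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the plain 2-means clustering, the function names (`two_means`, `partition`, `leaves`), the parameters (`max_ratio`, `min_size`, `max_depth`), and the choice to pair each majority subset with the whole minority class are all assumptions made for illustration. The abstract's "easy to solve" criterion is not specified here, so the sketch substitutes simple size and depth limits, and it stops short of training a classifier at each leaf.

```python
import random


def two_means(points, iters=20, seed=0):
    """Split points into two groups with a plain 2-means loop (assumed clustering step)."""
    rng = random.Random(seed)
    centers = rng.sample(points, 2)
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            d0 = sum((a - b) ** 2 for a, b in zip(p, centers[0]))
            d1 = sum((a - b) ** 2 for a, b in zip(p, centers[1]))
            groups[0 if d0 <= d1 else 1].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split; the caller turns this node into a leaf
        # Move each center to the mean of its group.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) for g in groups]
    return groups


def partition(majority, minority, max_ratio=2.0, min_size=4, depth=0, max_depth=8):
    """Recursively split the majority class until each sub-problem is roughly balanced.

    Returns a nested dict (a tree). Each leaf records one sub-problem:
    a majority subset to be paired with the full minority class (an assumption).
    """
    if not minority:
        raise ValueError("minority class must not be empty")
    ratio = len(majority) / len(minority)
    leaf = {"leaf": True, "majority": majority,
            "n_minority": len(minority), "ratio": ratio}
    # Hypothetical stopping rules: balanced enough, too small to split, or too deep.
    if ratio <= max_ratio or len(majority) < max(min_size, 2) or depth >= max_depth:
        return leaf
    left, right = two_means(majority, seed=depth)
    if not left or not right:
        return leaf
    children = [partition(part, minority, max_ratio, min_size, depth + 1, max_depth)
                for part in (left, right)]
    return {"leaf": False, "children": children}


def leaves(node):
    """Flatten the tree into its list of leaf sub-problems."""
    if node["leaf"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


if __name__ == "__main__":
    rng = random.Random(0)
    majority = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
                for cx, cy in [(0, 0), (5, 0), (0, 5), (5, 5)] for _ in range(10)]
    minority = [(rng.gauss(2.5, 0.3), rng.gauss(2.5, 0.3)) for _ in range(5)]
    for leaf in leaves(partition(majority, minority)):
        print(len(leaf["majority"]), "majority vs", leaf["n_minority"], "minority")
```

In a fuller version, a classifier would be trained on each leaf's sub-problem. The abstract reports that the choice of classifier and clustering method has little effect on performance, which is why the sketch treats both as interchangeable parts.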