A measure oriented training scheme for imbalanced classification problems

Authors:
Bo Yuan;Wenhuang Liu
Affiliations:
Intelligent Computing Lab, Division of Informatics, Graduate School at Shenzhen, Tsinghua University, Shenzhen, P.R. China;Intelligent Computing Lab, Division of Informatics, Graduate School at Shenzhen, Tsinghua University, Shenzhen, P.R. China
Venue:
PAKDD'11 Proceedings of the 15th international conference on New Frontiers in Applied Data Mining
Year:
2011

Abstract

Because the overall prediction error of a classifier on imbalanced problems can be misleading and biased, classifiers are commonly evaluated with measures such as G-mean and ROC (Receiver Operating Characteristic) curves. However, for many classifiers, the learning process is still largely driven by error-based objective functions. As a result, there is a gap between the measure by which the classifier is evaluated and the way the classifier is trained. This paper investigates the possibility of using the measure itself directly to search the hypothesis space and thereby improve classifier performance. Experimental results on three standard benchmark problems and one real-world problem show that the proposed method is effective in comparison with commonly used sampling techniques.
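To make the gap between an error-based objective and an imbalance-aware measure concrete, the following minimal sketch compares plain accuracy with G-mean on a toy problem. It assumes the usual binary definition of G-mean as the geometric mean of sensitivity and specificity; the data, the two sets of predictions, and the helper names (`confusion_counts`, `accuracy`, `g_mean`) are hypothetical illustrations and are not taken from the paper, nor do they reproduce the authors' training scheme.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp) for a binary problem."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def accuracy(y_true, y_pred):
    """Fraction of correct predictions (1 - overall error)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical toy data: 95 majority (0) and 5 minority (1) examples.
y_true = [0] * 95 + [1] * 5

# Classifier A always predicts the majority class.
pred_majority = [0] * 100

# Classifier B makes 10 false alarms but finds 4 of the 5 minority examples.
pred_balanced = [0] * 85 + [1] * 10 + [1] * 4 + [0] * 1

for name, pred in [("majority-only", pred_majority), ("balanced", pred_balanced)]:
    print(name, "accuracy:", accuracy(y_true, pred), "G-mean:", round(g_mean(y_true, pred), 3))
```

In this toy setting accuracy ranks the majority-only classifier higher, while G-mean ranks it lowest because it never detects the minority class. A learner driven purely by overall error could therefore settle on the hypothesis that the evaluation measure considers worst.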