Cost-sensitive decision tree ensembles for effective imbalanced classification

  • Authors:
  • Bartosz Krawczyk;Michał Woźniak;Gerald Schaefer

  • Venue:
  • Applied Soft Computing
  • Year:
  • 2014

Abstract

Real-life datasets are often imbalanced, that is, there are significantly more training samples available for some classes than for others, and consequently the conventional aim of maximising overall classification accuracy is not appropriate when dealing with such problems. Various approaches have been introduced in the literature to deal with imbalanced datasets, typically based on oversampling, undersampling or cost-sensitive classification. In this paper, we introduce an effective ensemble of cost-sensitive decision trees for imbalanced classification. Base classifiers are constructed according to a given cost matrix, but are trained on random feature subspaces to ensure sufficient diversity of the ensemble members. We employ an evolutionary algorithm for simultaneous classifier selection and assignment of committee member weights for the fusion process. Our proposed algorithm is evaluated on a variety of benchmark datasets, and is confirmed to lead to improved recognition of the minority class, to be capable of outperforming other state-of-the-art algorithms, and hence to represent a useful and effective approach for dealing with imbalanced datasets.
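
The abstract combines three ingredients: cost-sensitive decision trees built from a cost matrix, random feature subspaces for ensemble diversity, and evolutionary selection and weighting of committee members for fusion. The sketch below illustrates that general scheme in Python with scikit-learn; it is not the authors' implementation, and the class-weight values, subspace size, and the simple random search standing in for the evolutionary step are assumptions made purely for illustration.

```python
# Illustrative sketch only: cost-sensitive trees on random feature subspaces,
# combined by weighted voting. The cost matrix is approximated here via
# class_weight, and a crude random search stands in for the evolutionary
# selection/weighting step described in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)

# Imbalanced toy data (assumed 1:9 minority/majority ratio).
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

n_members, subspace_size = 15, 10
members = []
for _ in range(n_members):
    # Random feature subspace for this committee member.
    feats = rng.choice(X.shape[1], size=subspace_size, replace=False)
    # Cost sensitivity approximated by penalising minority-class errors more.
    tree = DecisionTreeClassifier(class_weight={0: 1.0, 1: 9.0}, random_state=0)
    tree.fit(X_tr[:, feats], y_tr)
    members.append((feats, tree))

def ensemble_predict(weights, X_in):
    # Weighted soft voting over committee members.
    votes = sum(w * t.predict_proba(X_in[:, f])
                for w, (f, t) in zip(weights, members))
    return votes.argmax(axis=1)

# Random search over sparse weight vectors (selection + weighting in one step).
best_w, best_score = np.ones(n_members), 0.0
for _ in range(200):
    w = rng.random(n_members) * (rng.random(n_members) > 0.3)
    score = balanced_accuracy_score(y_tr, ensemble_predict(w, X_tr))
    if score > best_score:
        best_w, best_score = w, score

print("test balanced accuracy:",
      balanced_accuracy_score(y_te, ensemble_predict(best_w, X_te)))
```

Zeroed weights in the search play the role of removing a member from the committee, while non-zero weights set its influence in the fusion; the paper performs this selection and weighting jointly with an evolutionary algorithm rather than the random search used here.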