Integrating selective pre-processing of imbalanced data with Ivotes ensemble
RSCTC'10 Proceedings of the 7th international conference on Rough sets and current trends in computing
In this paper we present IIvotes, a new framework for constructing an ensemble of classifiers from imbalanced data. IIvotes incorporates the SPIDER method for selective data pre-processing into the adaptive Ivotes ensemble. This integration aims to improve the balance between sensitivity and specificity, evaluated by the G-mean measure for the minority class, in comparison with single classifiers combined with SPIDER. Using SPIDER to pre-process the learning samples inside the ensemble improves the sensitivity of the derived component classifiers, while the controlling mechanism of IIvotes ensures that overall accuracy, and thus specificity, is kept at a reasonable level. The proposed IIvotes ensemble was thoroughly evaluated in a series of experiments in which we tested it with symbolic (decision trees and rules) and non-symbolic (Naive Bayes) component classifiers. The results confirmed that combining SPIDER with an ensemble improved performance in terms of the G-mean measure over a single classifier with SPIDER, for all tested types of classifiers and for both SPIDER pre-processing options (weak and strong amplification). These advantages were especially evident for decision trees and rules, where the differences between single and ensemble classifiers with SPIDER were more significant, for both pre-processing options, than for Naive Bayes. Moreover, the results demonstrated the advantages of a special abstaining classification strategy inside IIvotes rule ensembles, in which component rule-based classifiers may refrain from predicting a class when in doubt: abstaining rule ensembles performed considerably better with regard to G-mean than their non-abstaining variants.
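The G-mean measure used throughout the evaluation is the geometric mean of sensitivity (recall on the minority class) and specificity (recall on the majority class). A minimal sketch of its computation from a binary confusion matrix (the function name and signature are illustrative, not from the paper):

```python
import math

def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity for a binary
    confusion matrix, with the minority class as the positive class."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # minority-class recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # majority-class recall
    return math.sqrt(sensitivity * specificity)

# A classifier that ignores the minority class entirely can still score
# high accuracy on imbalanced data, but its G-mean collapses to zero:
# g_mean(0, 50, 100, 0) == 0.0, even though accuracy is 100/150.
```

This is why the abstract speaks of balancing sensitivity against specificity: G-mean rewards only classifiers that do well on both classes at once.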
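The abstaining strategy means a component rule classifier may return no prediction when its rules do not cover an example unambiguously, and the ensemble aggregates only the votes that were actually cast. The following is a simple illustration of majority voting over possibly abstaining components; it is a hedged sketch of the general idea, not the authors' exact aggregation rule, and the fallback class for the all-abstain case is an assumption:

```python
from collections import Counter

def vote_with_abstaining(component_predictions, fallback):
    """Majority vote over component classifiers, where None marks an
    abstaining component. Abstentions simply cast no vote; if every
    component abstains, the (assumed) fallback class is returned."""
    votes = [p for p in component_predictions if p is not None]
    if not votes:
        return fallback  # all components refrained from predicting
    return Counter(votes).most_common(1)[0][0]
```

For example, `vote_with_abstaining(["pos", None, "pos", "neg"], "pos")` resolves to `"pos"`: the abstaining component neither supports nor dilutes the majority, which is how uncertain components can stay out of a decision instead of adding noise.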