Pruning and dynamic scheduling of cost-sensitive ensembles

  • Authors:
  • Wei Fan, Fang Chu, Haixun Wang, Philip S. Yu

  • Affiliations:
  • IBM T.J. Watson Research, Hawthorne, NY; Computer Science Department, University of California, Los Angeles, CA; IBM T.J. Watson Research, Hawthorne, NY; IBM T.J. Watson Research, Hawthorne, NY

  • Venue:
  • Eighteenth National Conference on Artificial Intelligence (AAAI-02)
  • Year:
  • 2002

Abstract

Previous research has shown that averaging ensembles can scale up learning over very large cost-sensitive datasets with a linear speedup, independent of the learning algorithm. At the same time, they achieve the same or even better accuracy than a single model computed from the entire dataset. However, one major drawback is prediction inefficiency: every base model in the ensemble must be consulted to produce a final prediction. In this paper, we propose several approaches to reduce the number of base classifiers. Among the methods explored, our empirical studies show that the benefit-based greedy approach can safely remove more than 90% of the base models while maintaining or even exceeding the prediction accuracy of the original ensemble. Assuming that each base classifier consumes one unit of prediction time, removing 90% of the base classifiers translates to a tenfold prediction speedup. On top of pruning, we propose a novel dynamic scheduling approach to further reduce the "expected" number of classifiers employed in prediction. It measures the confidence of a prediction made by a subset of classifiers in the pruned ensemble; this confidence is used to decide whether more classifiers are needed to produce the same prediction as the original unpruned ensemble. This approach reduces the "expected" number of classifiers by another 25% to 75% without loss of accuracy.
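
The abstract does not include code, so the following is a minimal Python sketch of the two ideas under assumptions of our own: each base model outputs a probability of the positive class, cost-sensitivity is captured by a benefit matrix indexed by (true label, predicted label), and the early-stopping rule shown is a simplified worst-case bound rather than the statistical confidence measure the paper describes. The function names (ensemble_benefit, greedy_prune, dynamic_predict) are hypothetical, not from the paper.

```python
import numpy as np

def ensemble_benefit(member_probs, y, benefit_matrix):
    """Total benefit when the simple average of the members' P(positive)
    estimates is thresholded at 0.5.
    member_probs: list of (n,) arrays; y: (n,) int array of 0/1 labels;
    benefit_matrix[true_label, predicted_label] gives the benefit."""
    avg = np.mean(member_probs, axis=0)
    pred = (avg >= 0.5).astype(int)
    return benefit_matrix[y, pred].sum()

def greedy_prune(probs, y, benefit_matrix, max_size=None):
    """Benefit-based greedy pruning (a sketch): grow a subset of base
    models, at each step adding the model whose inclusion most increases
    total benefit on validation data; stop when no model helps."""
    remaining = list(range(len(probs)))
    chosen, chosen_probs = [], []
    best_total = -np.inf
    while remaining and (max_size is None or len(chosen) < max_size):
        total, best_i = max(
            (ensemble_benefit(chosen_probs + [probs[i]], y, benefit_matrix), i)
            for i in remaining
        )
        if total <= best_total:  # no further improvement: stop growing
            break
        best_total = total
        chosen.append(best_i)
        chosen_probs.append(probs[best_i])
        remaining.remove(best_i)
    return chosen

def dynamic_predict(probs_for_x, threshold=0.5):
    """Dynamic scheduling (simplified): consult the pruned members one
    at a time and stop once the still-unseen members, whose outputs lie
    in [0, 1], can no longer flip the averaged decision.
    Returns (prediction, number_of_models_consulted)."""
    k = len(probs_for_x)
    running = 0.0
    for t, p in enumerate(probs_for_x, start=1):
        running += p
        lo = running / k               # final average if the rest output 0
        hi = (running + (k - t)) / k   # final average if the rest output 1
        if lo >= threshold:
            return 1, t                # decision already fixed: positive
        if hi < threshold:
            return 0, t                # decision already fixed: negative
    return int(running / k >= threshold), k
```

In this sketch, probs would hold each base model's P(positive) on a validation set, greedy_prune returns the indices of the retained models, and dynamic_predict is then called per test instance with the retained models' outputs. The worst-case bound guarantees the early prediction matches the full pruned-ensemble average; the paper's confidence-based rule instead trades a statistical guarantee for consulting even fewer classifiers on average.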