Ensemble pruning via individual contribution ordering

Authors:
Zhenyu Lu;Xindong Wu;Xingquan Zhu;Josh Bongard
Affiliations:
University of Vermont, Burlington, VT, USA;University of Vermont, Burlington, VT, USA;University of Technology, Sydney, Australia;University of Vermont, Burlington, VT, USA
Venue:
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2010

Abstract

An ensemble is a set of learned models that make decisions collectively. Although an ensemble is usually more accurate than a single learner, existing ensemble methods often construct unnecessarily large ensembles, which increases memory consumption and computational cost. Ensemble pruning tackles this problem by selecting a subset of ensemble members to form subensembles that consume fewer resources and respond faster, with accuracy similar to or better than that of the original ensemble. In this paper, we analyze the accuracy/diversity trade-off and prove that classifiers that are more accurate and make more predictions in the minority group are more important for subensemble construction. Based on these insights, we propose a heuristic metric that considers both accuracy and diversity to explicitly evaluate each individual classifier's contribution to the whole ensemble. By incorporating ensemble members in decreasing order of their contributions, subensembles are formed such that users can select the top $p$ percent of ensemble members for prediction, depending on their resource availability and tolerable waiting time. Experimental results on 26 UCI data sets show that subensembles formed by the proposed EPIC (Ensemble Pruning via Individual Contribution ordering) algorithm outperform both the original ensemble and a state-of-the-art ensemble pruning method, Orientation Ordering (OO).
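The ordering-and-truncation idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the EPIC contribution metric, so the score below is a hypothetical stand-in rather than the authors' formula: a correct prediction earns more when fewer ensemble members cast the same vote (a minority-group vote), and an incorrect prediction receives a flat penalty. The function names `contribution_scores` and `prune_top_percent`, and the use of a held-out pruning set, are likewise assumptions made for illustration.

```python
from collections import Counter
from math import ceil


def contribution_scores(predictions, labels):
    """Score each ensemble member on a held-out pruning set.

    predictions[i][j] is the label predicted by member i for example j.
    NOTE: this scoring rule is an illustrative assumption, not the
    metric defined in the paper.
    """
    n_members = len(predictions)
    scores = [0.0] * n_members
    for j, truth in enumerate(labels):
        # Count how many members voted for each label on example j.
        votes = Counter(member[j] for member in predictions)
        for i, member in enumerate(predictions):
            share = votes[member[j]] / n_members  # fraction casting the same vote
            if member[j] == truth:
                # Correct vote: reward grows as the vote becomes rarer.
                scores[i] += 2.0 - share
            else:
                # Incorrect vote: flat penalty (assumed).
                scores[i] -= 1.0
    return scores


def prune_top_percent(predictions, labels, p):
    """Return indices of the top p percent of members, best first."""
    scores = contribution_scores(predictions, labels)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = max(1, ceil(len(order) * p / 100))  # always keep at least one member
    return order[:keep]


if __name__ == "__main__":
    # Four hypothetical members evaluated on four pruning examples.
    preds = [
        [1, 0, 1, 1],
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [1, 0, 0, 1],
    ]
    truth = [1, 0, 1, 1]
    print(prune_top_percent(preds, truth, 50))
```

Because members are ranked once, a user can change $p$ without re-scoring: any prefix of the ordered list is a valid subensemble, which matches the abstract's point that the subensemble size can follow the available resources.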