Diversity versus Quality in Classification Ensembles Based on Feature Selection

Authors:
Padraig Cunningham;John Carney
Affiliations:
-;-
Venue:
ECML '00 Proceedings of the 11th European Conference on Machine Learning
Year:
2000

Citing 4

A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms

Artificial Intelligence Review - Special issue on lazy learning
The Random Subspace Method for Constructing Decision Forests

IEEE Transactions on Pattern Analysis and Machine Intelligence
Nearest Neighbors in Random Subspaces

SSPR '98/SPR '98 Proceedings of the Joint IAPR International Workshops on Advances in Pattern Recognition
Using Introspective Learning to Improve Retrieval in CBR: A Case Study in Air Traffic Control

ICCBR '97 Proceedings of the Second International Conference on Case-Based Reasoning Research and Development

Cited 22

A Dynamic Approach to Reducing Dialog in On-Line Decision Guides

EWCBR '00 Proceedings of the 5th European Workshop on Advances in Case-Based Reasoning
An Approach to Aggregating Ensembles of Lazy Learners That Supports Explanation

ECCBR '02 Proceedings of the 6th European Conference on Advances in Case-Based Reasoning
Case Representation Issues for Case-Based Reasoning from Ensemble Research

ICCBR '01 Proceedings of the 4th International Conference on Case-Based Reasoning: Case-Based Reasoning Research and Development
Bagging and Boosting for the Nearest Mean Classifier: Effects of Sample Size on Diversity and Accuracy

MCS '02 Proceedings of the Third International Workshop on Multiple Classifier Systems
Using Diversity with Three Variants of Boosting: Aggressive, Conservative, and Inverse

MCS '02 Proceedings of the Third International Workshop on Multiple Classifier Systems
Using Diversity in Preparing Ensembles of Classifiers Based on Different Feature Subsets to Minimize Generalization Error

ECML '01 Proceedings of the 12th European Conference on Machine Learning
Theoretical Bounds of Majority Voting Performance for a Binary Classification Problem

IEEE Transactions on Pattern Analysis and Machine Intelligence
k-NN Aggregation with a Stacked Email Representation

ECCBR '08 Proceedings of the 9th European conference on Advances in Case-Based Reasoning
Dynamic adaptive ensemble case-based reasoning: application to stock market prediction

Expert Systems with Applications: An International Journal
Co-training with relevant random subspaces

Neurocomputing
Computer-aided diagnosis of pulmonary nodules using a two-step approach for feature selection and classifier ensemble construction

Artificial Intelligence in Medicine
A class-specific ensemble feature selection approach for classification problems

Proceedings of the 48th Annual Southeast Regional Conference
Search strategies for ensemble feature selection in medical diagnostics

CBMS'03 Proceedings of the 16th IEEE conference on Computer-based medical systems
Boosting feature selection

ICAPR'05 Proceedings of the Third international conference on Advances in Pattern Recognition - Volume Part I
An evolutionary and attribute-oriented ensemble classifier

ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part II
Observations on boosting feature selection

MCS'05 Proceedings of the 6th international conference on Multiple Classifier Systems
Succinct and informative cluster descriptions for document repositories

WAIM '06 Proceedings of the 7th international conference on Advances in Web-Age Information Management
Explaining the output of ensembles in medical decision support on a case by case basis

Artificial Intelligence in Medicine
Ensemble-based regression analysis of multimodal medical data for osteopenia diagnosis

Expert Systems with Applications: An International Journal
A survey of multiple classifier systems as hybrid systems

Information Fusion
Dissimilarity based ensemble of extreme learning machine for gene expression data classification

Neurocomputing
Combining multiple predictive models using genetic algorithms

Intelligent Data Analysis - Combined Learning Methods and Mining Complex Data

Abstract

Feature subset selection has emerged as a useful technique for creating diversity in ensembles, particularly in classification ensembles. In this paper we argue that this diversity needs to be monitored during the creation of the ensemble. We propose an entropy measure of the outputs of the ensemble members as a measure of ensemble diversity. Further, we show that the associated conditional entropy works well as a loss function (error measure), and that the entropy in the ensemble is a good predictor of the reduction in error due to the ensemble. These measures are evaluated on a medical prediction problem and are shown to predict the performance of the ensemble well. We also show that the entropy measure of diversity has the added advantage that it appears to model the change in diversity with the size of the ensemble.
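
The abstract does not state the formula for the entropy measure, so the following is only a minimal sketch of one plausible reading: for each example, take the Shannon entropy of the class labels output by the ensemble members, then average over examples. The function names `vote_entropy` and `ensemble_diversity`, the use of base-2 logarithms, and the averaging step are all assumptions made for illustration, not the authors' definition.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (in bits) of the labels that the ensemble
    members output for a single example. Assumed form, not taken
    from the paper."""
    counts = Counter(votes)
    n = len(votes)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def ensemble_diversity(predictions):
    """Mean per-example vote entropy.

    predictions[m][i] is the label that member m assigns to example i.
    Averaging over examples is an assumption of this sketch.
    """
    per_example = list(zip(*predictions))  # regroup by example
    return sum(vote_entropy(v) for v in per_example) / len(per_example)


# Hypothetical outputs of four ensemble members on three examples.
predictions = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
]
diversity = ensemble_diversity(predictions)
```

Under this reading, an example on which all members agree contributes zero entropy, and an even split between two classes contributes one bit, so a higher average indicates more disagreement among the members.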