Diversity is a key characteristic for obtaining the advantages of combining predictors. In this paper, we propose a modification of bagging that explicitly trades off diversity and individual accuracy. The procedure consists of dividing the bootstrap replicate obtained at each iteration of the algorithm into two subsets: one containing the examples misclassified by the ensemble obtained at the previous iteration, and the other containing the examples correctly recognized. High individual accuracy of a new classifier on the first subset increases diversity, measured as the value of the Q statistic between the new classifier and the existing classifier ensemble; high accuracy on the second subset, on the other hand, decreases diversity. We trade off between the two components of individual accuracy using a parameter λ ∈ [0, 1] that changes the cost of a misclassification on the second subset. Experiments are reported on well-known classification problems from the UCI repository, and results are compared with boosting and bagging.
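The procedure described above can be sketched in a few lines. This is a hedged toy illustration, not the paper's implementation: 1-D decision stumps stand in for the base classifiers, and the λ cost on correctly recognized examples is realized as per-example sample weights (the names `lambda_bagging` and `fit_weighted_stump` are my own, not from the paper).

```python
import numpy as np

def stump_predict(thr, polarity, X):
    """Predict 1 when polarity * (x - thr) > 0, else 0."""
    return ((polarity * (X - thr)) > 0).astype(int)

def ensemble_predict(stumps, X):
    """Majority vote over a list of (threshold, polarity) stumps."""
    if not stumps:
        return np.zeros(len(X), dtype=int)   # empty ensemble: predict class 0
    votes = sum(2 * stump_predict(thr, pol, X) - 1 for thr, pol in stumps)
    return (votes >= 0).astype(int)

def fit_weighted_stump(X, y, w):
    """Exhaustively pick the (threshold, polarity) minimizing weighted 0-1 loss."""
    best_err, best = np.inf, None
    for thr in np.unique(X):
        for pol in (1, -1):
            err = np.sum(w * (stump_predict(thr, pol, X) != y))
            if err < best_err:
                best_err, best = err, (thr, pol)
    return best

def lambda_bagging(X, y, n_rounds=10, lam=0.5, seed=0):
    """Bagging variant: examples the current ensemble already classifies
    correctly are down-weighted by lam, so the new classifier focuses on
    the misclassified subset (raising diversity) at a cost controlled by lam."""
    rng = np.random.default_rng(seed)
    stumps = []
    for _ in range(n_rounds):
        idx = rng.integers(0, len(X), size=len(X))      # bootstrap replicate
        Xb, yb = X[idx], y[idx]
        correct = ensemble_predict(stumps, Xb) == yb    # split the replicate
        w = np.where(correct, lam, 1.0)                 # misclassification on the
        stumps.append(fit_weighted_stump(Xb, yb, w))    # "easy" subset costs lam
    return stumps
```

With λ = 1 every example costs the same and the procedure reduces to ordinary bagging of weighted learners; with λ = 0 each new classifier attends only to the examples the current ensemble gets wrong, which is the diversity-maximizing extreme.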