Bagging with asymmetric costs for misclassified and correctly classified examples

  • Authors:
  • Ricardo Ñanculef;Carlos Valle;Héctor Allende;Claudio Moraga

  • Affiliations:
  • Universidad Técnica Federico Santa María, Departamento de Informática, Valparaíso, Chile;Universidad Técnica Federico Santa María, Departamento de Informática, Valparaíso, Chile;Universidad Técnica Federico Santa María, Departamento de Informática, Valparaíso, Chile;European Centre for Soft Computing, Mieres, Asturias, Spain and Dortmund University, Dortmund, Germany

  • Venue:
  • CIARP'07: Proceedings of the 12th Iberoamerican Congress on Pattern Recognition: Progress in Pattern Recognition, Image Analysis and Applications
  • Year:
  • 2007

Abstract

Diversity is a key characteristic for obtaining the advantages of combining predictors. In this paper, we propose a modification of bagging that explicitly trades off diversity against individual accuracy. The procedure consists of dividing the bootstrap replicate obtained at each iteration of the algorithm into two subsets: one containing the examples misclassified by the ensemble obtained at the previous iteration, and the other containing the examples it classified correctly. High individual accuracy of a new classifier on the first subset increases diversity, measured as the value of the Q statistic between the new classifier and the existing classifier ensemble. High accuracy on the second subset, on the other hand, decreases diversity. We trade off the two components of individual accuracy using a parameter λ ∈ [0, 1] that changes the cost of a misclassification on the second subset. Experiments are reported on well-known classification problems from the UCI repository, and the results are compared with boosting and bagging.
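
The abstract does not give the exact loss or base learner, so the following is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes labels in {-1, +1}, a weighted decision stump as a stand-in base classifier, a majority vote for the ensemble, and a weight of 1 for examples the current ensemble misclassifies and λ for examples it classifies correctly. The function names (`q_statistic`, `fit_stump`, `vote`, `asymmetric_bagging`) are hypothetical. The Q statistic is computed in its usual pairwise form, Q = (N11·N00 − N01·N10) / (N11·N00 + N01·N10), where N11 counts examples both classifiers get right, N00 those both get wrong, and N10, N01 those only one gets right.

```python
import random


def q_statistic(correct_a, correct_b):
    """Yule's Q from per-example correctness flags of two classifiers."""
    pairs = list(zip(correct_a, correct_b))
    n11 = sum(1 for a, b in pairs if a and b)
    n00 = sum(1 for a, b in pairs if not a and not b)
    n10 = sum(1 for a, b in pairs if a and not b)
    n01 = sum(1 for a, b in pairs if not a and b)
    num = n11 * n00 - n01 * n10
    den = n11 * n00 + n01 * n10
    return num / den if den else 0.0  # convention: 0 when undefined


def fit_stump(X, y, w):
    """Assumed base learner: the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, -1):
                err = sum(wi for x, yi, wi in zip(X, y, w)
                          if (sign if x[j] >= t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    _, j, t, sign = best
    return lambda x: sign if x[j] >= t else -sign


def vote(ensemble, x):
    """Unweighted majority vote; ties are resolved as +1."""
    return 1 if sum(h(x) for h in ensemble) >= 0 else -1


def asymmetric_bagging(X, y, n_rounds, lam, seed=0):
    """Bagging where examples the current ensemble gets right cost lam."""
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap replicate
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        if ensemble:
            # Assumption: cost 1 on the misclassified subset, lam on the rest.
            w = [1.0 if vote(ensemble, x) != yi else lam
                 for x, yi in zip(Xb, yb)]
        else:
            w = [1.0] * n  # first round: ordinary bagging
        if sum(w) == 0:
            w = [1.0] * n  # lam = 0 and no ensemble errors in this replicate
        ensemble.append(fit_stump(Xb, yb, w))
    return ensemble
```

Under these assumptions, λ = 1 weights both subsets equally and reduces to ordinary bagging, while λ = 0 trains each new classifier only on the examples the current ensemble gets wrong, which pushes Q between the new classifier and the ensemble downward (more diversity) at the expense of accuracy on the second subset.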