Supervised subspace projections for constructing ensembles of classifiers

  • Authors:
  • Nicolás García-Pedrajas; Jesús Maudes-Raedo; César García-Osorio; Juan J. Rodríguez-Díez

  • Affiliations:
  • Department of Computing and Numerical Analysis, University of Córdoba, Spain; Department of Civil Engineering, University of Burgos, Spain (ADMIRABLE group, http://pisuerga.inf.ubu.es/ADMIRABLE/)

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2012

Abstract

We present a method for constructing ensembles of classifiers using supervised projections of random subspaces. The method combines the philosophy of boosting, focusing on difficult instances, with the improved accuracy achieved by supervised projection methods to obtain low testing error. To achieve both accuracy and diversity, a random subspace is created at each step, and within that subspace a supervised projection is obtained using only the misclassified instances. The next classifier is then trained on all available examples in the space given by the supervised projection. The method is compared with AdaBoost and other ensemble methods on a set of 32 problems from the UCI Machine Learning Repository. In terms of testing error, it obtains results that are significantly better than AdaBoost and the random subspace method, using a decision tree as the base learner. Furthermore, its robustness in the presence of class label noise exceeds that of AdaBoost. A study using κ-error diagrams shows that the proposed method improves on boosting by obtaining classifiers that are both diverse and more accurate. The decomposition of testing error into bias and variance terms shows that our method reduces the bias term of the error more than Bagging does, and reduces the variance term more than AdaBoost does.
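To make the procedure described in the abstract concrete, the following is a minimal, illustrative sketch (not the authors' reference implementation). It assumes scikit-learn's PLSRegression as a stand-in for the paper's supervised projection, decision trees as the base learners, and simple majority voting to combine the ensemble members; the function and parameter names are hypothetical.

```python
# Sketch: ensemble built from supervised projections of random subspaces.
# Assumptions: PLSRegression approximates the supervised projection,
# DecisionTreeClassifier is the base learner, majority vote combines members.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.cross_decomposition import PLSRegression

def build_ensemble(X, y, n_classifiers=10, subspace_ratio=0.5,
                   n_components=2, random_state=0):
    rng = np.random.default_rng(random_state)
    n_features = X.shape[1]
    subspace_size = max(n_components, int(subspace_ratio * n_features))
    ensemble = []                                   # (feature_idx, projection, tree)
    misclassified = np.ones(len(y), dtype=bool)     # initially treat every instance as "difficult"

    for _ in range(n_classifiers):
        # 1. Draw a random subspace of the original features.
        feats = rng.choice(n_features, size=subspace_size, replace=False)
        X_sub = X[:, feats]

        # If too few difficult instances remain, fall back to all instances.
        if misclassified.sum() <= n_components:
            misclassified[:] = True

        # 2. Fit a supervised projection using only the misclassified instances.
        proj = PLSRegression(n_components=n_components)
        proj.fit(X_sub[misclassified], y[misclassified])

        # 3. Train the next classifier on ALL examples in the projected space.
        X_proj = proj.transform(X_sub)
        tree = DecisionTreeClassifier(random_state=random_state)
        tree.fit(X_proj, y)
        ensemble.append((feats, proj, tree))

        # 4. Update the set of difficult (misclassified) instances for the next step.
        misclassified = tree.predict(X_proj) != y

    return ensemble

def predict(ensemble, X):
    # Majority vote over the members' predictions (assumes integer class labels).
    votes = np.array([tree.predict(proj.transform(X[:, feats]))
                      for feats, proj, tree in ensemble]).astype(int)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```

In this sketch the supervised projection is refit at every step from the current set of misclassified instances, which is what ties the boosting-like focus on difficult examples to the subspace construction; the actual projection technique and combination rule used in the paper may differ.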