Risk function estimation for subproblems in a hierarchical classifier

  • Authors:
  • I. T. Podolak;A. Roman

  • Affiliations:
  • Institute of Computer Science, Jagiellonian University, Łojasiewicza 6, 30-348 Cracow, Poland;Institute of Computer Science, Jagiellonian University, Łojasiewicza 6, 30-348 Cracow, Poland

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2011

Quantified Score

Hi-index 0.10

Visualization

Abstract

One of the solutions to the classification problem are the ensemble methods, in particular a hierarchical approach. This method bases on dynamically splitting the original problem during training into smaller subproblems which should be easier to train. Then the answers are combined together to obtain the final classification. The main problem here is how to divide (cluster) the original problem to obtain best possible accuracy expressed in terms of risk function value. The exact value for a given clustering is known only after the whole training process. In this paper we propose the risk estimation method based on the analysis of the root classifier. This makes it possible to evaluate the risks for all subproblems without any training of children classifiers. Together with some earlier theoretical results on hierarchical approach, we show how to use the proposed method to evaluate the risk for the whole ensemble. A variant, which uses a genetic algorithm (GA), is proposed. We compare this method with an earlier one, based on the Bayes law. We show that the subproblem risk evaluation is highly correlated with the true risk, and that the Bayes/GA approaches give hierarchical classifiers which are superior to single ones. Our method works for any classifier which returns a class probability vector for a given example.