Creating Ensembles of Classifiers

  • Authors:
  • Nitesh V. Chawla;Steven Eschrich;Lawrence O. Hall

  • Affiliations:
  • -;-;-

  • Venue:
  • ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

Ensembles of classifiers offer promise in increasing overall classification accuracy. The availability of extremely large datasets has opened avenues for application of distributed and/or parallel learning to efficiently learn models of them. In this paper, distributed learningis done by training classifiers on disjoint subsets of the data. We examine a random partitioning method to create disjoint subsets and propose a more intelligent way of partitioning into disjointsubsets using clustering. It was observed that the intelligent method of partitioning generally performs better than random partitioning for our datasets. In both methods a significant gain in accuracy may be obtained by applying bagging to each of the disjoint subsets, creating multiple diverse classifiers. The significance of our finding is that a partition strategy for even small/moderate sized datasets when combined with bagging can yield better performancethan applying a single learner using the entire dataset.