Distributed Pasting of Small Votes

  • Authors:
  • Nitesh V. Chawla; Lawrence O. Hall; Kevin W. Bowyer; Thomas E. Moore; W. Philip Kegelmeyer

  • Venue:
  • MCS '02 Proceedings of the Third International Workshop on Multiple Classifier Systems
  • Year:
  • 2002

Abstract

Bagging and boosting are two popular ensemble methods that achieve better accuracy than a single classifier. These techniques have limitations on massive datasets, as the size of the dataset itself can become a bottleneck. Voting many classifiers built on small subsets of data ("pasting small votes") is a promising approach for learning from massive datasets. Pasting small votes can harness the power of boosting and bagging while potentially scaling up to massive datasets. We propose a framework for building hundreds or thousands of such classifiers on small subsets of data in a distributed environment. Experiments show that this approach is fast, accurate, and scalable to massive datasets.
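
The sketch below illustrates the core "pasting small votes" idea the abstract describes: train many classifiers on small random subsets ("bites") of the data and combine them by majority vote (an Rvote-style scheme). It is a minimal, single-machine sketch, not the paper's distributed framework; the dataset, classifier choice (scikit-learn decision trees), and parameters such as the bite size and ensemble size are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative parameters (not taken from the paper).
N_CLASSIFIERS = 100   # number of small votes
BITE_SIZE = 800       # size of each small subset ("bite")

# Synthetic stand-in for a massive dataset.
X, y = make_classification(n_samples=50_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

rng = np.random.default_rng(0)
ensemble = []
for _ in range(N_CLASSIFIERS):
    # Rvote-style bite: a small random subset of the training data.
    idx = rng.choice(len(X_train), size=BITE_SIZE, replace=False)
    clf = DecisionTreeClassifier(random_state=0)
    clf.fit(X_train[idx], y_train[idx])
    ensemble.append(clf)

# Combine the small votes by unweighted majority voting.
votes = np.stack([clf.predict(X_test) for clf in ensemble])
majority = np.apply_along_axis(
    lambda col: np.bincount(col).argmax(), axis=0, arr=votes
)
print("ensemble accuracy:", accuracy_score(y_test, majority))
```

In the paper's setting, the loop over bites would instead be spread across processors, with each node building its own subset of the classifiers; the paper also considers importance-sampled bites (Ivote-style) rather than the purely random sampling shown here.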