Bagging and boosting are two popular ensemble methods that achieve better accuracy than a single classifier, but both face limitations on massive datasets, where the sheer size of the data becomes a bottleneck. Voting many classifiers, each built on a small subset of the data ("pasting small votes"), is a promising alternative for learning from massive datasets: it retains the accuracy benefits of boosting and bagging while avoiding the need to process the full dataset at once. We propose a framework for building hundreds or thousands of such classifiers on small subsets of data in a distributed environment. Experiments show this approach is fast, accurate, and scalable to massive datasets.
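The core idea can be illustrated with a minimal sketch: train many weak classifiers, each on a small random subset ("bite") of the data, and combine them by majority vote. This is not the authors' implementation; the one-dimensional decision-stump learner, the synthetic data, and all function names below are illustrative assumptions.

```python
import random
from collections import Counter

def train_stump(subset):
    # Illustrative weak learner: a one-feature threshold classifier.
    # Pick the threshold (from the subset's own values) that best
    # separates the two classes under the rule "predict x > threshold".
    best_thr, best_acc = 0.0, -1.0
    for x, _ in subset:
        acc = sum((xi > x) == yi for xi, yi in subset) / len(subset)
        if acc > best_acc:
            best_thr, best_acc = x, acc
    return best_thr

def paste_small_votes(data, n_classifiers=101, bite_size=8, seed=0):
    # "Pasting small votes": build many classifiers, each trained on a
    # small random subset of the data. Each subset is tiny relative to
    # the full dataset, so the bites could be drawn and trained in
    # parallel on distributed nodes.
    rng = random.Random(seed)
    return [train_stump(rng.sample(data, bite_size))
            for _ in range(n_classifiers)]

def predict(thresholds, x):
    # Combine the small-subset classifiers by simple majority vote.
    votes = Counter(x > thr for thr in thresholds)
    return votes.most_common(1)[0][0]

# Synthetic 1-D example: the true label is simply x > 5.
rng = random.Random(42)
data = [(x, x > 5.0) for x in (rng.uniform(0, 10) for _ in range(1000))]
ensemble = paste_small_votes(data)
```

Because each classifier sees only `bite_size` examples, training cost per classifier is independent of the dataset size; the distributed framework described in the abstract parallelizes exactly this per-bite work.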