Creating Ensembles of Classifiers

Authors:
Nitesh V. Chawla; Steven Eschrich; Lawrence O. Hall
Venue:
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Year:
2001

Citing 0
Cited 10

Distributed Pasting of Small Votes
MCS '02 Proceedings of the Third International Workshop on Multiple Classifier Systems

Learning Ensembles from Bites: A Scalable and Accurate Approach
The Journal of Machine Learning Research

Efficient sampling of training set in large and noisy multimedia data
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)

A framework for agent-based distributed machine learning and data mining
Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems

Prototype selection algorithms for distributed learning
Pattern Recognition

MALEF: Framework for distributed machine learning and data mining
International Journal of Intelligent Information and Database Systems

An agent-based framework for distributed learning
Engineering Applications of Artificial Intelligence

Application of the Method of Editing and Condensing in the Process of Global Decision-making
Fundamenta Informaticae

Distributed learning with data reduction
Transactions on computational collective intelligence IV

Application of Reduction of the Set of Conditional Attributes in the Process of Global Decision-making
Fundamenta Informaticae

Abstract

Ensembles of classifiers offer promise in increasing overall classification accuracy. The availability of extremely large datasets has opened avenues for applying distributed and/or parallel learning to learn models from them efficiently. In this paper, distributed learning is done by training classifiers on disjoint subsets of the data. We examine a random partitioning method for creating disjoint subsets and propose a more intelligent way of partitioning into disjoint subsets using clustering. The intelligent partitioning method generally performed better than random partitioning on our datasets. With both methods, a significant gain in accuracy may be obtained by applying bagging to each of the disjoint subsets, creating multiple diverse classifiers. The significance of our finding is that a partitioning strategy, even for small or moderately sized datasets, can yield better performance when combined with bagging than applying a single learner to the entire dataset.
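
The pipeline described in the abstract (disjoint partitioning, then bagging within each partition) can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid base learner, the majority-vote combination, the toy two-class data, and all function names here are assumptions chosen to keep the example self-contained. Only random partitioning is shown; a clustering-based partition could be substituted for random_partition.

```python
import random
from collections import Counter

def random_partition(data, n_parts, rng):
    """Shuffle the examples and deal them into n_parts disjoint subsets."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]

def bootstrap(subset, rng):
    """Draw one bag: len(subset) examples sampled with replacement."""
    return [rng.choice(subset) for _ in subset]

def train_centroids(bag):
    """Assumed base learner: store the mean feature vector of each class."""
    sums, counts = {}, Counter()
    for x, y in bag:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict_one(model, x):
    """Return the class whose centroid is closest to x (squared distance)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))

def build_ensemble(data, n_parts=4, n_bags=5, seed=0):
    """Train n_bags bagged classifiers on each of n_parts disjoint subsets."""
    rng = random.Random(seed)
    models = []
    for subset in random_partition(data, n_parts, rng):
        for _ in range(n_bags):
            models.append(train_centroids(bootstrap(subset, rng)))
    return models

def vote(models, x):
    """Combine the ensemble by simple majority vote (an assumed rule)."""
    return Counter(predict_one(m, x) for m in models).most_common(1)[0][0]

# Toy data (hypothetical): two well-separated 2-D Gaussian classes.
data_rng = random.Random(1)
data = [((data_rng.gauss(0, 1), data_rng.gauss(0, 1)), "a") for _ in range(100)]
data += [((data_rng.gauss(6, 1), data_rng.gauss(6, 1)), "b") for _ in range(100)]

ensemble = build_ensemble(data)
print(len(ensemble))  # n_parts * n_bags classifiers
print(vote(ensemble, (0.0, 0.0)), vote(ensemble, (6.0, 6.0)))
```

Because each partition sees only a fraction of the data, the bags drawn inside it supply the diversity the abstract refers to; the number of partitions and bags per partition are free parameters in this sketch.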