Distributed Pasting of Small Votes
MCS '02 Proceedings of the Third International Workshop on Multiple Classifier Systems
Ensembles of classifiers offer promise in increasing overall classification accuracy. The availability of extremely large datasets has opened avenues for applying distributed and/or parallel learning to efficiently build models of them. In this paper, distributed learning is done by training classifiers on disjoint subsets of the data. We examine a random partitioning method for creating disjoint subsets and propose a more intelligent partitioning method based on clustering. We observed that the intelligent partitioning method generally performs better than random partitioning on our datasets. With either method, a significant gain in accuracy may be obtained by applying bagging to each of the disjoint subsets, creating multiple diverse classifiers. The significance of our finding is that a partitioning strategy, even for small or moderately sized datasets, when combined with bagging can yield better performance than applying a single learner to the entire dataset.
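The partition-then-bag scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes scikit-learn, uses KMeans as one possible "intelligent" (clustering-based) partitioner, decision trees as the base learner, and plain majority voting to combine the per-partition bagged ensembles; all of these concrete choices are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

def partition_random(X, y, k, rng):
    """Split the data into k disjoint subsets by random permutation."""
    idx = rng.permutation(len(X))
    return [(X[p], y[p]) for p in np.array_split(idx, k)]

def partition_cluster(X, y, k):
    """Split the data into k disjoint subsets via KMeans clustering
    (one possible clustering-based partitioner; an assumption here)."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    return [(X[labels == c], y[labels == c]) for c in range(k)]

def fit_bagged_ensembles(parts, n_estimators=10):
    """Apply bagging independently to each disjoint subset."""
    ensembles = []
    for Xp, yp in parts:
        bag = BaggingClassifier(DecisionTreeClassifier(),
                                n_estimators=n_estimators,
                                random_state=0)
        ensembles.append(bag.fit(Xp, yp))
    return ensembles

def predict_vote(ensembles, X):
    """Combine the partition-level ensembles by majority vote."""
    votes = np.stack([e.predict(X) for e in ensembles])
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```

Because each subset is disjoint, the bagged ensembles can be trained on separate machines with no shared state, which is what makes the approach attractive for distributed learning.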