In this paper, we propose a general framework for distributed boosting intended to efficiently integrate specialized classifiers learned over very large, distributed, homogeneous databases that cannot be merged at a single location. The distributed boosting algorithm can also serve as a parallel classification technique, in which a massive database that cannot fit into main memory is partitioned into disjoint subsets for more efficient analysis. In the proposed method, at each boosting round the classifiers are first learned from the disjoint datasets and then exchanged among the sites. The classifiers are then combined into a weighted voting ensemble on each disjoint dataset, and the ensemble applied to an unseen test set is an ensemble of the ensembles built at all distributed sites. In experiments on four large datasets, the proposed distributed boosting method achieved classification accuracy comparable to, or even slightly better than, the standard boosting algorithm while requiring less memory and less computation time. In addition, the communication overhead of the distributed boosting algorithm is very small, making it a viable alternative to standard boosting for large-scale databases.