Boosting Algorithms for Parallel and Distributed Learning

  • Authors:
  • Aleksandar Lazarevic; Zoran Obradovic

  • Affiliations:
  • Aleksandar Lazarevic: Center for Information Science and Technology, Temple University, 303 Wachman Hall (038-24), 1805 N. Broad St., Philadelphia, PA 19122-6094, USA. aleks@ist.temple.edu
  • Zoran Obradovic: Center for Information Science and Technology, Temple University, 303 Wachman Hall (038-24), 1805 N. Broad St., Philadelphia, PA 19122-6094, USA. zoran@ist.temple.edu

  • Venue:
  • Distributed and Parallel Databases - Special issue: Parallel and distributed data mining
  • Year:
  • 2002

Abstract

The growing amount of available information and its distributed and heterogeneous nature have a major impact on the field of data mining. In this paper, we propose a framework for parallel and distributed boosting algorithms intended for efficiently integrating specialized classifiers learned over very large, distributed and possibly heterogeneous databases that cannot fit into a single computer's main memory. Boosting is a popular technique for constructing highly accurate classifier ensembles, where the classifiers are trained serially, with the weights on the training instances adaptively set according to the performance of previous classifiers. Our parallel boosting algorithm is designed for tightly coupled shared-memory systems with a small number of processors, with the objective of achieving maximal prediction accuracy in fewer iterations than boosting on a single processor. At each boosting round, all processors learn classifiers in parallel, and these classifiers are then combined according to the confidence of their predictions. Our distributed boosting algorithm is proposed primarily for learning from several disjoint data sites when the data cannot be merged together, although it can also be used for parallel learning where a massive data set is partitioned into several disjoint subsets for more efficient analysis. At each boosting round, the proposed method combines classifiers from all sites and creates a classifier ensemble on each site. The final classifier is constructed as an ensemble of all classifier ensembles built on the disjoint data sets. Applying the proposed methods to several data sets has shown that parallel boosting can achieve the same or even better prediction accuracy considerably faster than standard sequential boosting. Results from the experiments also indicate that distributed boosting achieves comparable or slightly better classification accuracy than standard boosting, while requiring much less memory and computational time since it operates on smaller data sets.
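
To make the two ingredients of the abstract concrete, the sketch below shows (a) AdaBoost-style boosting with adaptively re-weighted training instances and (b) a final classifier built as an ensemble of per-site ensembles. This is only a minimal illustration under assumed simplifications (binary labels in {-1, +1}, decision stumps as weak learners, plain majority voting across sites), not the authors' actual parallel or distributed algorithm; the names `boost_site`, `predict_ensemble`, and `predict_sites` are hypothetical.

```python
# Minimal sketch: AdaBoost-style weight updates per site, then an
# "ensemble of ensembles" across disjoint sites. Illustrative only;
# it does not reproduce the paper's confidence-based combination.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def boost_site(X, y, rounds=10):
    """Boost decision stumps on one site's data; y must be in {-1, +1}."""
    y = np.asarray(y)
    n = len(y)
    w = np.full(n, 1.0 / n)            # instance weights, adapted each round
    ensemble = []                      # list of (alpha, classifier) pairs
    for _ in range(rounds):
        clf = DecisionTreeClassifier(max_depth=1)
        clf.fit(X, y, sample_weight=w)
        pred = clf.predict(X)
        err = np.sum(w * (pred != y)) / np.sum(w)
        if err == 0.0 or err >= 0.5:   # weak learner perfect or no better than chance
            break
        alpha = 0.5 * np.log((1.0 - err) / err)
        w *= np.exp(-alpha * y * pred) # up-weight misclassified instances
        w /= w.sum()
        ensemble.append((alpha, clf))
    return ensemble


def predict_ensemble(ensemble, X):
    """Weighted vote of a single site's boosted ensemble."""
    scores = sum(alpha * clf.predict(X) for alpha, clf in ensemble)
    return np.sign(scores)


def predict_sites(site_ensembles, X):
    """Final classifier: majority vote over the per-site ensembles."""
    votes = sum(predict_ensemble(e, X) for e in site_ensembles)
    return np.sign(votes)
```

A usage sketch under the same assumptions: partition the training data into disjoint per-site subsets, call `boost_site` on each subset (sequentially or in parallel), and classify new data with `predict_sites` over the resulting list of ensembles.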