Adaptive boosting techniques in heterogeneous and spatial databases

Authors:
Aleksandar Lazarevic;Zoran Obradovic
Affiliations:
Center for Information Science and Technology, Temple University, Room 303, Wachman Hall (038-24), 1805 N. Broad St., Philadelphia, PA 19122, USA. Tel.: +1 215 204 6265/ Fax: +1 215 204 5082/ E-ma ...;Center for Information Science and Technology, Temple University, Room 303, Wachman Hall (038-24), 1805 N. Broad St., Philadelphia, PA 19122, USA. Tel.: +1 215 204 6265/ Fax: +1 215 204 5082/ E-ma ...
Venue:
Intelligent Data Analysis
Year:
2001


Abstract

Combining multiple classifiers is an effective technique for improving classification accuracy: manipulating the training data distribution reduces the variance of the combined predictor. In many large-scale data analysis problems involving heterogeneous databases with attribute instability, however, standard boosting methods do not improve local classifiers (e.g., k-nearest neighbors) because such classifiers have low sensitivity to data perturbation. Here, we propose an adaptive attribute boosting technique that combines multiple local classifiers, each using different relevant attribute information. To reduce the computational cost of k-nearest neighbor (k-NN) classifiers, a novel fast k-NN algorithm is designed. We show that the proposed combining technique is also beneficial when boosting global classifiers such as neural networks and decision trees. In addition, a modification of the boosting method is developed for heterogeneous spatial databases with unstable driving attributes, in which spatial blocks of data are drawn at each boosting round. Finally, when heterogeneous data sets contain several homogeneous data distributions, we propose a new technique of boosting specialized classifiers: instead of a single global classifier for each boosting round, there are specialized classifiers, each responsible for one homogeneous region. The number of regions is identified through a clustering algorithm performed at each boosting iteration. Applied to synthetic and real-life spatial data, the new boosting methods improve prediction accuracy for both local and global classifiers when unstable driving attributes and heterogeneity are present in the data. In addition, boosting specialized experts significantly reduces the number of iterations needed to achieve the maximal prediction accuracy.
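
The idea of drawing spatial blocks of data at each boosting round can be illustrated with a minimal sketch. The abstract does not specify the sampling procedure, so everything here is an assumption made for illustration: the function name `draw_spatial_blocks`, the square block shape, the `block_half_width` parameter, and the choice of picking block centres with probability proportional to the current boosting weights are all hypothetical and are not taken from the paper.

```python
import random


def draw_spatial_blocks(coords, weights, block_half_width, sample_size, rng):
    """Draw a training sample as whole spatial blocks rather than single points.

    coords  : list of (x, y) positions, one per training example
    weights : current boosting weights (non-negative, not all zero)
    Returns a list of indices into coords; duplicates are allowed,
    as in ordinary boosting by resampling.
    """
    indices = list(range(len(coords)))
    sample = []
    while len(sample) < sample_size:
        # Assumption: a block centre is picked with probability
        # proportional to its boosting weight.
        centre = rng.choices(indices, weights=weights, k=1)[0]
        cx, cy = coords[centre]
        # Add every point inside the square block around the centre,
        # so spatially neighbouring examples enter the sample together.
        block = [
            i for i in indices
            if abs(coords[i][0] - cx) <= block_half_width
            and abs(coords[i][1] - cy) <= block_half_width
        ]
        sample.extend(block)
    # The last block may overshoot the requested size, so truncate.
    return sample[:sample_size]


# Hypothetical usage: a 10 x 10 grid where examples near the corner (0, 0)
# are assumed to be hard to predict and therefore carry larger weights.
coords = [(x, y) for x in range(10) for y in range(10)]
weights = [5.0 if x < 3 and y < 3 else 1.0 for x, y in coords]
sample = draw_spatial_blocks(
    coords, weights, block_half_width=1, sample_size=50, rng=random.Random(0)
)
```

In a full boosting loop, a classifier would be trained on the returned indices and the weights would then be updated from its errors; those steps are omitted here because the abstract does not describe them.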