Adaptive boosting techniques in heterogeneous and spatial databases

  • Authors:
  • Aleksandar Lazarevic; Zoran Obradovic

  • Affiliations:
  • Center for Information Science and Technology, Temple University, Room 303, Wachman Hall (038-24), 1805 N. Broad St., Philadelphia, PA 19122, USA. Tel.: +1 215 204 6265/ Fax: +1 215 204 5082/ E-ma ...

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2001

Abstract

Combining multiple classifiers is an effective technique for improving classification accuracy by reducing the variance through manipulating the training data distributions. In many large-scale data analysis problems involving heterogeneous databases with attribute instability, however, standard boosting methods do not improve local classifiers (e.g., k-nearest neighbors), because such classifiers have low sensitivity to data perturbation. Here, we propose an adaptive attribute boosting technique to coalesce multiple local classifiers, each using different relevant attribute information. To reduce the computational cost of k-nearest neighbor (k-NN) classifiers, a novel fast k-NN algorithm is designed. We show that the proposed combining technique is also beneficial when boosting global classifiers such as neural networks and decision trees. In addition, a modification of the boosting method is developed for heterogeneous spatial databases with unstable driving attributes; it draws spatial blocks of data at each boosting round. Finally, when heterogeneous data sets contain several homogeneous data distributions, we propose a new technique of boosting specialized classifiers, where instead of a single global classifier for each boosting round, there are specialized classifiers, each responsible for one homogeneous region. The number of regions is identified through a clustering algorithm performed at each boosting iteration. Applied to synthetic and real-life spatial data, the new boosting methods improve prediction accuracy for both local and global classifiers when unstable driving attributes and heterogeneity are present in the data. In addition, boosting specialized experts significantly reduces the number of iterations needed to achieve the maximal prediction accuracy.
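
The abstract does not spell out how spatial blocks are drawn at each boosting round. As a rough illustration only, the idea might be sketched as follows, assuming a regular square grid and assuming that a block is drawn with probability proportional to the summed boosting weights of the examples it contains. The function name `sample_spatial_blocks` and the parameters `block_size` and `n_blocks` are hypothetical and are not taken from the paper; the weight update between rounds is not shown.

```python
import random
from collections import defaultdict


def sample_spatial_blocks(coords, weights, block_size, n_blocks, rng=None):
    """Draw whole spatial blocks instead of individual examples.

    coords     : list of (x, y) positions, one per training example
    weights    : current boosting weights, one per example (non-negative)
    block_size : side length of a square grid cell (assumed layout)
    n_blocks   : number of blocks to draw, with replacement
    Returns the indices of all examples that fall in the drawn blocks.
    """
    if rng is None:
        rng = random.Random()

    # Group example indices by the grid cell that contains them.
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        cell = (int(x // block_size), int(y // block_size))
        blocks[cell].append(i)

    cells = sorted(blocks)
    # Assumption: a block's weight is the sum of its members' weights,
    # so blocks holding poorly predicted examples are drawn more often.
    cell_weights = [sum(weights[i] for i in blocks[c]) for c in cells]

    drawn = rng.choices(cells, weights=cell_weights, k=n_blocks)

    sample = []
    for c in drawn:
        sample.extend(blocks[c])
    return sample


# Usage: two blocks of two points each; only the second carries weight.
coords = [(0.5, 0.5), (0.7, 0.2), (5.1, 5.3), (5.9, 5.5)]
weights = [0.0, 0.0, 0.5, 0.5]
sample = sample_spatial_blocks(coords, weights, block_size=1.0, n_blocks=2,
                               rng=random.Random(0))
```

Drawing neighboring examples together keeps each round's training sample spatially contiguous, which is one plausible reading of "drawing spatial blocks of data"; the paper itself should be consulted for the actual block definition and sampling rule.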