Combining multiple classifiers is an effective technique for improving classification accuracy, reducing variance by manipulating the training data distribution. In many large-scale data analysis problems involving heterogeneous databases with attribute instability, however, standard boosting methods do not improve local classifiers (e.g., k-nearest neighbor classifiers) because such classifiers have low sensitivity to perturbations of the training data. Here, we propose an adaptive attribute boosting technique that combines multiple local classifiers, each built from different relevant attribute information. To reduce the computational cost of k-nearest neighbor (k-NN) classifiers, we design a novel fast k-NN algorithm. We show that the proposed combining technique is also beneficial when boosting global classifiers such as neural networks and decision trees. In addition, we develop a modification of the boosting method for heterogeneous spatial databases with unstable driving attributes that draws spatial blocks of data at each boosting round. Finally, when heterogeneous data sets contain several homogeneous data distributions, we propose a new technique of boosting specialized classifiers, in which, instead of a single global classifier per boosting round, a separate specialized classifier is responsible for each homogeneous region; the number of regions is identified by a clustering algorithm run at each boosting iteration. Applied to synthetic and real-life spatial data, the new boosting methods improve prediction accuracy for both local and global classifiers when unstable driving attributes and heterogeneity are present in the data. In addition, boosting specialized experts significantly reduces the number of iterations needed to reach maximal prediction accuracy.
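To make the combining scheme concrete, the following is a minimal sketch of the adaptive attribute boosting idea: at each round, re-rank the attributes on the reweighted data and train a k-NN classifier on a different relevant-attribute subset. The sketch assumes a standard binary AdaBoost-style reweighting loop, weighted resampling in place of instance weights (k-NN has no native support for them), and mutual information as the relevance score; these choices, and all function names, are illustrative assumptions, not the paper's exact algorithm.

```python
# Hypothetical sketch of adaptive attribute boosting with k-NN base learners.
# Assumptions (not from the paper): binary AdaBoost reweighting, weighted
# resampling instead of instance weights, and mutual information as the
# per-round attribute relevance score.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.neighbors import KNeighborsClassifier


def adaptive_attribute_boosting(X, y, rounds=10, n_attrs=3, k=5, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1.0 / n)                # instance weights
    ensemble = []                          # (classifier, attrs, vote weight)
    for _ in range(rounds):
        idx = rng.choice(n, size=n, p=w)   # weighted resample
        # Re-rank attributes on the resampled data: each round may select a
        # different relevant-attribute subset as the weights shift.
        scores = mutual_info_classif(X[idx], y[idx], random_state=seed)
        attrs = np.argsort(scores)[-n_attrs:]
        clf = KNeighborsClassifier(n_neighbors=k).fit(X[idx][:, attrs], y[idx])
        miss = clf.predict(X[:, attrs]) != y
        err = w @ miss
        if err >= 0.5:                     # weak-learning condition failed
            break
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
        w *= np.exp(alpha * np.where(miss, 1.0, -1.0))
        w /= w.sum()
        ensemble.append((clf, attrs, alpha))
    return ensemble


def predict(ensemble, X, classes):
    # classes must be sorted, e.g. classes = np.unique(y_train).
    votes = np.zeros((len(X), len(classes)))
    for clf, attrs, alpha in ensemble:
        pred = clf.predict(X[:, attrs])
        votes[np.arange(len(X)), np.searchsorted(classes, pred)] += alpha
    return classes[votes.argmax(axis=1)]
```

Usage would be along the lines of `ensemble = adaptive_attribute_boosting(X_train, y_train)` followed by `y_hat = predict(ensemble, X_test, np.unique(y_train))`; because each round votes through its own attribute subset, the ensemble can respond to attribute instability in a way a single fixed-subset k-NN cannot.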
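Similarly, the boosting-specialized-experts step, fitting one classifier per homogeneous region discovered during a boosting round, might look like the sketch below. For simplicity it assumes k-means with a fixed number of regions and decision-tree experts; in the paper, the number of regions itself is identified by the clustering performed at each iteration.

```python
# Hypothetical one-round sketch of "specialized experts": cluster the data
# into homogeneous regions and fit a dedicated classifier per region.
# Assumptions (not from the paper): k-means with a fixed n_regions and
# decision-tree experts.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier


def fit_specialized_experts(X, y, n_regions=3, seed=0):
    km = KMeans(n_clusters=n_regions, n_init=10, random_state=seed).fit(X)
    experts = {r: DecisionTreeClassifier(random_state=seed)
                  .fit(X[km.labels_ == r], y[km.labels_ == r])
               for r in range(n_regions)}
    return km, experts


def predict_by_region(km, experts, X):
    regions = km.predict(X)              # route each point to its region
    pred = np.empty(len(X), dtype=int)   # assumes integer class labels
    for r, clf in experts.items():
        mask = regions == r
        if mask.any():
            pred[mask] = clf.predict(X[mask])
    return pred
```

Routing each test point to the expert of its nearest region is what lets the per-round ensemble specialize, which is consistent with the reported reduction in the number of boosting iterations needed on heterogeneous data.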