A lazy bagging approach to classification

  • Authors:
  • Xingquan Zhu; Ying Yang

  • Affiliations:
  • Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431, USA; Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia

  • Venue:
  • Pattern Recognition
  • Year:
  • 2008


Abstract

In this paper, we propose lazy bagging (LB), which builds bootstrap replicate bags based on the characteristics of test instances. Upon receiving a test instance x_k, LB trims bootstrap bags by taking into consideration x_k's nearest neighbors in the training data. Our hypothesis is that an unlabeled instance's nearest neighbors provide valuable information to enhance local learning and generate a classifier with refined decision boundaries emphasizing the test instance's surrounding region. In particular, by taking full advantage of x_k's nearest neighbors, classifiers are able to reduce classification bias and variance when classifying x_k. As a result, LB, which is built on these classifiers, can significantly reduce classification error compared with the traditional bagging (TB) approach. To investigate LB's performance, we first use carefully designed synthetic data sets to gain insight into why LB works and under which conditions it can outperform TB. We then test LB against four rival algorithms on a large suite of 35 real-world benchmark data sets using a variety of statistical tests. Empirical results confirm that LB can statistically significantly outperform alternative methods in terms of reducing classification error.
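The per-instance bag construction described above can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' exact algorithm: each bag is seeded with the test instance's nearest neighbors and then filled by ordinary bootstrap sampling, and a simple nearest-centroid base learner stands in for the paper's base classifiers. All function names and parameters (`lazy_bagging_predict`, `n_bags`, `n_neighbors`) are hypothetical.

```python
import numpy as np
from collections import Counter

def nearest_centroid(X_bag, y_bag):
    """Train a tiny base classifier on one bag (stand-in for a decision tree)."""
    classes = np.unique(y_bag)
    centroids = np.array([X_bag[y_bag == c].mean(axis=0) for c in classes])
    return lambda x: classes[np.argmin(np.linalg.norm(centroids - x, axis=1))]

def lazy_bagging_predict(X, y, x_k, n_bags=10, n_neighbors=5, seed=0):
    """Classify x_k with bags built lazily around x_k's neighborhood."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # Nearest neighbors of the test instance in the training data
    nn = np.argsort(np.linalg.norm(X - x_k, axis=1))[:n_neighbors]
    votes = []
    for _ in range(n_bags):
        # Each bag always keeps x_k's neighborhood, then fills the rest
        # of the bag by ordinary bootstrap sampling (one possible trimming rule).
        rest = rng.integers(0, n, size=n - n_neighbors)
        idx = np.concatenate([nn, rest])
        clf = nearest_centroid(X[idx], y[idx])
        votes.append(clf(x_k))
    # Majority vote across the per-instance ensemble
    return Counter(votes).most_common(1)[0][0]

# Usage on two synthetic clusters:
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
print(lazy_bagging_predict(X, y, np.array([0.2, -0.1])))  # a point near cluster 0
```

Unlike traditional bagging, nothing is trained until a test instance arrives, which is what makes the approach "lazy": the ensemble is tailored to each x_k at prediction time.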