Randomizing Outputs to Increase Prediction Accuracy

  • Authors:
  • Leo Breiman

  • Affiliations:
  • Statistics Department, University of California, Berkeley, CA 94720, USA. leo@stat.berkeley.edu

  • Venue:
  • Machine Learning
  • Year:
  • 2000

Abstract

Bagging and boosting reduce error by changing both the inputs and outputs to form perturbed training sets, growing predictors on these perturbed training sets, and combining them. An interesting question is whether comparable performance can be obtained by perturbing the outputs alone. Two methods of randomizing outputs are studied: one called output smearing and the other output flipping. Both are shown to consistently do better than bagging.
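
The following is a minimal sketch of the two output-randomization schemes named in the abstract, under stated assumptions rather than the paper's exact recipe: output smearing is taken to add zero-mean Gaussian noise to regression targets, and output flipping to randomly reassign a fraction of class labels; the noise scale, flip rate, tree settings, and helper names are illustrative choices, not values from the paper.

```python
# Hypothetical sketch: perturb only the outputs, grow a tree on each
# perturbed copy, and combine the trees (average for regression,
# plurality vote for classification). Parameter values are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier


def output_smearing_ensemble(X, y, n_trees=50, noise_scale=1.0, seed=None):
    """Regression: grow each tree on y plus Gaussian noise scaled to std(y)."""
    rng = np.random.default_rng(seed)
    sigma = noise_scale * np.std(y)
    trees = []
    for _ in range(n_trees):
        y_smeared = y + rng.normal(0.0, sigma, size=len(y))  # perturb outputs only
        trees.append(DecisionTreeRegressor().fit(X, y_smeared))
    # Combine by averaging the individual tree predictions.
    return lambda X_new: np.mean([t.predict(X_new) for t in trees], axis=0)


def output_flipping_ensemble(X, y, n_trees=50, flip_rate=0.1, seed=None):
    """Classification: grow each tree on labels with a random fraction reassigned.

    Assumes integer class labels 0..K-1; a flipped label may land on its
    original class, which is accepted here for simplicity.
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    trees = []
    for _ in range(n_trees):
        y_flipped = y.copy()
        flip = rng.random(len(y)) < flip_rate              # which labels to change
        y_flipped[flip] = rng.choice(classes, size=flip.sum())
        trees.append(DecisionTreeClassifier().fit(X, y_flipped))

    def predict(X_new):
        # Combine by unweighted plurality vote over the trees.
        votes = np.array([t.predict(X_new) for t in trees])
        return np.array([np.bincount(col.astype(int)).argmax() for col in votes.T])

    return predict
```

As in bagging, each base predictor sees the same inputs but a different randomized version of the training set; the difference is that here only the outputs are perturbed, which is the question the abstract poses.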