This paper presents experimental results, with both real and artificial data, on combining unsupervised learning algorithms using stacking. Specifically, stacking is used to form a linear combination of finite mixture model and kernel density estimators for non-parametric multivariate density estimation. The method outperforms other strategies such as choosing the single best model based on cross-validation, combining with uniform weights, and even using the single best model chosen by “cheating” and examining the test set. We also investigate (1) how the utility of stacking changes when one of the models being combined is the model that generated the data, (2) how the stacking coefficients of the models compare to the relative frequencies with which cross-validation chooses among the models, (3) visualization of the combined “effective” kernels, and (4) the sensitivity of stacking to overfitting as model complexity increases.
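The core recipe can be sketched in a few steps: obtain out-of-fold density estimates from each component model via cross-validation, fit nonnegative stacking weights (summing to one) that maximize the held-out log-likelihood of the linear combination, then refit the models on all the data and combine them with those weights. The sketch below is illustrative only, assuming 1-D data, a single-Gaussian parametric model in place of a general finite mixture, a fixed-bandwidth Gaussian KDE, and an EM-style update for the weights; it is not the paper's exact experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D sample from a bimodal distribution (assumed test data).
X = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 0.5, 200)])

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def parametric_density(x, train):
    # Model 1: a single Gaussian fitted by maximum likelihood.
    return gauss_pdf(x, train.mean(), train.std())

def kde_density(x, train, h=0.5):
    # Model 2: Gaussian kernel density estimate with fixed bandwidth h.
    return gauss_pdf(x[:, None], train[None, :], h).mean(axis=1)

# Step 1: out-of-fold density estimates for each model (5-fold CV).
folds = np.array_split(rng.permutation(len(X)), 5)
dens = np.zeros((len(X), 2))
for te in folds:
    tr = np.setdiff1d(np.arange(len(X)), te)
    dens[te, 0] = parametric_density(X[te], X[tr])
    dens[te, 1] = kde_density(X[te], X[tr])

# Step 2: EM updates for the stacking weights, treating the combination
# as a mixture over models and maximizing held-out log-likelihood.
w = np.full(2, 0.5)
for _ in range(200):
    resp = w * dens                         # responsibilities (unnormalized)
    resp /= resp.sum(axis=1, keepdims=True)
    w = resp.mean(axis=0)                   # new mixture weights

# Step 3: the stacked estimator is the w-weighted linear combination of
# the component models refit on all of the data.
def stacked_density(x):
    return w[0] * parametric_density(x, X) + w[1] * kde_density(x, X)

print(np.round(w, 3))
```

On this bimodal toy sample the weights should favor the KDE, since a single Gaussian fits the data poorly; with well-specified component models the weights spread out, which connects to the paper's question of how stacking behaves when the generating model is among those combined.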