This paper presents experimental results, with both real and artificial data, on combining unsupervised learning algorithms using stacking. Specifically, stacking is used to form a linear combination of finite mixture model and kernel density estimators for non-parametric multivariate density estimation. The method outperforms other strategies such as choosing the single best model based on cross-validation, combining with uniform weights, and even using the single best model chosen by “cheating” and examining the test set. We also investigate (1) how the utility of stacking changes when one of the models being combined is the model that generated the data, (2) how the stacking coefficients of the models compare to the relative frequencies with which cross-validation chooses among the models, (3) visualization of the combined “effective” kernels, and (4) the sensitivity of stacking to overfitting as model complexity increases.
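The core recipe can be sketched in a few steps: obtain out-of-fold density estimates from each component model via cross-validation, fit nonnegative stacking weights (summing to one) that maximize the held-out log-likelihood of the linear combination, then refit the models on all the data and combine them with those weights. The sketch below is illustrative only, assuming 1-D data, a single-Gaussian parametric model in place of a general finite mixture, a fixed-bandwidth Gaussian KDE, and an EM-style update for the weights; it is not the paper's exact experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D sample from a bimodal distribution (assumed test data).
X = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 0.5, 200)])

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def parametric_density(x, train):
    # Model 1: a single Gaussian fitted by maximum likelihood.
    return gauss_pdf(x, train.mean(), train.std())

def kde_density(x, train, h=0.5):
    # Model 2: Gaussian kernel density estimate with fixed bandwidth h.
    return gauss_pdf(x[:, None], train[None, :], h).mean(axis=1)

# Step 1: out-of-fold density estimates for each model (5-fold CV).
folds = np.array_split(rng.permutation(len(X)), 5)
dens = np.zeros((len(X), 2))
for te in folds:
    tr = np.setdiff1d(np.arange(len(X)), te)
    dens[te, 0] = parametric_density(X[te], X[tr])
    dens[te, 1] = kde_density(X[te], X[tr])

# Step 2: EM updates for the stacking weights, treating the combination
# as a mixture over models and maximizing held-out log-likelihood.
w = np.full(2, 0.5)
for _ in range(200):
    resp = w * dens                         # responsibilities (unnormalized)
    resp /= resp.sum(axis=1, keepdims=True)
    w = resp.mean(axis=0)                   # new mixture weights

# Step 3: the stacked estimator is the w-weighted linear combination of
# the component models refit on all of the data.
def stacked_density(x):
    return w[0] * parametric_density(x, X) + w[1] * kde_density(x, X)

print(np.round(w, 3))
```

On this bimodal toy sample the weights should favor the KDE, since a single Gaussian fits the data poorly; with well-specified component models the weights spread out, which connects to the paper's question of how stacking behaves when the generating model is among those combined.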