The main principle of stacked generalization is to use a second-level generalizer to combine the outputs of the base classifiers in an ensemble. In this paper, after presenting a short survey of the literature on stacked generalization, we propose regularized empirical risk minimization (RERM) as a framework for learning the weights of the combiner; it generalizes earlier proposals and enables improved learning methods. Our main contribution is the use of group sparsity for regularization to facilitate classifier selection. In addition, we propose and analyze the hinge loss as an alternative to the conventional least squares loss. We performed experiments on three ensemble setups of differing diversity on 13 real-world datasets from various application domains. The results show the power of group-sparse regularization over conventional l1-norm regularization: for the diverse ensemble, we reduce the number of selected classifiers without sacrificing accuracy, and with the non-diverse ensembles, we even gain accuracy on average by using group-sparse regularization. In addition, we show that the hinge loss outperforms the least squares loss used in previous studies of stacked generalization.
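To make the combiner-learning step concrete, the sketch below (Python with NumPy; not the authors' code, and all names and parameter values are illustrative) trains a group-sparse linear combiner on stacked base-classifier scores by proximal subgradient descent: the hinge loss drives the fit, and the l2,1 (group lasso) penalty shrinks entire per-classifier score blocks to zero, which is what performs classifier selection. It assumes binary labels in {-1, +1} and a fixed number of score features per base classifier.

# Minimal sketch of a group-sparse hinge-loss combiner for stacked
# generalization. Objective:
#   min_w (1/n) * sum_i max(0, 1 - y_i * x_i.w) + lam * sum_g ||w_g||_2
# where each group g collects the score features of one base classifier.
import numpy as np

def group_prox(w, groups, threshold):
    # Group soft-thresholding: the proximal operator of the l2,1 penalty.
    # Groups whose norm falls below the threshold are zeroed out entirely,
    # i.e. that base classifier is deselected from the combination.
    w = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        w[g] = 0.0 if norm <= threshold else w[g] * (1 - threshold / norm)
    return w

def fit_group_sparse_combiner(X, y, groups, lam=0.05, lr=0.1, n_iter=500):
    # Proximal subgradient descent; adequate for a sketch, though the RERM
    # framework admits more sophisticated solvers for the non-smooth hinge.
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        margins = y * (X @ w)
        active = margins < 1.0                  # samples violating the margin
        subgrad = -(y[active][:, None] * X[active]).sum(axis=0) / n
        w = group_prox(w - lr * subgrad, groups, lr * lam)
    return w

# Toy demonstration: three hypothetical "base classifiers", each emitting two
# score features; the third classifier's scores are pure noise and should be
# deselected by the group-sparse penalty.
rng = np.random.default_rng(0)
n = 400
y = rng.choice([-1.0, 1.0], size=n)
good = y[:, None] + 0.5 * rng.standard_normal((n, 4))   # informative scores
noise = rng.standard_normal((n, 2))                     # uninformative scores
X = np.hstack([good, noise])
groups = [range(0, 2), range(2, 4), range(4, 6)]        # one group per classifier

w = fit_group_sparse_combiner(X, y, groups)
print("group norms:", [round(np.linalg.norm(w[g]), 3) for g in groups])

On this toy setup the norm of the third group typically shrinks to zero, meaning the uninformative base classifier is dropped from the combination; the regularization strength lam trades the number of selected classifiers against accuracy, mirroring the trade-off studied in the paper.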