The main principle of stacked generalization is to use a second-level generalizer to combine the outputs of the base classifiers in an ensemble. In this paper, after presenting a short survey of the literature on stacked generalization, we propose regularized empirical risk minimization (RERM) as a framework for learning the weights of the combiner; it generalizes earlier proposals and enables improved learning methods. Our main contribution is the use of group sparsity for regularization to facilitate classifier selection. In addition, we propose and analyze the hinge loss as an alternative to the conventional least squares loss. We performed experiments on three ensemble setups of differing diversity on 13 real-world datasets from various application domains. The results show the power of group-sparse regularization over conventional l1-norm regularization: for the diverse ensemble, we reduce the number of selected classifiers without sacrificing accuracy, and with the non-diverse ensembles, we even gain accuracy on average by using group-sparse regularization. In addition, we show that the hinge loss outperforms the least squares loss used in previous studies of stacked generalization.
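To make the combiner-learning step concrete, the sketch below (Python with NumPy; not the authors' code, and all names and parameter values are illustrative) trains a group-sparse linear combiner on stacked base-classifier scores by proximal subgradient descent: the hinge loss drives the fit, and the l2,1 (group lasso) penalty shrinks entire per-classifier score blocks to zero, which is what performs classifier selection. It assumes binary labels in {-1, +1} and a fixed number of score features per base classifier.

# Minimal sketch of a group-sparse hinge-loss combiner for stacked
# generalization. Objective:
#   min_w (1/n) * sum_i max(0, 1 - y_i * x_i.w) + lam * sum_g ||w_g||_2
# where each group g collects the score features of one base classifier.
import numpy as np

def group_prox(w, groups, threshold):
    # Group soft-thresholding: the proximal operator of the l2,1 penalty.
    # Groups whose norm falls below the threshold are zeroed out entirely,
    # i.e. that base classifier is deselected from the combination.
    w = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        w[g] = 0.0 if norm <= threshold else w[g] * (1 - threshold / norm)
    return w

def fit_group_sparse_combiner(X, y, groups, lam=0.05, lr=0.1, n_iter=500):
    # Proximal subgradient descent; adequate for a sketch, though the RERM
    # framework admits more sophisticated solvers for the non-smooth hinge.
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        margins = y * (X @ w)
        active = margins < 1.0                  # samples violating the margin
        subgrad = -(y[active][:, None] * X[active]).sum(axis=0) / n
        w = group_prox(w - lr * subgrad, groups, lr * lam)
    return w

# Toy demonstration: three hypothetical "base classifiers", each emitting two
# score features; the third classifier's scores are pure noise and should be
# deselected by the group-sparse penalty.
rng = np.random.default_rng(0)
n = 400
y = rng.choice([-1.0, 1.0], size=n)
good = y[:, None] + 0.5 * rng.standard_normal((n, 4))   # informative scores
noise = rng.standard_normal((n, 2))                     # uninformative scores
X = np.hstack([good, noise])
groups = [range(0, 2), range(2, 4), range(4, 6)]        # one group per classifier

w = fit_group_sparse_combiner(X, y, groups)
print("group norms:", [round(np.linalg.norm(w[g]), 3) for g in groups])

On this toy setup the norm of the third group typically shrinks to zero, meaning the uninformative base classifier is dropped from the combination; the regularization strength lam trades the number of selected classifiers against accuracy, mirroring the trade-off studied in the paper.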