Marchand, M. and Shawe-Taylor, J. (2001). Learning with the set covering machine. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001).
Marchand, M. and Shawe-Taylor, J. (2002). The set covering machine. Journal of Machine Learning Research.
Langford, J. (2005). Tutorial on practical prediction theory for classification. Journal of Machine Learning Research.
Marchand, M. and Sokolova, M. (2005). Learning with decision lists of data-dependent features. Journal of Machine Learning Research.
Marchand and Shawe-Taylor (2002) proposed a loss bound for the set covering machine (SCM) that depends on the observed fraction of positive examples and on what the classifier achieves on the positive training examples. We show that this loss bound is incorrect. We then propose a loss bound, valid for any sample-compression learning algorithm (including the set covering machine), that depends on the observed fraction of positive examples and on what the classifier achieves on them. Finally, we compare the new bound numerically with the incorrect bound, with the original SCM bound, and with a recently proposed loss bound of Marchand and Sokolova (2005), which does not depend on the observed fraction of positive examples. We show that these earlier bounds can be substantially larger than the new bound in the presence of imbalanced misclassifications.
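For context, here is a minimal sketch of the classical sample-compression bound (in the style of Littlestone and Warmuth), the generic form that bounds of this family refine; it is not the bound derived in this paper. For a classifier $h$ reconstructed from a compression set of $d$ of the $m$ training examples and making $k$ errors on the remaining $m-d$ examples, a union bound over the $\binom{m}{d}$ possible compression sets and the $\binom{m-d}{k}$ possible error patterns gives, with probability at least $1-\delta$,

$$
R(h) \;\le\; 1 - \left(\frac{\delta}{\binom{m}{d}\binom{m-d}{k}}\right)^{\frac{1}{m-d-k}}
\;\le\; \frac{1}{m-d-k}\left(\ln\binom{m}{d} + \ln\binom{m-d}{k} + \ln\frac{1}{\delta}\right).
$$

This generic form treats errors on positive and negative examples symmetrically; the bound proposed in the paper instead conditions on the observed fraction of positive examples, which is why it can be substantially tighter when misclassifications are imbalanced.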