We extend the classical algorithms of Valiant and Haussler for learning compact conjunctions and disjunctions of Boolean attributes to allow features that are constructed from the data and to allow a trade-off between accuracy and complexity. The result is a general-purpose learning machine, suitable for practical learning tasks, that we call the set covering machine. We present a version of the set covering machine that uses data-dependent balls for its set of features and compare its performance with the support vector machine. By extending a technique pioneered by Littlestone and Warmuth, we bound its generalization error as a function of the amount of data compression it achieves during training. In experiments with real-world learning tasks, the bound is shown to be extremely tight and to provide an effective guide for model selection.
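The greedy construction sketched in the abstract — a conjunction of balls centred on training points, chosen by set cover with a penalty that trades accuracy against complexity — can be illustrated roughly as follows. This is a minimal sketch, not the paper's exact algorithm: the function names, the `penalty` parameter, and the restriction of ball centres and radii to distances between training points are illustrative assumptions.

```python
import numpy as np

def scm_balls(X, y, penalty=1.0, max_balls=3):
    """Greedy set-covering sketch with data-dependent balls (conjunction).

    A point is predicted positive iff it lies inside every selected ball.
    Each greedy step picks the ball that excludes ("covers") the most
    still-uncovered negative examples, minus `penalty` times the number
    of positive examples the ball would throw away.
    """
    pos = X[y == 1]
    neg = X[y == 0]
    balls = []                            # selected (centre, radius) pairs
    uncovered = np.ones(len(neg), bool)   # negatives not yet excluded
    for _ in range(max_balls):
        if not uncovered.any():
            break
        best, best_score = None, -np.inf
        for c in pos:                     # data-dependent centres: positive points
            d_pos = np.linalg.norm(pos - c, axis=1)
            d_neg = np.linalg.norm(neg - c, axis=1)
            for r in np.unique(d_pos):    # candidate radii taken from the data
                covers = uncovered & (d_neg > r)   # negatives pushed outside
                errors = int((d_pos > r).sum())    # positives pushed outside
                score = covers.sum() - penalty * errors
                if score > best_score:
                    best_score, best = score, (c, r, covers)
        c, r, covers = best
        balls.append((c, r))
        uncovered &= ~covers
    return balls

def scm_predict(balls, X):
    """Conjunction: positive iff inside every selected ball."""
    inside = np.ones(len(X), bool)
    for c, r in balls:
        inside &= np.linalg.norm(X - c, axis=1) <= r
    return inside.astype(int)
```

Raising `penalty` makes the greedy step reluctant to sacrifice positive examples, favouring accuracy; lowering it favours fewer, larger covering steps and hence a more compact (more compressed) classifier, which is exactly the trade-off the compression-based bound is meant to guide.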