The set covering machine

  • Authors:
  • Mario Marchand;John Shawe Taylor

  • Affiliations:
  • School of Information Technology and Engineering, University of Ottawa, Ottawa, Ont. K1N-6N5, Canada;Department of Computer Science, Royal Holloway, University of London, Egham, TW20-0EX, UK

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

We extend the classical algorithms of Valiant and Haussler for learning compact conjunctions and disjunctions of Boolean attributes to allow features that are constructed from the data and to allow a trade-off between accuracy and complexity. The result is a general-purpose learning machine, suitable for practical learning tasks, that we call the set covering machine. We present a version of the set covering machine that uses data-dependent balls for its set of features and compare its performance with the support vector machine. By extending a technique pioneered by Littlestone and Warmuth, we bound its generalization error as a function of the amount of data compression it achieves during training. In experiments with real-world learning tasks, the bound is shown to be extremely tight and to provide an effective guide for model selection.