Learning the set covering machine by bound minimization and margin-sparsity trade-off

  • Authors:
  • François Laviolette; Mario Marchand; Mohak Shah; Sara Shanian

  • Affiliations:
  • Department of Computer Science and Software Engineering, Pav. Adrien Pouliot, Laval University, Quebec, Canada G1V-0A6 (Laviolette, Marchand, Shanian); Centre for Intelligent Machines, McGill University, Montreal, Canada H3A-2A7 (Shah)

  • Venue:
  • Machine Learning
  • Year:
  • 2010


Abstract

We investigate classifiers in the sample compression framework that can be specified by two distinct sources of information: a compression set and a message string of additional information. In this setting, a reconstruction function yields a classifier when given these two pieces of information. We examine how an efficient redistribution of this reconstruction information can lead to more general classifiers. In particular, we derive risk bounds that give explicit control over the sparsity of the classifier and the magnitude of its separating margin, enabling a margin-sparsity trade-off in favor of better classifiers. We show that applying these bounds to the set covering machine algorithm yields novel learning strategies. We also show that these risk bounds are tighter than traditional counterparts, such as VC-dimension and Rademacher-complexity-based bounds, that explicitly take the complexity of the hypothesis class into account. Finally, we show how these bounds can guide model selection for the set covering machine, enabling it to learn by bound minimization.
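For readers unfamiliar with the set covering machine (SCM) named in the abstract: it greedily builds a very sparse conjunction (or disjunction) of Boolean features, trading coverage of negative examples against errors on positives. The sketch below shows only the classical greedy conjunction case, not the bound-minimization or margin-sparsity strategies the paper proposes; the function names and the penalty parameter `p` are illustrative choices, not the paper's notation.

```python
def scm_conjunction(X, y, max_features=5, p=1.0):
    """Greedy set covering machine, conjunction case (simplified sketch).

    X: list of rows of binary features (0/1); y: labels in {0, 1}.
    A conjunction predicts 1 only when every chosen feature equals 1.
    Each greedy step picks the feature that covers (correctly rejects)
    the most remaining negatives, minus a penalty p per positive it
    would misclassify -- the knob trading sparsity against accuracy.
    """
    n, d = len(X), len(X[0])
    remaining_neg = {i for i in range(n) if y[i] == 0}  # negatives to cover
    pos = {i for i in range(n) if y[i] == 1}            # positives still kept
    chosen = []
    while remaining_neg and len(chosen) < max_features:
        best, best_score = None, 0.0
        for j in range(d):
            if j in chosen:
                continue
            covered = sum(1 for i in remaining_neg if X[i][j] == 0)
            errors = sum(1 for i in pos if X[i][j] == 0)
            score = covered - p * errors
            if score > best_score:
                best, best_score = j, score
        if best is None:
            break  # no feature improves the usefulness score
        chosen.append(best)
        # keep only examples the chosen feature classifies as 1
        remaining_neg = {i for i in remaining_neg if X[i][best] == 1}
        pos = {i for i in pos if X[i][best] == 1}
    return chosen

def predict(X, chosen):
    """Conjunction: output 1 iff every chosen feature is 1."""
    return [int(all(row[j] == 1 for j in chosen)) for row in X]
```

The compression-set view of the abstract corresponds to the few examples that define the chosen features; the message string carries whatever extra information the reconstruction function needs to rebuild the classifier from them.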