Datum-wise classification: a sequential approach to sparsity

  • Authors: Gabriel Dulac-Arnold; Ludovic Denoyer; Philippe Preux; Patrick Gallinari

  • Affiliations: Université Pierre et Marie Curie, UPMC, Paris, LIP6, Case 169 France; Université Pierre et Marie Curie, UPMC, Paris, LIP6, Case 169 France; LIFL (UMR CNRS) & INRIA Lille Nord-Europe, Université de Lille, France; Université Pierre et Marie Curie, UPMC, Paris, LIP6, Case 169 France

  • Venue: ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I
  • Year: 2011

Abstract

We propose a novel classification technique that selects an appropriate representation for each datapoint, in contrast to the usual approach of selecting a single representation for the whole dataset. This datum-wise representation is found by minimizing a sparsity-inducing empirical risk, which is a relaxation of the standard L0-regularized risk. The classification problem is modeled as a sequential decision process that chooses, for each datapoint, which features to use before classifying. Datum-Wise Classification extends naturally to multi-class tasks, and we describe a specific case in which inference has the same complexity as a traditional linear classifier while still using a variable number of features. We compare our classifier to classical L1-regularized linear models (L1-SVM and LARS) on a set of common binary and multi-class datasets and show that, for an equal average number of features used, our method achieves better performance.
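
To make the idea of choosing features per datapoint concrete, here is a minimal Python sketch. It is not the authors' method: the abstract describes a learned sequential decision process, whereas this toy uses a fixed acquisition order and a hand-written stopping rule. The function name `datumwise_predict`, the `margin` threshold, and the example weights are all hypothetical, introduced only for illustration.

```python
from typing import List, Sequence, Tuple


def datumwise_predict(
    x: Sequence[float],
    weights: Sequence[float],
    order: Sequence[int],
    margin: float,
) -> Tuple[int, List[int]]:
    """Acquire features one at a time, then classify (binary labels +1/-1).

    Assumed stopping rule (illustrative only): stop as soon as the partial
    linear score has absolute value >= margin, or when every feature in
    `order` has been used.
    """
    score = 0.0
    used: List[int] = []
    for j in order:
        score += weights[j] * x[j]  # "acquire" feature j and update the score
        used.append(j)
        if abs(score) >= margin:    # confident enough: stop and classify
            break
    label = 1 if score >= 0 else -1
    return label, used


# Hypothetical linear model and two datapoints.
weights = [2.0, -1.0, 0.5]
order = [0, 1, 2]
easy = [1.5, 0.2, 0.1]   # the first feature alone is decisive
hard = [0.1, 0.4, -3.0]  # needs all three features

label_easy, used_easy = datumwise_predict(easy, weights, order, margin=1.0)
label_hard, used_hard = datumwise_predict(hard, weights, order, margin=1.0)
print(label_easy, used_easy)
print(label_hard, used_hard)
```

In this sketch the first datapoint is classified after a single feature while the second uses all three, so the number of features varies per datum even though the underlying scorer is an ordinary linear model.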