Learning Kernel-Based Halfspaces with the 0-1 Loss

  • Authors:
  • Shai Shalev-Shwartz; Ohad Shamir; Karthik Sridharan

  • Affiliations:
  • shais@cs.huji.ac.il; ohadsh@microsoft.com; karthik@ttic.edu

  • Venue:
  • SIAM Journal on Computing
  • Year:
  • 2011

Abstract

We describe and analyze a new algorithm for agnostically learning kernel-based halfspaces with respect to the 0-1 loss function. Unlike most previous formulations, which rely on surrogate convex loss functions (e.g., the hinge loss in support vector machines (SVMs) and the log loss in logistic regression), we provide finite-time and finite-sample guarantees with respect to the more natural 0-1 loss function. The proposed algorithm can learn kernel-based halfspaces in worst-case time $\mathrm{poly}(\exp(L\log(L/\epsilon)))$ for any distribution, where $L$ is a Lipschitz constant (which can be thought of as the reciprocal of the margin), and the learned classifier is worse than the optimal halfspace by at most $\epsilon$. We also prove a hardness result: under a certain cryptographic assumption, no algorithm can learn kernel-based halfspaces in time polynomial in $L$.
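
To make the abstract's central distinction concrete, the following minimal NumPy sketch evaluates a fixed kernel predictor under both the 0-1 loss and the hinge surrogate. This is an illustration only, not the paper's algorithm; the Gaussian kernel, synthetic data, and coefficients `alpha` are all hypothetical choices, used just to show that the hinge loss upper-bounds (and can substantially overestimate) the 0-1 loss that the paper's guarantees address directly.

```python
import numpy as np

# Illustration (not the paper's algorithm): compare the 0-1 loss with the
# convex hinge surrogate for a kernel predictor h(x) = sum_i alpha_i K(x_i, x).
# Pointwise, max(0, 1 - m) >= 1[m <= 0], so the hinge loss always
# upper-bounds the 0-1 loss; the paper bounds the 0-1 loss itself.

def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row sets A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def zero_one_loss(margins):
    """0-1 loss: fraction of examples with margin y * h(x) <= 0."""
    return np.mean(margins <= 0)

def hinge_loss(margins):
    """Hinge surrogate: mean of max(0, 1 - y * h(x))."""
    return np.mean(np.maximum(0.0, 1.0 - margins))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
# Noisy labels: no halfspace is perfect, i.e., the agnostic setting.
y = np.sign(X[:, 0] + 0.3 * rng.normal(size=200))

alpha = 0.1 * rng.normal(size=200)   # arbitrary coefficients, for illustration
h = gaussian_kernel(X, X) @ alpha    # predictor values h(x_j) on the sample
margins = y * h

print(f"0-1 loss:   {zero_one_loss(margins):.3f}")
print(f"hinge loss: {hinge_loss(margins):.3f}  (always >= 0-1 loss)")
```

The gap between the two printed values is what surrogate-based methods such as SVMs accept in exchange for convexity; the paper's contribution is a finite-time guarantee stated for the 0-1 value itself, at a worst-case cost exponential in the Lipschitz constant $L$.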