Parallel randomized sampling for support vector machine (SVM) and support vector regression (SVR)

  • Authors:
  • Yumao Lu; Vwani Roychowdhury

  • Affiliations:
  • Yahoo! Inc., Applied Research, Sunnyvale, CA 94089, USA; University of California, Los Angeles, Department of Electrical Engineering, Los Angeles, CA, USA

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2008

Abstract

A parallel randomized support vector machine (PRSVM) and a parallel randomized support vector regression (PRSVR) algorithm, both based on a randomized sampling technique, are proposed in this paper. The proposed PRSVM and PRSVR have four major advantages over previous methods. (1) We prove that the proposed algorithms achieve an average convergence rate that is, to the best of our knowledge, the fastest bounded convergence rate among all SVM decomposition training algorithms. This fast average convergence bound is achieved by a unique priority-based sampling mechanism. (2) Unlike previous work (Provably fast training algorithm for support vector machines, 2001), the proposed algorithms handle general linearly non-separable SVM problems and general non-linear SVR problems. This improvement is achieved by modeling new LP-type problems based on the Karush–Kuhn–Tucker optimality conditions. (3) The proposed algorithms are the first parallel versions of randomized sampling algorithms for SVM and SVR. Both the analytical convergence bound and the numerical results from a real application show that the proposed algorithms scale well. (4) We demonstrate the algorithms on both synthetic data and data from a real-world application. Performance comparisons with SVMlight show that the proposed algorithms can be implemented efficiently.
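To make the priority-based sampling idea concrete, the following is a minimal single-process sketch of the general Clarkson-style randomized sampling loop that the abstract alludes to: train on a weighted random subset, check an approximate Karush–Kuhn–Tucker (margin) condition on the full set, and raise the sampling priority of violators. The function name priority_sampled_svm, the subset size, the weight-doubling factor, the tolerance, and the use of scikit-learn's SVC are illustrative assumptions, not the paper's parallel implementation.

```python
# Hypothetical sketch of priority-based randomized sampling for SVM training.
# Assumes binary labels y in {-1, +1}. Not the authors' PRSVM code.
import numpy as np
from sklearn.svm import SVC

def priority_sampled_svm(X, y, subset_size=200, max_rounds=50, seed=None):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    priority = np.ones(n)          # sampling weights, raised for KKT violators
    model = None
    for _ in range(max_rounds):
        idx = rng.choice(n, size=min(subset_size, n), replace=False,
                         p=priority / priority.sum())
        if np.unique(y[idx]).size < 2:
            continue               # resample if the subset holds only one class
        model = SVC(kernel="rbf", C=1.0).fit(X[idx], y[idx])
        # Approximate optimality check on the full set: points misclassified
        # or inside the margin of the current model count as violators.
        margins = y * model.decision_function(X)
        violators = margins < 1.0 - 1e-6
        if not violators.any():
            break                  # all points pass the (approximate) check
        priority[violators] *= 2.0 # double the weight of each violator
    return model
```

Because violators become exponentially more likely to be sampled, points that actually constrain the solution (the support vectors) are drawn into the working subset quickly; the parallel variants in the paper distribute this sampling and checking work across processors.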