Distributed support vector machines

  • Authors:
  • A. Navia-Vazquez; D. Gutierrez-Gonzalez; E. Parrado-Hernandez; J. J. Navarro-Abellan

  • Affiliations:
  • Dept. of Signal Theory & Communications, Univ. Carlos III de Madrid, Spain

  • Venue:
  • IEEE Transactions on Neural Networks
  • Year:
  • 2006


Abstract

A truly distributed (as opposed to parallelized) support vector machine (SVM) algorithm is presented. Training data are assumed to come from the same distribution and to be locally stored at a number of different locations with processing capabilities (nodes). In several examples, we have found that exchanging a reasonably small amount of information among nodes suffices to obtain an SVM solution that is better than the one obtained when classifiers are trained only on local data, and comparable to (although slightly worse than) that of the centralized approach, in which all the training data are available at a single place. We propose and analyze two distributed schemes: a "naïve" distributed chunking approach, in which raw data (support vectors) are communicated, and the more elaborate distributed semiparametric SVM, which aims at further reducing the total amount of information passed between nodes while providing a privacy-preserving mechanism for information sharing. We show the feasibility of our proposal by evaluating the performance of the algorithms on benchmarks with both synthetic and real-world datasets.
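
The abstract gives no implementation details, but the naïve distributed chunking idea it describes can be sketched in a few lines: each node trains a local SVM, circulates only its support vectors, and retrains on the union of its local data and the vectors received from other nodes. The sketch below is an illustrative assumption, not the authors' exact algorithm; the function names, the fixed round count, the RBF kernel, and the use of scikit-learn's SVC are all choices made here for concreteness.

    # Minimal sketch of distributed chunking (assumed reading of the
    # abstract, not the paper's exact procedure).
    import numpy as np
    from sklearn.svm import SVC

    def local_support_vectors(X, y, C=1.0):
        # Train a local SVM and return its support vectors with labels.
        clf = SVC(kernel="rbf", C=C).fit(X, y)
        return X[clf.support_], y[clf.support_]

    def distributed_chunking(node_data, rounds=3, C=1.0):
        # node_data: list of (X_i, y_i) partitions, one per node; each
        # partition is assumed to contain examples of both classes.
        n_features = node_data[0][0].shape[1]
        shared_X = np.empty((0, n_features))
        shared_y = np.empty((0,), dtype=node_data[0][1].dtype)
        for _ in range(rounds):
            round_X, round_y = [], []
            for X_i, y_i in node_data:
                # Augment local data with the support vectors circulated
                # so far -- the only raw data crossing node boundaries.
                X_aug = np.vstack([X_i, shared_X])
                y_aug = np.concatenate([y_i, shared_y])
                sv_X, sv_y = local_support_vectors(X_aug, y_aug, C)
                round_X.append(sv_X)
                round_y.append(sv_y)
            shared_X = np.vstack(round_X)
            shared_y = np.concatenate(round_y)
        # Any node can now build a near-centralized model from its own
        # data plus the circulated support vectors.
        X0, y0 = node_data[0]
        return SVC(kernel="rbf", C=C).fit(
            np.vstack([X0, shared_X]), np.concatenate([y0, shared_y]))

Note that this naïve scheme still ships raw training examples (the support vectors) between nodes; the semiparametric variant proposed in the paper instead communicates a compact model representation, which is what reduces traffic further and yields the privacy-preserving behavior mentioned above.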