Privacy-preserving SVM classification

  • Authors: Jaideep Vaidya; Hwanjo Yu; Xiaoqian Jiang

  • Affiliations: Rutgers University, Management Science and Information Systems Department, 07102, Newark, NJ, USA; University of Iowa, Department of Computer Science, 07102, Iowa City, IA, USA; University of Iowa, Department of Computer Science, 07102, Iowa City, IA, USA

  • Venue: Knowledge and Information Systems
  • Year: 2008

Abstract

Traditional Data Mining and Knowledge Discovery algorithms assume free access to data, either at a centralized location or in federated form. Increasingly, privacy and security concerns restrict this access, thus derailing data mining projects. What is required is distributed knowledge discovery that is sensitive to this problem. The key is to obtain valid results, while providing guarantees on the nondisclosure of data. Support vector machine classification is one of the most widely used classification methodologies in data mining and machine learning. It is based on solid theoretical foundations and has wide practical application. This paper proposes a privacy-preserving solution for support vector machine (SVM) classification, PP-SVM for short. Our solution constructs the global SVM classification model from data distributed at multiple parties, without disclosing the data of each party to others. Solutions are sketched out for data that is vertically, horizontally, or even arbitrarily partitioned. We quantify the security and efficiency of the proposed method, and highlight future challenges.
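
As a rough illustration of how a global model can be built from vertically partitioned data without pooling the raw records, the sketch below relies on one fact: for a linear kernel, the global Gram matrix equals the sum of the Gram matrices each party computes on its own feature columns. The summation step uses a simple ring-style masked sum, which is an assumption chosen for illustration and not necessarily the protocol the paper uses. The names `local_gram`, `masked_sum`, `MODULUS`, and the toy party data are all hypothetical, and the whole exchange is simulated in one process. Masking of this kind only hides individual contributions when three or more parties take part and do not collude.

```python
import random

# Assumed modulus for the masked sum; it must exceed every true Gram entry.
MODULUS = 2**61 - 1


def local_gram(rows):
    """Linear-kernel Gram matrix computed from one party's own columns."""
    n = len(rows)
    return [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(n)]
            for i in range(n)]


def masked_sum(matrices, rng):
    """Simulate a ring-style masked sum of equally sized matrices.

    The initiator adds a random mask, every party adds its own matrix to the
    running total, and the initiator removes the mask at the end, so no party
    sees another party's unmasked matrix while the total is in transit.
    """
    n = len(matrices[0])
    mask = [[rng.randrange(MODULUS) for _ in range(n)] for _ in range(n)]
    running = [row[:] for row in mask]
    for m in matrices:
        running = [[(running[i][j] + m[i][j]) % MODULUS for j in range(n)]
                   for i in range(n)]
    return [[(running[i][j] - mask[i][j]) % MODULUS for j in range(n)]
            for i in range(n)]


# Three records whose (non-negative integer) features are split by column
# across three hypothetical parties.
party_a = [[1, 2], [0, 1], [3, 1]]
party_b = [[4], [2], [0]]
party_c = [[1], [1], [2]]

# Reference only: the pooled records that no party would actually hold.
full = [a + b + c for a, b, c in zip(party_a, party_b, party_c)]

global_gram = masked_sum(
    [local_gram(party_a), local_gram(party_b), local_gram(party_c)],
    random.Random(0),
)

# The masked sum of local Gram matrices matches the Gram matrix of pooled data.
assert global_gram == local_gram(full)
```

The resulting matrix could then be handed to any SVM solver that accepts a precomputed kernel. Horizontally or arbitrarily partitioned data would need a different decomposition, which this sketch does not cover.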