Privacy-Preserving Evaluation of Generalization Error and Its Application to Model and Attribute Selection

  • Authors:
  • Jun Sakuma; Rebecca N. Wright

  • Affiliations:
  • University of Tsukuba, Tsukuba, Japan 305-8577; Rutgers University, Piscataway, USA 08854

  • Venue:
  • ACML '09 Proceedings of the 1st Asian Conference on Machine Learning: Advances in Machine Learning
  • Year:
  • 2009

Abstract

Privacy-preserving classification is the task of learning or training a classifier on the union of privately distributed datasets without sharing the datasets. Existing studies in privacy-preserving classification have primarily emphasized the design of privacy-preserving versions of particular data mining algorithms. However, in classification problems, preprocessing and postprocessing, such as model selection or attribute selection, play a prominent role in achieving higher classification accuracy. In this paper, we show that the generalization error of classifiers in privacy-preserving classification can be securely evaluated without sharing prediction results. Our main technical contribution is a new generalized Hamming distance protocol that is universally applicable to preprocessing and postprocessing of various privacy-preserving classification problems, such as model selection in support vector machines and attribute selection in naive Bayes classification.
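
To make the link between error evaluation and Hamming distance concrete, the following is a minimal plaintext sketch, not the secure protocol from the paper. It assumes 0/1 loss, under which the number of misclassified test examples is the Hamming distance between the vector of predicted labels and the vector of true labels. The function names and the sample label vectors are hypothetical, and no privacy protection is provided here: in the paper's setting, the two vectors would be held by different parties and never revealed to each other.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length label vectors differ."""
    if len(a) != len(b):
        raise ValueError("label vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def error_rate(predicted, actual):
    """Empirical 0/1 error: Hamming distance normalized by the test-set size."""
    if not actual:
        raise ValueError("test set must not be empty")
    return hamming_distance(predicted, actual) / len(actual)


# Hypothetical labels; in a privacy-preserving setting, one party would hold
# the predictions and another the true labels.
predicted = [1, 0, 1, 1, 0]
actual = [1, 1, 1, 0, 0]

print(hamming_distance(predicted, actual))
print(error_rate(predicted, actual))
```

Under this view, comparing candidate models or attribute subsets reduces to comparing such distances on held-out data, which is why a secure distance computation can serve model and attribute selection alike.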