Private multiparty sampling and approximation of vector combinations

  • Authors: Yuval Ishai; Tal Malkin; Martin J. Strauss; Rebecca N. Wright

  • Affiliations: Computer Science Department, Technion, Haifa 32000, Israel; Department of Computer Science, Columbia University, New York, NY 10025, USA; Departments of Math and EECS, University of Michigan, Ann Arbor, MI 48109, USA; Computer Science Department and DIMACS, Rutgers University, Piscataway, NJ 08854, USA

  • Venue: Theoretical Computer Science
  • Year: 2009

Abstract

We consider the problem of private, efficient data mining of vertically partitioned databases. Each of several parties holds a column of a data matrix (a vector), and the parties want to investigate the componentwise combination of their vectors. The parties want to minimize communication and local computation while guaranteeing privacy in the sense that no party learns more than necessary. Sublinear-communication private protocols have been studied primarily in the two-party case. In contrast, this work focuses on multi-party settings. First, we give efficient private multiparty protocols for sampling a row of the data matrix and for computing arbitrary functions of a random row, where the row index is additively shared among two or more parties. These results can be used to obtain private approximation protocols for several useful combination functionalities. Moreover, these results have some interesting consequences for the general problem of reducing sublinear-communication secure multiparty computation to two-party private information retrieval (PIR). Second, we give protocols for computing approximations (summaries) of the componentwise sum, minimum, and maximum of the columns. Here, while providing a weaker privacy guarantee (where the approximation may leak up to the entire output vector), our protocols are extremely efficient. In particular, the required cryptographic overhead (compared to non-private solutions) is polylogarithmic in the number of rows.
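To make the objects in the abstract concrete, the following is a minimal Python sketch of the functionality only: an additively shared row index, a row read out of a vertically partitioned matrix, and the componentwise sum, minimum, and maximum of the columns. It is not the paper's protocol and gives no privacy at all, since the shares are recombined and the columns are read in the clear. The function names, the toy data, and the choice of sharing modulo the number of rows are assumptions made for illustration.

```python
import secrets


def share_index(index, n_rows, n_parties):
    """Split a row index into additive shares modulo n_rows (illustrative helper)."""
    shares = [secrets.randbelow(n_rows) for _ in range(n_parties - 1)]
    last = (index - sum(shares)) % n_rows
    return shares + [last]


def reconstruct_index(shares, n_rows):
    """Recombine additive shares; a private protocol would never do this in the clear."""
    return sum(shares) % n_rows


def combine_columns(columns, op):
    """Apply op to each row across the parties' columns (componentwise combination)."""
    return [op(row) for row in zip(*columns)]


# Toy vertically partitioned database: each party holds one column of 4 rows.
columns = [
    [3, 1, 4, 1],  # party 0
    [5, 9, 2, 6],  # party 1
    [5, 3, 5, 8],  # party 2
]
n_rows = 4

# Row index 2 is split so that no single share determines it.
shares = share_index(2, n_rows, n_parties=3)
row = reconstruct_index(shares, n_rows)
sampled_row = [col[row] for col in columns]

col_sum = combine_columns(columns, sum)
col_min = combine_columns(columns, min)
col_max = combine_columns(columns, max)
```

Here each share on its own is a value in the range 0 to n_rows - 1 and only the sum of all shares modulo n_rows identifies the row. The protocols described in the abstract aim to compute on such shared indices with sublinear communication, which this plain-text sketch does not attempt.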