Performance-oriented privacy-preserving data integration

  • Authors: Raymond K. Pon; Terence Critchlow

  • Affiliations: UCLA Computer Science Department, Los Angeles, California; Lawrence Livermore National Laboratory, Livermore, California

  • Venue: DILS'05: Proceedings of the Second International Conference on Data Integration in the Life Sciences
  • Year: 2005

Abstract

Current solutions to integrating private data with public data have provided useful privacy metrics, such as relative information gain, that can be used to evaluate alternative approaches. Unfortunately, they have not addressed critical performance issues, especially when the public database is very large. The use of hashes and noise yields better performance than existing techniques, while still making it difficult for unauthorized entities to distinguish which data items truly exist in the private database. As we show here, the uncertainty introduced by hash collisions and by the injection of noise can be leveraged to perform a privacy-preserving relational join operation between a massive public table and a relatively small private one.
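
To make the general idea concrete, here is a minimal Python sketch of a join that relies on hash collisions and injected noise. It is an illustration written under stated assumptions, not the protocol from the paper: the abstract does not specify the hash function, the bucket count, or how noise is chosen, so the truncated SHA-256 bucketing, the `bits` and `noise_count` parameters, and all function names below are hypothetical. The private side sends only coarse bucket numbers (real ones mixed with decoys), the public side returns every row in those buckets, and the private side finishes the exact join locally.

```python
import hashlib
import random


def bucket(key: str, bits: int) -> int:
    """Map a key to one of 2**bits buckets via a truncated SHA-256 digest.

    A small `bits` value forces many keys to collide in the same bucket.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (1 << bits)


def build_query(private_keys, bits, noise_count, rng):
    """Private side: hash the private keys and mix in random decoy buckets."""
    real = {bucket(k, bits) for k in private_keys}
    unused = [b for b in range(1 << bits) if b not in real]
    decoys = set(rng.sample(unused, min(noise_count, len(unused))))
    return real | decoys


def public_side_filter(public_rows, query, bits):
    """Public side: return every row whose key falls in a requested bucket.

    The public side sees bucket numbers only, never the private keys.
    """
    return [row for row in public_rows if bucket(row[0], bits) in query]


def private_side_join(private_rows, candidates):
    """Private side: drop collision and decoy rows, keep exact key matches."""
    by_key = {}
    for key, value in candidates:
        by_key.setdefault(key, []).append(value)
    return [
        (key, priv, pub)
        for key, priv in private_rows
        for pub in by_key.get(key, [])
    ]


def demo():
    """Run the sketch on made-up data (all keys and values are hypothetical)."""
    rng = random.Random(0)
    public_rows = [(f"gene{i}", f"annotation{i}") for i in range(1000)]
    private_rows = [("gene7", "sampleA"), ("gene42", "sampleB"), ("geneX", "sampleC")]
    bits = 6  # 64 buckets, so each bucket holds many public keys
    query = build_query([k for k, _ in private_rows], bits, noise_count=5, rng=rng)
    candidates = public_side_filter(public_rows, query, bits)
    joined = private_side_join(private_rows, candidates)
    return query, candidates, joined
```

In this sketch, `bits` and `noise_count` act as the tuning knobs: fewer buckets and more decoys increase the public side's uncertainty about which keys are truly private, at the cost of returning more candidate rows for the private side to discard.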