Privacy-preserving indexing of documents on the network

  • Authors:
  • Mayank Bawa; Roberto J. Bayardo, Jr.; Rakesh Agrawal

  • Affiliations:
  • Stanford University, Stanford, CA; IBM Almaden Research Center, San Jose, CA; IBM Almaden Research Center, San Jose, CA

  • Venue:
  • VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
  • Year:
  • 2003

Abstract

We address the problem of providing privacy-preserving search over distributed access-controlled content. Indexed documents can be easily reconstructed from conventional (inverted) indexes used in search. The need to avoid breaches of access-control through the index requires the index hosting site to be fully secured and trusted by all participating content providers. This level of trust is impractical in the increasingly common case where multiple competing organizations or individuals wish to selectively share content. We propose a solution that eliminates the need for such a trusted authority. The solution builds a centralized privacy-preserving index in conjunction with a distributed access-control enforcing search protocol. The new index provides strong and quantifiable privacy guarantees that hold even if the entire index is made public. Experiments on a real-life dataset validate the performance of the scheme. The appeal of our solution is two-fold: (a) content providers maintain complete control in defining access groups and ensuring compliance with them, and (b) system implementors retain tunable knobs to balance privacy and efficiency concerns for their particular domains.
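
The split between a public index and provider-side access control can be illustrated with a small sketch. The abstract does not specify the index structure, so everything below is an assumption made for illustration: the `Provider` class, the grouping of providers, and the functions `build_coarse_index` and `search` are hypothetical names, not the paper's construction. The sketch assumes the index maps each term only to a group of providers, so the index alone does not reveal which member holds the term, and each provider checks access rights itself when queried.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Hypothetical content provider that enforces its own access control."""
    name: str
    # doc_id -> (set of terms in the document, set of users allowed to see it)
    docs: dict = field(default_factory=dict)

    def terms(self):
        """Return every term that occurs in any of this provider's documents."""
        out = set()
        for words, _allowed in self.docs.values():
            out |= words
        return out

    def search(self, user, term):
        """Return matching doc ids, checked locally against the access list."""
        return sorted(
            doc_id
            for doc_id, (words, allowed) in self.docs.items()
            if term in words and user in allowed
        )


def build_coarse_index(groups):
    """Map each term to the ids of the provider groups in which it occurs.

    Assumption: only group ids are stored, so a reader of the public index
    cannot tell which member of a group actually holds a given term.
    """
    index = {}
    for gid, members in enumerate(groups):
        for provider in members:
            for term in provider.terms():
                index.setdefault(term, set()).add(gid)
    return index


def search(index, groups, user, term):
    """Look up candidate groups, then ask every member of each group."""
    hits = {}
    for gid in sorted(index.get(term, ())):
        for provider in groups[gid]:
            found = provider.search(user, term)
            if found:
                hits[provider.name] = found
    return hits


# Toy data: three providers split into two groups (all values are made up).
a = Provider("A", {"a1": ({"merger", "budget"}, {"alice"})})
b = Provider("B", {"b1": ({"budget"}, {"bob"})})
c = Provider("C", {"c1": ({"patent"}, {"alice", "bob"})})
groups = [[a, b], [c]]
index = build_coarse_index(groups)
```

In this toy setup, larger groups hide more about who holds a term but force the searcher to contact more providers per query, which is one way to picture the privacy and efficiency trade-off the abstract mentions.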