Privacy-preserving similarity-based text retrieval

  • Authors: Hweehwa Pang; Jialie Shen; Ramayya Krishnan

  • Affiliations: Singapore Management University; Singapore Management University; Carnegie Mellon University, Pittsburgh, PA

  • Venue: ACM Transactions on Internet Technology (TOIT)
  • Year: 2010

Abstract

Users of online services are increasingly wary that their activities could disclose confidential information about their business or personal affairs. It would be desirable for an online document service to perform text retrieval for users while protecting the privacy of their activities. In this article, we introduce a privacy-preserving, similarity-based text retrieval scheme that (a) prevents the server from accurately reconstructing the term composition of queries and documents, and (b) anonymizes the search results from unauthorized observers. At the same time, our scheme preserves the relevance ranking of the search server and enables accounting of the number of documents that each user opens. The effectiveness of the scheme is verified empirically with two real text corpora.