N-gram based secure similar document detection

  • Authors:
  • Wei Jiang;Bharath K. Samanthula

  • Affiliations:
  • Department of Computer Science, Missouri S&T, Rolla, MO;Department of Computer Science, Missouri S&T, Rolla, MO

  • Venue:
  • DBSec'11 Proceedings of the 25th annual IFIP WG 11.3 conference on Data and applications security and privacy
  • Year:
  • 2011

Abstract

Secure similar document detection (SSDD) plays an important role in many applications, such as justifying the need-to-know basis and facilitating communication between government agencies. The SSDD problem considers situations where Alice, who holds a query document, wants to find similar information in Bob's document collection. During this process, the content of the query document is not disclosed to Bob, and Bob's document collection is not disclosed to Alice. Existing SSDD protocols are developed under the vector space model, which has the advantage of identifying globally similar information. To effectively and securely detect similar documents with overlapping text fragments, this paper proposes a novel n-gram based SSDD protocol.
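As a rough illustration of why n-grams suit the detection of overlapping text fragments, the sketch below compares two documents in the clear by the overlap of their character n-gram sets. It is not the paper's protocol and provides no privacy: the function names, the choice of character 3-grams, and the use of the Jaccard coefficient as the similarity measure are all assumptions made for this example. A secure protocol would need to compute a comparable score without either party revealing its n-gram set.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a document.

    Assumption for this sketch: text is lowercased and runs of
    whitespace are collapsed to one space before extraction.
    """
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def ngram_similarity(doc_a, doc_b, n=3):
    """Plaintext (non-secure) n-gram similarity between two documents."""
    return jaccard(char_ngrams(doc_a, n), char_ngrams(doc_b, n))


if __name__ == "__main__":
    query = "The quick brown fox jumps over the lazy dog."
    candidate = "A quick brown fox jumped over a lazy dog yesterday."
    print(ngram_similarity(query, candidate))
```

Because the score depends only on shared local fragments, two documents that copy a passage from each other can score well even when their overall term distributions differ, which is the case the vector space model handles less directly.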