IRILD: An Information Retrieval Based Method for Information Leak Detection

  • Authors:
  • Eleni Gessiou; Quang Hieu Vu; Sotiris Ioannidis

  • Venue:
  • EC2ND '11 Proceedings of the 2011 Seventh European Conference on Computer Network Defense
  • Year:
  • 2011

Abstract

The traditional approach to detecting information leaks is to generate fingerprints of sensitive data by partitioning and hashing it, and then to compare these fingerprints against outgoing documents. Unfortunately, this approach incurs a high computational cost because every part of a document must be checked. As a result, it does not scale to systems with a large number of documents to protect. The approach is also prone to false positives when the fingerprints correspond to common phrases. In this paper, we propose an improvement to this approach that offers much faster processing with fewer false positives. The core idea of our solution is to eliminate common phrases and non-sensitive phrases from the fingerprinting process. Non-sensitive phrases are identified by examining the publicly available documents of the organization that we want to protect from information leaks, and common phrases are identified with the help of a search engine. In this way, our solution both accelerates leak detection and increases the accuracy of the results. Experiments were conducted on real-world data to demonstrate the efficiency and effectiveness of the proposed solution.
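
The filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the data is partitioned or hashed, so the word 4-grams, the SHA-256 hash, and the names `shingles`, `fingerprint`, `build_fingerprints`, and `leaked_phrases` are all assumptions made for illustration. The search engine lookup is likewise replaced by a hypothetical caller-supplied predicate `is_common`.

```python
import hashlib
import re
from typing import Callable, Iterable, Iterator, List, Set


def shingles(text: str, k: int = 4) -> Iterator[str]:
    """Yield every run of k consecutive lowercased words in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    for i in range(len(words) - k + 1):
        yield " ".join(words[i:i + k])


def fingerprint(phrase: str) -> str:
    """Hash one phrase (SHA-256 is an assumption, not taken from the paper)."""
    return hashlib.sha256(phrase.encode("utf-8")).hexdigest()


def build_fingerprints(
    sensitive_docs: Iterable[str],
    public_docs: Iterable[str],
    is_common: Callable[[str], bool],
    k: int = 4,
) -> Set[str]:
    """Fingerprint sensitive phrases, skipping non-sensitive and common ones."""
    # Phrases that already appear in the organization's public documents
    # are treated as non-sensitive.
    public_phrases = {p for doc in public_docs for p in shingles(doc, k)}
    kept: Set[str] = set()
    for doc in sensitive_docs:
        for phrase in shingles(doc, k):
            # is_common is a hypothetical stand-in for the search engine check.
            if phrase in public_phrases or is_common(phrase):
                continue
            kept.add(fingerprint(phrase))
    return kept


def leaked_phrases(outgoing: str, fingerprints: Set[str], k: int = 4) -> List[str]:
    """Return the phrases of an outgoing document that match a fingerprint."""
    return [p for p in shingles(outgoing, k) if fingerprint(p) in fingerprints]
```

In this sketch, a smaller fingerprint set means fewer entries to store and fewer chances for an outgoing document to match on an innocuous phrase. Any speed or accuracy gain in practice would depend on the partitioning scheme and on how common phrases are actually identified.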