Application of q-Gram Distance in Digital Forensic Search

  • Authors:
  • Slobodan Petrović;Sverre Bakke

  • Affiliations:
  • NISlab, Department of Computer Science and Media Technology, Gjøvik University College, Gjøvik, Norway 2802;NISlab, Department of Computer Science and Media Technology, Gjøvik University College, Gjøvik, Norway 2802

  • Venue:
  • IWCF '08: Proceedings of the 2nd International Workshop on Computational Forensics
  • Year:
  • 2008

Abstract

Digital forensic investigations often search large data sets for evidence. Such searches need fault-tolerant distance measures so that evidence can be detected even if it has been distorted or partially erased from the search media. One suitable measure is the constrained edit distance, where the constraints are the maximum numbers of consecutive insertions and deletions. However, the time complexity of its computation is too high. We propose a two-phase indexless search procedure for forensic evidence search that uses the q-gram distance instead of the constrained edit distance. The q-gram distance is known to approximate the unconstrained edit distance well. We study how well it approximates the edit distance with the special constraints needed in forensic search applications, and we compare the performance of the search procedure under each of the two distances. Experimental results show that, for some values of q, the procedure with the q-gram distance achieves almost the same accuracy as the one with the constrained edit distance, while being much more efficient because the q-gram distance has a much lower time complexity of computation.
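
The abstract does not define the q-gram distance. As a rough illustration, the sketch below assumes the common definition: the sum of absolute differences between the q-gram occurrence counts of two strings. The function names `qgram_profile` and `qgram_distance` are hypothetical, and this is not the authors' implementation or their two-phase search procedure.

```python
from collections import Counter


def qgram_profile(text: str, q: int) -> Counter:
    """Count every overlapping substring of length q in text."""
    if q <= 0:
        raise ValueError("q must be a positive integer")
    # A string shorter than q has no q-grams, so the profile is empty.
    return Counter(text[i:i + q] for i in range(len(text) - q + 1))


def qgram_distance(a: str, b: str, q: int) -> int:
    """Assumed definition: L1 distance between the two q-gram profiles."""
    profile_a = qgram_profile(a, q)
    profile_b = qgram_profile(b, q)
    # Counter returns 0 for q-grams that are missing from one profile.
    return sum(abs(profile_a[g] - profile_b[g])
               for g in profile_a.keys() | profile_b.keys())
```

For instance, `qgram_distance("abcd", "abed", 2)` compares the profiles {ab, bc, cd} and {ab, be, ed} and returns 4. Both profiles are built in a single pass over each string, whereas the usual dynamic-programming computation of edit distance fills a table whose size is the product of the two string lengths.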