SÁDI - Statistical Analysis for Data Type Identification

  • Authors:
  • Sarah J. Moody;Robert F. Erbacher

  • Affiliations:
  • -;-

  • Venue:
  • SADFE '08 Proceedings of the 2008 Third International Workshop on Systematic Approaches to Digital Forensic Engineering
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

A key task in digital forensic analysis is the location of relevant information within the computer system. Identification of the relevancy of data is often dependent upon the identification of the type of data being examined. Typical file type identification is based upon file extension or magic keys. These typical techniques fail in many typical forensic analysis scenarios such as needing to deal with embedded data, such as with Microsoft Word files, or file fragments. The SÁDI (Statistical Analysis Data Identification) technique applies statistical analysis of the byte values of the data in such a way that the accuracy of the technique does not rely on the potentially misleading metadata information but rather the values of the data itself. The development of SÁDI provides the capability to identify what digitally stored data actually represents and will also allow for the selective extraction of portions of the data for additional investigation; i.e., in the case of embedded data. Thus, our research provides a more effective type identification technique that does not fail on file fragments, embedded data types, or with obfuscated data.