Statistical Disk Cluster Classification for File Carving

Authors:
Cor J. Veenman
Affiliations:
University of Amsterdam, the Netherlands / Netherlands Forensic Institute, the Netherlands
Venue:
IAS '07 Proceedings of the Third International Symposium on Information Assurance and Security
Year:
2007

Cited by (10)

Computational Forensics: An Overview
IWCF '08 Proceedings of the 2nd international workshop on Computational Forensics

On Improving the Accuracy and Performance of Content-Based File Type Identification
ACISP '09 Proceedings of the 14th Australasian Conference on Information Security and Privacy

Making sense of unstructured flash-memory dumps
Proceedings of the 2010 ACM Symposium on Applied Computing

An intelligent technique to detect file formats and e-mail spam
Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India

Classification of packet contents for malware detection
Journal in Computer Virology

Predicting the types of file fragments
Digital Investigation: The International Journal of Digital Forensics & Incident Response

Automated mapping of large binary objects using primitive fragment type classification
Digital Investigation: The International Journal of Digital Forensics & Incident Response

Using purpose-built functions and block hashes to enable small block and sub-file forensics
Digital Investigation: The International Journal of Digital Forensics & Incident Response

The Normalised Compression Distance as a file fragment classifier
Digital Investigation: The International Journal of Digital Forensics & Incident Response

An adaptive method to identify disk cluster size based on block content
Digital Investigation: The International Journal of Digital Forensics & Incident Response

Abstract

File carving is the process of recovering files from a disk without the help of a file system. In forensics, it is a helpful tool for finding hidden or recently removed disk content. Known signatures in file headers and footers are especially useful for carving such files out, that is, extracting everything from header to footer. However, this approach assumes that file clusters remain in order. When files are fragmented, their clusters can be disconnected and even reordered, so that straightforward carving fails. In this paper, we focus on methods for classifying clusters into file types using the statistics of the clusters. By not exploiting possible embedded signatures, we generate evidence from a different source that can be integrated later on. We propose a set of characteristic features and use statistical pattern recognition to learn a supervised classification model for a range of relevant file types. We exploit the statistics of a restricted number of neighboring clusters (context) to improve classification performance. In the experiments, we show that the proposed features indeed enable the differentiation of clusters into file types. Moreover, for some file types the incorporation of cluster context improves the recognition performance significantly.
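
To illustrate the idea of per-cluster statistics with neighbor context, here is a minimal sketch. The abstract does not name the features used, so the byte-level Shannon entropy below is an assumption chosen for illustration, as are the 4096-byte cluster size and the hypothetical helper names `split_clusters`, `byte_entropy`, and `context_entropy`. It is not the paper's feature set or classifier.

```python
import math
from collections import Counter

CLUSTER_SIZE = 4096  # assumed cluster size in bytes (not stated in the abstract)


def split_clusters(data: bytes, size: int = CLUSTER_SIZE) -> list[bytes]:
    """Cut a raw disk image into fixed-size clusters (last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def byte_entropy(cluster: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not cluster:
        return 0.0
    n = len(cluster)
    counts = Counter(cluster)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def context_entropy(clusters: list[bytes], index: int, radius: int = 1) -> float:
    """Mean entropy over a cluster and up to `radius` neighbors on each side.

    This is one simple way to use a restricted neighborhood as context;
    the window is clipped at the start and end of the image.
    """
    lo = max(0, index - radius)
    hi = min(len(clusters), index + radius + 1)
    window = clusters[lo:hi]
    return sum(byte_entropy(c) for c in window) / len(window)
```

A cluster of compressed or encrypted data would tend to score near 8 bits per byte, while plain text or sparse data would score lower. In a supervised setting, values like these would be collected into feature vectors with known file-type labels and handed to a classifier.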