md5bloom: Forensic filesystem hashing revisited

Authors:
Vassil Roussev; Yixin Chen; Timothy Bourg; Golden G. Richard III
Affiliations:
Department of Computer Science, University of New Orleans, New Orleans, LA 70148, USA (all authors)
Venue:
Digital Investigation: The International Journal of Digital Forensics & Incident Response
Year:
2006

Abstract

Hashing is a fundamental tool in digital forensic analysis, used both to ensure data integrity and to efficiently identify known data objects. However, despite many years of practice, its basic use has advanced little. Our objective is to leverage advanced hashing techniques to improve the efficiency and scalability of digital forensic analysis. Specifically, we explore the use of Bloom filters as a means to efficiently aggregate and search hashing information. In this paper, we present md5bloom, a working Bloom filter manipulation tool that can be incorporated into forensic practice, along with example uses and experimental results. We also provide a basic theoretical foundation that quantifies the error rates associated with the various Bloom filter uses, together with a simulation-based verification. In addition, we provide a probabilistic framework for interpreting direct, bitwise comparisons of Bloom filters in order to infer similarity and abnormality. Using the similarity interpretation, it is possible to efficiently identify versions of a known object, whereas the notion of abnormality could aid in identifying tampered hash sets.
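To make the idea of aggregating hashes in a Bloom filter and comparing filters bitwise more concrete, the following is a minimal Python sketch. It is not the md5bloom implementation. The class `BloomFilter`, the method `matching_bits`, the function `false_positive_rate`, the filter size, and the choice to derive bit positions from the four 32-bit words of an MD5 digest are all assumptions made for illustration. The false positive estimate is the standard textbook approximation (1 - e^(-kn/m))^k, not a formula taken from the paper.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter of 2**log2_bits bits (hypothetical layout)."""

    def __init__(self, log2_bits=16, k=4):
        if not 1 <= k <= 4:
            raise ValueError("an MD5 digest yields at most four 32-bit words")
        if not 3 <= log2_bits <= 32:
            raise ValueError("log2_bits must be between 3 and 32")
        self.m = 1 << log2_bits            # number of bits in the filter
        self.k = k                         # bit positions set per element
        self.bits = bytearray(self.m // 8)

    def _positions(self, data):
        # Assumption: slice the 128-bit MD5 digest into 32-bit words and
        # reduce each word to a bit index in [0, m).
        digest = hashlib.md5(data).digest()
        for i in range(self.k):
            word = int.from_bytes(digest[4 * i:4 * i + 4], "big")
            yield word % self.m

    def add(self, data):
        for pos in self._positions(data):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, data):
        # May return a false positive, never a false negative.
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(data))

    def matching_bits(self, other):
        """Count the bits that are set to 1 in both filters."""
        if (self.m, self.k) != (other.m, other.k):
            raise ValueError("filters must share the same parameters")
        return sum(bin(a & b).count("1")
                   for a, b in zip(self.bits, other.bits))


def false_positive_rate(m, k, n):
    """Standard approximation for n elements in an m-bit filter with k hashes."""
    return (1.0 - math.exp(-k * n / m)) ** k


def build(blocks):
    bf = BloomFilter()
    for block in blocks:
        bf.add(block)
    return bf


# Hypothetical data: 100 "blocks" of an object, a revised version that shares
# 90 of them, and an unrelated object with no blocks in common.
blocks = [b"block-%d" % i for i in range(100)]
original = build(blocks)
revised = build(blocks[:90] + [b"changed-%d" % i for i in range(10)])
unrelated = build([b"other-%d" % i for i in range(100)])

print("known block present:", blocks[0] in original)
print("bits shared with revised version:", original.matching_bits(revised))
print("bits shared with unrelated object:", original.matching_bits(unrelated))
print("estimated false positive rate:", false_positive_rate(original.m, original.k, 100))
```

In this sketch, two filters built from largely overlapping inputs share far more set bits than filters built from unrelated inputs, which is the intuition behind reading a bitwise comparison as a similarity signal. Deciding how many shared bits count as significant is the role of the probabilistic framework described in the abstract and is not modeled here.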