Extracting Output Formats from Executables

  • Authors: Junghee Lim; Thomas Reps; Ben Liblit
  • Affiliations: University of Wisconsin-Madison, USA (all authors)
  • Venue: WCRE '06 Proceedings of the 13th Working Conference on Reverse Engineering
  • Year: 2006

Abstract

We describe the design and implementation of FFE/x86 (File-Format Extractor for x86), an analysis tool that works on stripped executables (i.e., neither source code nor debugging information need be available) and extracts output data formats, such as file formats and network packet formats. We first construct a Hierarchical Finite State Machine (HFSM) that over-approximates the output data format. An HFSM defines a language over the operations used to generate output data. We use Value-Set Analysis (VSA) and Aggregate Structure Identification (ASI) to annotate HFSMs with information that partially characterizes some of the output data values. VSA determines an over-approximation of the set of addresses and integer values that each data object can hold at each program point, and ASI analyzes memory accesses in the program to recover information about the structure of aggregates. A series of filtering operations is performed to over-approximate an HFSM with a finite-state machine, which can result in a final answer that is easier to understand. Our experiments with FFE/x86 uncovered a possible bug in the image-conversion utility png2ico.
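
To make the idea of "a language over the operations used to generate output data" more concrete, the following is a minimal Python sketch of a hierarchical state machine whose transitions are labeled either with an output operation or with a call to a nested machine. It is not FFE/x86's implementation or data model: the dictionary layout, the machine names (`main`, `write_header`), and the operation names (`write_magic`, `write_size`, `write_pixel`, `close`) are all hypothetical, and the sketch assumes the hierarchy is non-recursive.

```python
# Hypothetical HFSM: machine name -> (start state, final states, transitions).
# A transition label is ("op", name) for one output operation, or
# ("call", machine) for a nested machine. All names here are illustrative.
HFSM = {
    "main": ("m0", {"m2"}, [
        ("m0", ("call", "write_header"), "m1"),
        ("m1", ("op", "write_pixel"), "m1"),
        ("m1", ("op", "close"), "m2"),
    ]),
    "write_header": ("h0", {"h2"}, [
        ("h0", ("op", "write_magic"), "h1"),
        ("h1", ("op", "write_size"), "h2"),
    ]),
}


def end_positions(hfsm, machine, ops, start_pos):
    """Return every index j such that `machine` can consume ops[start_pos:j].

    Assumes machines do not call each other recursively.
    """
    start, finals, transitions = hfsm[machine]
    seen = {(start, start_pos)}
    worklist = [(start, start_pos)]
    ends = set()
    while worklist:
        state, pos = worklist.pop()
        if state in finals:
            ends.add(pos)
        for src, (kind, name), dst in transitions:
            if src != state:
                continue
            if kind == "op":
                # Consume exactly one matching operation, if there is one.
                matched = pos < len(ops) and ops[pos] == name
                nexts = {pos + 1} if matched else set()
            else:
                # Run the nested machine from the current position.
                nexts = end_positions(hfsm, name, ops, pos)
            for nxt in nexts:
                if (dst, nxt) not in seen:
                    seen.add((dst, nxt))
                    worklist.append((dst, nxt))
    return ends


def accepts(hfsm, ops, top="main"):
    """True if the whole operation sequence is in the top machine's language."""
    return len(ops) in end_positions(hfsm, top, ops, 0)
```

Under these assumptions, `accepts(HFSM, ["write_magic", "write_size", "write_pixel", "close"])` holds, while a sequence that skips `write_size` is rejected. An over-approximating machine of this kind may accept sequences the real program never emits, but it should not reject any sequence the program can emit.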