Labeling library functions in stripped binaries

  • Authors: Emily R. Jacobson; Nathan Rosenblum; Barton P. Miller

  • Affiliation: University of Wisconsin, Madison, WI, USA (all authors)

  • Venue: Proceedings of the 10th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools
  • Year: 2011

Abstract

Binary code presents unique analysis challenges, particularly when debugging information has been stripped from the executable. Among the valuable information lost in stripping are the identities of standard library functions linked into the executable; knowing the identities of such functions can help to optimize automated analysis and is instrumental in understanding program behavior. Library fingerprinting attempts to restore the names of library functions in stripped binaries, using signatures extracted from reference libraries. Existing methods are brittle in the face of variations in the toolchain that produced the reference libraries and do not generalize well to new library versions. We introduce semantic descriptors, high-level representations of library functions that avoid the brittleness of existing approaches. We have extended a tool, unstrip, to apply this technique to fingerprint wrapper functions in the GNU C library. unstrip discovers functions in a stripped binary and outputs a new binary, with meaningful names added to the symbol table. Other tools can leverage these symbols to perform further analysis. We demonstrate that our semantic descriptors generalize well and substantially outperform existing library fingerprinting techniques.
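To make the idea of matching on behavior rather than on exact bytes concrete, here is a minimal sketch. It assumes, purely for illustration, that a semantic descriptor can be modeled as the set of system calls (with any constant arguments) that a wrapper function invokes; the abstract does not define the descriptor format. The names `make_descriptor`, `label_function`, and `REFERENCE` are hypothetical and are not part of unstrip. The syscall numbers are x86-64 Linux values, used only as sample data.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

# Hypothetical model (an assumption, not the paper's definition): a descriptor
# is the set of (syscall number, constant arguments) pairs a function invokes.
SyscallUse = Tuple[int, Tuple[int, ...]]
Descriptor = FrozenSet[SyscallUse]


def make_descriptor(calls: Iterable[Tuple[int, Iterable[int]]]) -> Descriptor:
    """Normalize observed system calls into an order-independent descriptor."""
    return frozenset((number, tuple(args)) for number, args in calls)


# Hypothetical reference table, as might be built from a reference library.
# Keys are descriptors; values are the names to restore in the symbol table.
REFERENCE: Dict[Descriptor, str] = {
    make_descriptor([(3, [])]): "close",    # x86-64 Linux: close is syscall 3
    make_descriptor([(39, [])]): "getpid",  # x86-64 Linux: getpid is syscall 39
}


def label_function(
    observed: Descriptor, reference: Dict[Descriptor, str]
) -> Optional[str]:
    """Return the name whose descriptor matches exactly, or None if unknown."""
    return reference.get(observed)


if __name__ == "__main__":
    # A function in the stripped binary that only invokes syscall 39.
    print(label_function(make_descriptor([(39, [])]), REFERENCE))
    # A function whose behavior is not in the reference table.
    print(label_function(make_descriptor([(60, [])]), REFERENCE))
```

In this sketch the lookup key never mentions instruction encodings or register choices, so two builds of the same wrapper produced by different toolchains would map to the same key. That is the property the abstract contrasts with byte-level signatures; a real descriptor would likely need to handle wrappers that share a system call, which exact set matching alone does not resolve.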