Who wrote this code? Identifying the authors of program binaries

  • Authors: Nathan Rosenblum; Xiaojin Zhu; Barton P. Miller

  • Affiliations: University of Wisconsin, Madison, Wisconsin (all authors)

  • Venue: ESORICS'11: Proceedings of the 16th European Conference on Research in Computer Security
  • Year: 2011

Abstract

Program authorship attribution, the task of identifying a programmer from stylistic characteristics of code, has practical implications for software theft detection, digital forensics, and malware analysis. Attribution is challenging in these domains because usually only binary code is available, and existing source code-based approaches have left unclear whether, and to what extent, programmer style survives the compilation process. Casting authorship attribution as a machine learning problem, we present a novel program representation and techniques that automatically detect the stylistic features of binary code. We apply these techniques to two attribution problems: identifying the precise author of a program, and finding stylistic similarities between programs by unknown authors. Our experiments provide strong evidence that programmer style is preserved in program binaries.