Native x86 decompilation using semantics-preserving structural analysis and iterative control-flow structuring

Authors:
Edward J. Schwartz;JongHyup Lee;Maverick Woo;David Brumley
Affiliations:
Carnegie Mellon University;Korea National University of Transportation;Carnegie Mellon University;Carnegie Mellon University
Venue:
SEC'13 Proceedings of the 22nd USENIX conference on Security
Year:
2013

Abstract

There are many security tools and techniques for analyzing software, but many of them require access to source code. We propose leveraging decompilation, the study of recovering abstractions from compiled code, to apply existing source-based tools and techniques to compiled programs. A decompiler should focus on two properties to be used for security. First, it should recover abstractions as much as possible to minimize the complexity that must be handled by the security analysis that follows. Second, it should aim to recover these abstractions correctly. Previous work in control-flow structuring, an abstraction recovery problem used in decompilers, does not provide either of these properties. Specifically, existing structuring algorithms are not semantics-preserving, which means that they cannot safely be used for decompilation without modification. Existing structural algorithms also miss opportunities for recovering control flow structure. We propose a new structuring algorithm in this paper that addresses these problems. We evaluate our decompiler, Phoenix, and our new structuring algorithm, on a set of 107 real-world programs from GNU coreutils. Our evaluation is an order of magnitude larger than previous systematic studies of end-to-end decompilers. We show that our decompiler outperforms the de facto industry standard decompiler Hex-Rays in correctness by 114%, and recovers 30× more control-flow structure than existing structuring algorithms in the literature.
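As a rough illustration of what control-flow structuring means, the sketch below collapses a single if-then-else "diamond" in a toy control-flow graph into one structured node. It is not the Phoenix algorithm or any algorithm from the paper. The graph encoding (a dict mapping each node name to its successor list, with every node present as a key), the function name `structure_if_else`, and the label format of the merged node are all assumptions made for this example. The pattern is applied only when each arm has the branch head as its sole predecessor, which is one simple way to avoid changing the program's behavior when collapsing.

```python
def structure_if_else(cfg):
    """Collapse one if-then-else diamond into a single structured node.

    cfg is a hypothetical toy encoding: {node: [successors]}, where every
    node appears as a key. Returns (new_cfg, merged_name), or (cfg, None)
    when no diamond matches.
    """
    # Record predecessors so an arm is collapsed only if nothing else jumps into it.
    preds = {node: [] for node in cfg}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].append(src)

    for head, succs in cfg.items():
        if len(succs) != 2:
            continue  # not a two-way branch
        then_arm, else_arm = succs
        if then_arm == else_arm:
            continue
        # Each arm must be entered only from the head.
        if preds[then_arm] != [head] or preds[else_arm] != [head]:
            continue
        # Both arms must fall through to the same single join node.
        if len(cfg[then_arm]) != 1 or cfg[then_arm] != cfg[else_arm]:
            continue
        join = cfg[then_arm][0]
        if join in (head, then_arm, else_arm):
            continue  # a back edge would make this a loop, not a diamond

        merged = f"if({head}){{{then_arm}}}else{{{else_arm}}}"
        removed = (head, then_arm, else_arm)
        new_cfg = {}
        for node, node_succs in cfg.items():
            if node in removed:
                continue
            # Redirect edges that targeted the branch head to the merged node.
            new_cfg[node] = [merged if d == head else d for d in node_succs]
        new_cfg[merged] = [join]
        return new_cfg, merged

    return cfg, None


# A structured diamond: entry -> cond -> {a, b} -> join
diamond = {"entry": ["cond"], "cond": ["a", "b"], "a": ["join"], "b": ["join"], "join": []}
structured, merged_name = structure_if_else(diamond)

# An extra edge entry -> a means arm "a" has two predecessors, so the
# pattern is rejected and the graph is returned unchanged.
tangled = {"entry": ["cond", "a"], "cond": ["a", "b"], "a": ["join"], "b": ["join"], "join": []}
unchanged, no_match = structure_if_else(tangled)
```

Refusing to collapse the second graph is the point of the predecessor check: a structurer that forced the diamond pattern there would lose the direct path from `entry` to `a`.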