Unboxed objects and polymorphic typing
POPL '92 Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Specifying representations of machine instructions
ACM Transactions on Programming Languages and Systems (TOPLAS)
Specifying the Semantics of Machine Instructions
IWPC '98 Proceedings of the 6th International Workshop on Program Comprehension
A system for generating static analyzers for machine instructions
CC'08/ETAPS'08 Proceedings of the Joint European Conferences on Theory and Practice of Software 17th international conference on Compiler construction
The BINCOA framework for binary code analysis
CAV'11 Proceedings of the 23rd international conference on Computer aided verification
Precise Static Analysis of Binaries by Extracting Relational Information
WCRE '11 Proceedings of the 2011 18th Working Conference on Reverse Engineering
A trustworthy monadic formalization of the ARMv7 instruction set architecture
ITP'10 Proceedings of the First international conference on Interactive Theorem Proving
GDSL: A Generic Decoder Specification Language for Interpreting Machine Language
Electronic Notes in Theoretical Computer Science (ENTCS)
Deriving a complete type inference for hindley-milner and vector sizes using expansion
PEPM '13 Proceedings of the ACM SIGPLAN 2013 workshop on Partial evaluation and program manipulation
Hi-index | 0.00 |
Any inspection, analysis or reverse engineering of binaries requires a translation of the program text into an intermediate representation (IR) that conveys the semantics of the program. To this end, we propose a domain specific language called GDSL (Generic Decoder Specification Language) that facilitates the translation from byte streams to instructions and from there to other intermediate representations. We present the GDSL toolkit, containing a compiler from GDSL to C, instruction decoders (currently for Intel x86 and Atmel AVR), translations to semantics, and optimizations of the semantics. Other processors, semantics and optimizations can be added, thereby providing a common platform for building frontends for the analysis of binaries. The emitted C code is human-readable and outperforms hand-written code such as the XED decoder shipped with the Intel Pin toolkit.