A new algorithm for identifying loops in decompilation

Authors:
Tao Wei; Jian Mao; Wei Zou; Yu Chen
Affiliations:
Institute of Computer Science and Technology, Peking University (all four authors)
Venue:
SAS'07 Proceedings of the 14th international conference on Static Analysis
Year:
2007

References cited by this paper (9):

Identifying loops using DJ graphs. ACM Transactions on Programming Languages and Systems (TOPLAS)
Nesting of reducible and irreducible loops. ACM Transactions on Programming Languages and Systems (TOPLAS)
Advanced compiler design and implementation
Identifying loops in almost linear time. ACM Transactions on Programming Languages and Systems (TOPLAS)
Introduction to algorithms
On loops, dominators, and dominance frontiers. ACM Transactions on Programming Languages and Systems (TOPLAS)
Control flow analysis. Proceedings of a symposium on Compiler optimization
Global common subexpression elimination. Proceedings of a symposium on Compiler optimization
Testing flow graph reducibility. Journal of Computer and System Sciences

Cited by (2):

COSTA: Design and Implementation of a Cost and Termination Analyzer for Java Bytecode. Formal Methods for Components and Objects
Task-level analysis for a language with async/finish parallelism. Proceedings of the 2011 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems

Abstract

Loop identification is an essential step of control flow analysis in decompilation. The classical algorithm for identifying loops is Tarjan's interval-finding algorithm, which is restricted to reducible graphs. Havlak presents an extension of Tarjan's algorithm that handles irreducible graphs and constructs a loop-nesting forest for an arbitrary flow graph. There is evidence that the running time of this algorithm is quadratic in the worst case, not almost linear as claimed. Ramalingam presents an improved algorithm with low time complexity on arbitrary graphs, but it does not perform well on "real" control flow graphs (CFGs). We present a novel algorithm for identifying loops in arbitrary CFGs. Based on a more detailed exploration of the properties of loops and depth-first search (DFS), this algorithm traverses a CFG only once using DFS and collects all the information it needs on the fly. It runs in approximately linear time and does not use complicated data structures such as Interval/Derived Sequence of Graphs (DSG) or UNION-FIND sets. To analyze the complexity of the algorithm, we introduce a new concept, the unstructuredness coefficient, to describe the unstructuredness of CFGs, and we find that the unstructuredness coefficients of these executables are usually small (