Monoids for rapid data flow analysis

  • Authors: Barry K. Rosen
  • Affiliations: IBM Thomas J. Watson Research Center, Yorktown Heights, New York
  • Venue: POPL '78 Proceedings of the 5th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages
  • Year: 1978


Abstract

The earliest data flow analysis research dealt with concrete problems (such as detection of available expressions) and with low level representations of control flow (with one large graph, each of whose nodes represents a basic block). Several recent papers have introduced an abstract approach, dealing with any problem expressible in terms of a semilattice L and a monoid M of isotone maps from L to L, under various algebraic constraints. Examples include [CC77; GW76; KU76; Ki73; Ta75; Ta76; We75]. Several other recent papers have introduced a high level representation with many small graphs, each of which represents a small portion of the control flow information in a program. The hierarchy of small graphs is explicit in [Ro77a; Ro77b] and implicit in papers that deal with syntax directed analysis of programs written within the confines of classical structured programming [DDH72, Sec. 1.7]. Examples include [TK76; ZB74]. The abstract papers have retained the low level representations, while the high level papers have retained the concrete problems of the earliest work. This paper studies abstract conditions on L and M that lead to rapid data flow analysis, with emphasis on high level representations. Unlike some analysis methods oriented toward structured programming [TK76; Wu75; ZB74], our method retains the ability to cope with arbitrary escape and jump statements while it exploits the control flow information implicit in the parse tree.

The general algebraic framework for data flow analysis with semilattices is presented in Section 2, along with some preliminary lemmas. Our "rapid" monoids properly include the "fast" monoids of [GW76]. Section 3 relates data flow problems to the hierarchies of small graphs introduced in [Ro77a; Ro77b]. High level analysis begins with local information expressed by mapping the arcs of a large graph into the monoid M, much as in low level analysis. But each arc in our small graphs represents a set (often an infinite set) of paths in the underlying large graph. Appropriate members of M are associated with these arcs. This "globalized" local information is used to solve global flow problems in Section 4. The fundamental theorem of Section 4 is applied to programs with the control structures of classical structured programming in Section 5. For a given rapid monoid M, the time required to solve any global data flow problem is linear in the number of statements in the program. (For varying M, the time is linear in the product of this number and t, where t is a parameter of M introduced in the definition of rapidity.) For reasons sketched at the beginning of Section 6, we feel obliged to cope with source level escape and jump statements as well as with classical structured programming. Section 6 shows how to apply the fundamental theorem of Section 4 to programs with arbitrary escapes and jumps. The explicit time bound is only derived for programs without jumps. A comparison between the results obtained by our method and those obtained by [GW76] is in Section 7, which, in the full paper, also contains examples of rapid monoids. Finally, Section 8 lists conclusions and open problems. Proofs of lemmas are omitted to save space. The full paper will be submitted to a journal.

We proceed from the general to the particular, except in some places where bending the rule a little makes a significant improvement in the expository flow. Common mathematical notation is used. To avoid excessive parentheses, the value of a function f at an argument x is fx rather than f(x). If fx is itself a function then (fx)y is the result of applying fx to y. The usual ≤ and ≥ symbols are used for arbitrary partial orders as well as for the usual order among integers. A function f from a partially ordered set (poset) to a poset is isotone iff x ≤ y implies fx ≤ fy. (Isotone maps are sometimes called "monotonic" in the literature.) A meet semilattice is a poset with a binary operation ∧ such that x ∧ y is the greatest lower bound of the set {x, y}. A meet semilattice wherein every subset has a greatest lower bound is complete. In particular, the empty subset has a greatest lower bound ⊤, so a complete meet semilattice has a maximum element. A monoid is a set together with an associative binary operation ∘ that has a unit element 1: 1 ∘ m = m ∘ 1 = m for all m. In all our examples the monoid M will be a monoid of functions: every member of M is a function (from a set into itself), the operation ∘ is the usual composition (f ∘ g)x = f(gx), and the unit 1 is the identity function with 1x = x for all x.

Two considerations governed the notational choices. First, we speak in ways that are common in mathematics and are convenient here. Second, we try to facilitate comparisons with [GW76; KU76; Ro77b], to the extent that the disparities among these works permit. One disparity is between the meet semilattices of [GW76; KU76; Ki73] and the join semilattices of [Ro77b; Ta75; We75], where least upper bounds are considered instead of greatest lower bounds. To speak of meets is more natural in applications that are intuitively stated in terms of "what must happen on all paths" in some class of paths in a program, while to speak of joins is more natural in applications that are intuitively stated in terms of "what can happen on some paths." By checking whether there are any paths in the relevant class and by using the rule that ∃ is equivalent to ¬∀¬, join oriented applications can be reduced to meet oriented ones (and vice versa). A general theory should speak in one way or the other, and we have chosen meets. For us, strong assertions about a program's data flow are high in the semilattice.
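
To make the algebraic setup concrete, the following is a minimal sketch in Haskell, assuming the classic available-expressions instance; it illustrates the framework described above and is not code from the paper. L is the meet semilattice of sets of expressions with meet given by intersection, and M is a monoid of isotone gen/kill transfer functions under composition; all identifiers (mkM, apply, meetM, and so on) are illustrative assumptions.

    -- Sketch of the framework, not code from the paper. L is the meet
    -- semilattice of available-expression sets (meet = intersection, so a
    -- fact survives only if it holds on all paths); M is a monoid of
    -- isotone gen/kill transfer functions under composition.
    import           Data.Set (Set)
    import qualified Data.Set as Set

    type L = Set String              -- data flow facts: available expressions

    meetL :: L -> L -> L             -- greatest lower bound in L
    meetL = Set.intersection

    -- A member of M, applied as  apply (M g k) x = g `union` (x \\ k).
    -- The smart constructor keeps gen and kill disjoint so that the
    -- pointwise meet below stays in gen/kill form.
    data M = M { gen :: L, kill :: L } deriving Show

    mkM :: L -> L -> M
    mkM g k = M g (k `Set.difference` g)

    apply :: M -> L -> L
    apply (M g k) x = g `Set.union` (x `Set.difference` k)

    unitM :: M                       -- the identity map, unit of M
    unitM = mkM Set.empty Set.empty

    -- Composition in M, in closed form:
    -- apply (f `o` g) x == apply f (apply g x).
    o :: M -> M -> M
    o f g = mkM (gen f `Set.union` (gen g `Set.difference` kill f))
                (kill f `Set.union` kill g)

    -- Pointwise meet of path functions built with mkM:
    -- apply (f `meetM` g) x == apply f x `meetL` apply g x.
    meetM :: M -> M -> M
    meetM f g = mkM (gen f `Set.intersection` gen g)
                    (kill f `Set.union` kill g)

    -- A diamond with two paths from entry to exit: the "globalized" local
    -- information for one small-graph arc is the meet of the composed
    -- path functions.
    main :: IO ()
    main = do
      let left   = mkM (Set.fromList ["a+b"]) Set.empty   -- computes a+b
          right  = mkM Set.empty (Set.fromList ["a+b"])   -- kills a+b
          after  = mkM (Set.fromList ["c*d"]) Set.empty   -- computes c*d
          pathFn = (after `o` left) `meetM` (after `o` right)
      print (apply pathFn (Set.fromList ["x-y"]))         -- fromList ["c*d","x-y"]

Under these assumptions the transfer functions are distributive, so the meet over even an infinite set of paths through a loop collapses after finitely many compositions and meets; conditions of roughly this kind are what the paper's notions of "fast" and "rapid" monoids make precise in the abstract setting.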