Horizontally microprogrammable CPUs belong to a class of machines with statically schedulable parallel instruction execution (SPIE machines). Several experiments have shown that, within basic blocks, real code offers a potential speedup factor of only 2 or 3 when compacted for SPIE machines, even with unlimited hardware. This paper describes similar experiments that instead measure the potential parallelism available to any global compaction method, that is, one that compacts code beyond basic-block boundaries. Global compaction is a subject of current investigation; no measurements yet exist for implemented systems. The approach is first to assume that an oracle is available during compaction. The oracle resolves all dynamic considerations in advance, making it possible to find the maximum parallelism available without reformulating the algorithm. The parallelism found is constrained only by legitimate data dependencies, since the oracle answers all questions about conditional jump directions and unresolved indirect memory references. Using such an oracle, we find that typical scientific programs may be sped up by factors ranging from 3 to 1000. These dramatic results provide an upper bound on what global compaction techniques can achieve. We also describe experiments in progress that progressively limit the oracle, with the aim of eventually producing one that supplies only information obtainable by a very good compiler; this will yield a more practical measure of the parallelism attainable through global compaction.
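To make the oracle measurement concrete, here is a minimal sketch, not taken from the paper, of how an upper bound of this kind can be computed from a dynamic instruction trace. Each instruction is placed at the earliest cycle permitted by flow (read-after-write) dependences alone; branch directions come from the trace itself, standing in for the oracle. The Insn record, the "M[...]" naming convention for memory locations, and the disambiguate_memory switch (which models one way of progressively weakening the oracle) are all illustrative assumptions.

    from dataclasses import dataclass

    @dataclass
    class Insn:
        reads: tuple   # locations (registers or resolved addresses) this instruction reads
        writes: tuple  # locations it writes

    def oracle_speedup(trace, disambiguate_memory=True):
        # Schedule each trace entry at the earliest cycle allowed by flow
        # (read-after-write) dependences; anti- and output dependences are
        # assumed removable by renaming. Branch outcomes are implicit in the
        # trace, playing the oracle's role. With disambiguate_memory=False,
        # all memory names collapse to "MEM", so every load is ordered after
        # the most recent store: a crude model of an oracle that no longer
        # resolves indirect references.
        def canon(loc):
            return "MEM" if not disambiguate_memory and loc.startswith("M[") else loc

        avail = {}   # location -> cycle at which its latest value is ready
        depth = 0    # height of the resulting parallel schedule
        for insn in trace:
            cycle = 1 + max((avail.get(canon(l), 0) for l in insn.reads), default=0)
            for l in insn.writes:
                avail[canon(l)] = cycle
            depth = max(depth, cycle)
        return len(trace) / depth if depth else 1.0

    # Toy trace: a store, then a load/use chain that is independent of it.
    t = [Insn(reads=("r9",), writes=("M[0]",)),
         Insn(reads=("M[4]",), writes=("r0",)),
         Insn(reads=("r0",), writes=("r1",))]
    print(oracle_speedup(t))         # 1.5 -- the load runs alongside the store
    print(oracle_speedup(t, False))  # 1.0 -- unresolved memory serializes the chain

Scheduling against flow dependences only gives the dataflow limit that an unrestricted oracle would measure; tightening the rules step by step, for example by honoring anti-dependences or leaving memory references unresolved as above, is one plausible way to emulate the progressively limited oracles the experiments in progress describe.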