It has been shown that control flow decisions often reduce the amount of fine-grain, or instruction-level, parallelism in many computations. There exist various architectural solutions to this problem, such as branch prediction and speculative execution. Here we present an alternative solution, based on the use of partial evaluation for statically increasing the parallelism of a computation. In particular, we explore the capability of a partial evaluator to remove control flow decisions.

A partial evaluator is described, which specializes data flow graphs produced by a compiler for a fine-grained processor array. The data flow graphs can be viewed as the equivalent of a machine-code program for a conventional processor, and consequently some aspects of the architecture can be taken into account by the partial evaluator.

Results from the use of the partial evaluator show that it can significantly improve the performance of a computation. The performance improvement is partly due to improved parallelism, but mostly due to a reduced dynamic instruction count, i.e. a reduction in the number of operations executed.
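As a rough illustration of how specialization can remove a control flow decision, the sketch below partially evaluates a tiny expression graph in Python. It is not the partial evaluator described here: the tuple-based node encoding and the names `specialize`, `count_ops`, and `OPS` are assumptions made only for this example, and counting operation nodes is just a crude static stand-in for the dynamic instruction count.

```python
import operator

# Hypothetical node encoding (an assumption for this sketch):
#   ("const", value), ("var", name),
#   ("add" | "mul" | "lt", left, right), ("if", cond, then_branch, else_branch)
OPS = {"add": operator.add, "mul": operator.mul, "lt": operator.lt}


def specialize(node, static_env):
    """Return a residual graph in which everything computable from
    static_env has been evaluated and decided conditionals are removed."""
    kind = node[0]
    if kind == "const":
        return node
    if kind == "var":
        name = node[1]
        # Statically known inputs become constants; others stay dynamic.
        return ("const", static_env[name]) if name in static_env else node
    if kind == "if":
        cond = specialize(node[1], static_env)
        if cond[0] == "const":
            # Predicate is known statically: keep only the taken branch.
            taken = node[2] if cond[1] else node[3]
            return specialize(taken, static_env)
        return ("if", cond,
                specialize(node[2], static_env),
                specialize(node[3], static_env))
    # Binary operator: fold it when both operands are constants.
    left = specialize(node[1], static_env)
    right = specialize(node[2], static_env)
    if left[0] == "const" and right[0] == "const":
        return ("const", OPS[kind](left[1], right[1]))
    return (kind, left, right)


def count_ops(node):
    """Count operation nodes (everything except constants and variables)."""
    if node[0] in ("const", "var"):
        return 0
    return 1 + sum(count_ops(child) for child in node[1:])


# if n < 4 then x * n else x + n, with n static and x dynamic
program = ("if", ("lt", ("var", "n"), ("const", 4)),
           ("mul", ("var", "x"), ("var", "n")),
           ("add", ("var", "x"), ("var", "n")))

residual = specialize(program, {"n": 3})
print(residual)
print(count_ops(program), count_ops(residual))
```

With `n` known to be 3, the comparison and the conditional disappear, and only the multiplication by a constant remains in the residual graph. With no static inputs, the graph is returned unchanged.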