In this paper we present a novel approach to exploiting instruction-level parallelism (ILP) through resource-flow computing. In this model, instructions begin executing without regard to the program's data-flow and control-flow dependencies. The rest of the execution time is spent applying those data-flow and control-flow constraints so that the final result is programmatically correct. We present the design of a machine that uses time tags and Active Stations to realize a registerless data path. In this contribution we focus on the Execution Window elements of our machine, present Instructions Per Cycle (IPC) speedups for SPECint95 and SPECint2000 programs, and discuss the scalability of our design to hundreds of processing elements.
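The execute-first, constrain-later idea can be illustrated with a toy software model. The sketch below is not the machine described in the paper: the names `Station`, `run_resource_flow`, and `run_sequential` are hypothetical, control flow is omitted, and the snooping rule (each station accepts an operand from the nearest earlier producer, judged by time tag, and re-executes when that operand changes) is an assumption about how time tags might enforce program order.

```python
import operator
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# One instruction: (destination name, operation, source names). Hypothetical format.
Instr = Tuple[str, Callable[..., int], Tuple[str, ...]]


@dataclass
class Station:
    """Toy stand-in for an Active Station holding one instruction."""
    tag: int                      # time tag: position in program order
    dest: str
    op: Callable[..., int]
    srcs: Tuple[str, ...]
    # source name -> (value, time tag of the producer it came from)
    inputs: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    result: int = 0

    def execute(self) -> None:
        self.result = self.op(*(self.inputs[s][0] for s in self.srcs))


def run_resource_flow(program: List[Instr], init: Dict[str, int]) -> Dict[str, int]:
    stations = [Station(i, d, op, s) for i, (d, op, s) in enumerate(program)]

    # Phase 1: every station fires at once on the initial values,
    # ignoring data dependencies entirely (tag -1 marks "initial state").
    for st in stations:
        st.inputs = {s: (init[s], -1) for s in st.srcs}
        st.execute()

    # Phase 2: stations broadcast results; later stations snoop them and
    # re-execute whenever a nearer (or updated) earlier producer appears.
    changed = True
    while changed:
        changed = False
        broadcasts = [(st.dest, st.result, st.tag) for st in stations]
        for st in stations:
            updated = False
            for name, value, ptag in broadcasts:
                if name not in st.inputs or ptag >= st.tag:
                    continue  # not an operand, or producer is not earlier
                if ptag >= st.inputs[name][1] and st.inputs[name] != (value, ptag):
                    st.inputs[name] = (value, ptag)
                    updated = True
            if updated:
                st.execute()
                changed = True

    # Commit results in time-tag order to obtain the final state.
    state = dict(init)
    for st in stations:
        state[st.dest] = st.result
    return state


def run_sequential(program: List[Instr], init: Dict[str, int]) -> Dict[str, int]:
    """Reference: ordinary in-order execution."""
    state = dict(init)
    for dest, op, srcs in program:
        state[dest] = op(*(state[s] for s in srcs))
    return state


PROGRAM: List[Instr] = [
    ("r1", lambda a: a + 1, ("r0",)),
    ("r2", lambda a: a * 2, ("r1",)),
    ("r1", lambda a: a - 3, ("r2",)),
    ("r3", operator.add, ("r1", "r2")),
]
INIT = {"r0": 5, "r1": 0, "r2": 0, "r3": 0}

if __name__ == "__main__":
    print(run_resource_flow(PROGRAM, INIT))
    print(run_sequential(PROGRAM, INIT))
```

In this model the first pass produces mostly wrong values, and each later round of broadcasting corrects the stations whose operands were stale, so the final state agrees with in-order execution. The operand names such as `r1` are only labels for values passed between stations; the sketch makes no claim about how the registerless data path is actually built.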