Partitioned register files for VLIWs: a preliminary analysis of tradeoffs
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
A dynamic multithreading processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Hierarchical Clustered Register File Organization for VLIW Processors
IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
High-Performance and Low-Cost Dual-Thread VLIW Processor Using Weld Architecture Paradigm
IEEE Transactions on Parallel and Distributed Systems
Validity of the single processor approach to achieving large scale computing capabilities
AFIPS '67 (Spring) Proceedings of the April 18-20, 1967, spring joint computer conference
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Reconfigurable Multithreading Architectures: A Survey
SAMOS '09 Proceedings of the 9th International Workshop on Embedded Computer Systems: Architectures, Modeling, and Simulation
Architectural support for multithreading on reconfigurable hardware
ARC'11 Proceedings of the 7th international conference on Reconfigurable computing: architectures, tools and applications
Exploiting both pipelining and data parallelism with SIMD reconfigurable architecture
ARC'12 Proceedings of the 8th international conference on Reconfigurable Computing: architectures, tools and applications
Multithreading on reconfigurable hardware: An architectural approach
Microprocessors & Microsystems
Hi-index | 0.00 |
The coarse-grained reconfigurable architecture ADRES (Architecture for Dynamically Reconfigurable Embedded Systems) and its compiler offer high instruction-level parallelism (ILP) to applications by means of a sparsely interconnected array of functional units and register files. As high-ILP architectures achieve only low parallelism when executing partially sequential code segments, which is also known as Amdahl's law, this paper proposes to extend ADRES to MT-ADRES (Multi-Threaded ADRES) to also exploit thread-level parallelism. On MT-ADRES architectures, the array can be partitioned in multiple smaller arrays that can execute threads in parallel. Because the partition can be changed dynamically, this extension provides more flexibility than a multi-core approach. This article presents details of the enhanced architecture and results obtained from an MPEG-2 decoder implementation that exploits a mix of thread-level parallelism and instruction-level parallelism.