Fast barrier synchronization hardware
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Distributed Hardwired Barrier Synchronization for Scalable Multiprocessor Clusters
IEEE Transactions on Parallel and Distributed Systems
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Exploiting fine-grain thread level parallelism on the MIT multi-ALU processor
Proceedings of the 25th annual international symposium on Computer architecture
Fast compilation for pipelined reconfigurable fabrics
FPGA '99 Proceedings of the 1999 ACM/SIGDA seventh international symposium on Field programmable gate arrays
PipeRench: a co/processor for streaming multimedia acceleration
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
A C compiler for a processor with a reconfigurable functional unit
FPGA '00 Proceedings of the 2000 ACM/SIGDA eighth international symposium on Field programmable gate arrays
The effect of reconfigurable units in superscalar processors
FPGA '01 Proceedings of the 2001 ACM/SIGDA ninth international symposium on Field programmable gate arrays
A stream compiler for communication-exposed architectures
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
The Garp Architecture and C Compiler
Computer
StreamIt: A Language for Streaming Applications
CC '02 Proceedings of the 11th International Conference on Compiler Construction
Stream Computations Organized for Reconfigurable Execution (SCORE)
FPL '00 Proceedings of the The Roadmap to Reconfigurable Computing, 10th International Workshop on Field-Programmable Logic and Applications
Evaluation of a Multithreaded Architecture for Cellular Computing
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Picking Statistically Valid and Early Simulation Points
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Managing a Reconfigurable Processor in a General Purpose Workstation Environment
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Decoupled Software Pipelining with the Synchronization Array
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Automatic Thread Extraction with Decoupled Software Pipelining
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Exploiting Fine-Grained Data Parallelism with Chip Multiprocessors and Fast Barriers
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Support for High-Frequency Streaming in CMPs
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Speculative Decoupled Software Pipelining
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
A Reconfigurable Hardware Interface for a Modern Computing System
FCCM '07 Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Parallel-stage decoupled software pipelining
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Combining multicore and reconfigurable instruction set extensions
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays
MiDataSets: creating the conditions for a more realistic evaluation of Iterative optimization
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Dynamically managed multithreaded reconfigurable architectures for chip multiprocessors
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Shared reconfigurable fabric for multi-core customization
Proceedings of the 48th Design Automation Conference
Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Navigating big data with high-throughput, energy-efficient data partitioning
Proceedings of the 40th Annual International Symposium on Computer Architecture
Online scheduling for multi-core shared reconfigurable fabric
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
This paper presents ReMAP, a reconfigurable architecture geared towards accelerating and parallelizing applications within a heterogeneous CMP. In ReMAP, threads share a common reconfigurable fabric that can be configured for individual thread computation or fine-grained communication with integrated computation. The architecture supports both fine-grained point-to-point communication for pipeline parallelization and fine-grained barrier synchronization. The combination of communication and configurable computation within ReMAP provides the unique ability to perform customized computation while data is transferred between cores, and to execute custom global functions after barrier synchronization. ReMAP demonstrates significantly higher performance and energy efficiency compared to hard-wired communication-only mechanisms, and over what can ideally be achieved by allocating the fabric area to additional or more powerful cores.