A domain specific interconnect for reconfigurable computing

Authors:
Sanjay Rajopadhye; Gautam Gupta; Lakshminarayanan Renganarayana
Affiliations:
Colorado State University, Fort Collins, CO, USA; Colorado State University, Fort Collins, CO, USA; IBM Research, Yorktown Heights, NY, USA
Venue:
Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
Year:
2008

Abstract

Affine Control Loops (ACLs) occur frequently in data- and compute-intensive applications. Implementing ACLs directly on dedicated hardware has the potential for spectacular performance improvement in area, time and energy. An important challenge for such direct hardware compilation of ACLs is the interconnection between the different processing elements, which may be non-local as well as dynamic. We propose a generic, reconfigurable interconnection fabric which can realize the data-path of any ACL and be dynamically reconfigured in constant time. We have applied for a patent for this technology.
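
As a rough illustration of what an ACL looks like, the sketch below writes forward substitution as a loop nest whose bounds and array subscripts are affine functions of the loop indices and the size parameter n. This example is not taken from the paper; the function names `forward_substitution` and `consumers` are hypothetical and chosen only for illustration. The helper `consumers` lists which outer iterations read a given x[j], which hints at why the communication can be non-local if each outer iteration is mapped to its own processing element: a value produced at iteration j is needed by every later iteration, so the distance it travels grows with n.

```python
def forward_substitution(L, b):
    """Solve L x = b for a lower-triangular L, written as an affine loop nest."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):              # 0 <= i < n
        acc = b[i]
        for j in range(i):          # 0 <= j < i: the bound is affine in i
            acc -= L[i][j] * x[j]   # subscripts (i, j) and (j) are affine
        x[i] = acc / L[i][i]
    return x


def consumers(j, n):
    """Outer iterations i that read x[j]: every i with j < i < n."""
    return list(range(j + 1, n))


# Illustrative inputs only (assumed, not from the paper).
x = forward_substitution([[2.0, 0.0], [1.0, 1.0]], [4.0, 5.0])
print(x)
print(consumers(0, 4))
```

Assuming one processing element per outer iteration, the list returned by `consumers` is the set of destinations an interconnect would have to reach from the element that produces x[j].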