Interconnection networks for large-scale parallel processing: theory and case studies
Interconnection networks for large-scale parallel processing: theory and case studies
Vector models for data-parallel computing
Vector models for data-parallel computing
A practical algorithm for exact array dependence analysis
Communications of the ACM
NuMesh: an architecture optimized for scheduled communication
The Journal of Supercomputing - Special issue on parallel and distributed processing
An affine partitioning algorithm to maximize parallelism and minimize communication
ICS '99 Proceedings of the 13th international conference on Supercomputing
Compaan: deriving process networks from Matlab for embedded signal processing architectures
CODES '00 Proceedings of the eighth international workshop on Hardware/software codesign
Generation of Efficient Nested Loops from Polyhedra
International Journal of Parallel Programming - Special issue on instruction-level parallelism and parallelizing compilation, part 2
Optimizing memory usage in the polyhedral model
ACM Transactions on Programming Languages and Systems (TOPLAS)
A unified framework for schedule and storage optimization
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Route packets, not wires: on-chip inteconnection networks
Proceedings of the 38th annual Design Automation Conference
Blocking and array contraction across arbitrarily nested loops using affine partitioning
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Scheduling and Automatic Parallelization
Scheduling and Automatic Parallelization
PICO-NPA: High-Level Synthesis of Nonprogrammable Hardware Accelerators
Journal of VLSI Signal Processing Systems
Optimizing Storage Size for Static Control Programs in Automatic Parallelizers
Euro-Par '97 Proceedings of the Third International Euro-Par Conference on Parallel Processing
On Synthesizing Systolic Arrays from Recurrence Equations with Linear Dependencies
Proceedings of the Sixth Conference on Foundations of Software Technology and Theoretical Computer Science
RaPiD - Reconfigurable Pipelined Datapath
FPL '96 Proceedings of the 6th International Workshop on Field-Programmable Logic, Smart Applications, New Paradigms and Compilers
Code generation for multiple mappings
FRONTIERS '95 Proceedings of the Fifth Symposium on the Frontiers of Massively Parallel Computation (Frontiers'95)
A Methodology for Designing Efficient On-Chip Interconnects on Well-Behaved Communication Patterns
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Multirate VLSI and their Synthesis
Multirate VLSI and their Synthesis
Interconnect Synthesis for Systems on Chip
IWSOC '04 Proceedings of the System-on-Chip for Real-Time Applications, 4th IEEE International Workshop
Code Generation in the Polyhedral Model Is Easier Than You Think
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
An architecture and compiler for scalable on-chip communication
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
On-Chip Interconnects and Instruction Steering Schemes for Clustered Microarchitectures
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
Lattice-Based Memory Allocation
IEEE Transactions on Computers
Hi-index | 0.00 |
Affine Control Loops (ACLs) occur frequently in data- and computeintensive applications. Implementing ACLs directly on dedicated hardware has the potential for spectacular performance improvement in area, time and energy. An important challenge for such direct hardware compilation of ACLs is the interconnection between the different processing elements, which may be non-local as well as dynamic. We propose a generic, reconfigurable interconnection fabric which can realize the data-path of any ACL and be dynamically reconfigured in constant time. We have applied for a patent for this technology.