Diastolic arrays are arrays of processing elements that communicate exclusively through first-in, first-out (FIFO) queues. FIFO virtualization units relax the timing of data transfers and include hardware support that guarantees bandwidth and buffer space for every transfer, including transfers that follow composite paths through the network. We show that the diastolic array architecture enables efficient synthesis from high-level specifications of communicating finite state machines, with the objective of maximizing average throughput. Preliminary results are presented on an H.264 decoding benchmark.
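As a rough illustration of FIFO-only communication with guaranteed buffer space, the sketch below simulates one producer and one consumer processing element joined by a bounded queue. It is a toy model under stated assumptions, not the paper's architecture or synthesis flow: the names `Fifo` and `simulate`, the cycle-based timing, and the fixed production and consumption periods are all hypothetical.

```python
from collections import deque


class Fifo:
    """Bounded FIFO; `capacity` stands in for the guaranteed buffer space (assumed model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def can_push(self):
        return len(self.items) < self.capacity

    def can_pop(self):
        return len(self.items) > 0


def simulate(cycles, capacity, produce_period, consume_period):
    """Run a two-element pipeline for `cycles` cycles.

    The producer attempts a push every `produce_period` cycles and stalls
    (without losing data) when the FIFO is full. The consumer attempts a pop
    every `consume_period` cycles and stalls when the FIFO is empty.
    Returns (tokens_produced, tokens_consumed).
    """
    fifo = Fifo(capacity)
    produced = 0
    consumed = 0
    for t in range(cycles):
        # The consumer acts first, so a token pushed in cycle t
        # becomes visible to the consumer in cycle t + 1 at the earliest.
        if t % consume_period == 0 and fifo.can_pop():
            fifo.items.popleft()
            consumed += 1
        if t % produce_period == 0 and fifo.can_push():
            fifo.items.append(produced)
            produced += 1
    return produced, consumed


if __name__ == "__main__":
    produced, consumed = simulate(cycles=100, capacity=4,
                                  produce_period=1, consume_period=2)
    print("produced:", produced, "consumed:", consumed)
    print("average throughput (tokens/cycle):", consumed / 100)
```

In this toy model the slower element bounds the average throughput, and the FIFO depth determines how long the faster element can run ahead before stalling. That is the sense in which the queue relaxes transfer timing.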