ACM Computing Surveys (CSUR)
Resource requirements of dataflow programs
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Communications of the ACM
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
The RC compiler for the DTN dataflow computer
Journal of Parallel and Distributed Computing - Special issue: data-flow processing
Dependence flow graphs: an algebraic approach to program dependencies
POPL '91 Proceedings of the 18th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
The C programming language
Programming in VLSI: from communicating processes to delay-insensitive circuits
Developments in concurrency and communication
Journal of Parallel and Distributed Computing - Special issue on shared-memory multiprocessors
Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)
Limits of control flow on parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Efficient accommodation of may-alias information in SSA form
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Handshake circuits: an asynchronous architecture for VLSI programming
Handshake circuits: an asynchronous architecture for VLSI programming
A high-performance microarchitecture with hardware-programmable functional units
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
SUIF: an infrastructure for research on parallelizing and optimizing compilers
ACM SIGPLAN Notices
Sparse functional stores for imperative programs
IR '95 Papers from the 1995 ACM SIGPLAN workshop on Intermediate representations
ACM Computing Surveys (CSUR)
How much non-strictness do lenient programs require?
FPCA '95 Proceedings of the seventh international conference on Functional programming languages and computer architecture
A comparison of full and partial predicated execution support for ILP processors
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Register promotion in C programs
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
An efficient implementation of reactivity for modeling hardware in the scenic design environment
DAC '97 Proceedings of the 34th annual Design Automation Conference
A framework for balancing control flow and predication
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
A programming environment for the design of complex high speed ASICs
DAC '98 Proceedings of the 35th annual Design Automation Conference
Register promotion by sparse partial redundancy elimination of loads and stores
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
ACM SIGPLAN Notices
A bandwidth-efficient architecture for media processing
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Space-time scheduling of instruction-level parallelism on a raw machine
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
PipeRench: a co/processor for streaming multimedia acceleration
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Hardware synthesis from C/C++ models
DATE '99 Proceedings of the conference on Design, automation and test in Europe
DATE '99 Proceedings of the conference on Design, automation and test in Europe
DATE '99 Proceedings of the conference on Design, automation and test in Europe
ECL: a specification environment for system-level design
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
The design of a low energy FPGA
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Hardware-software co-design of embedded reconfigurable architectures
Proceedings of the 37th Annual Design Automation Conference
The role of custom design in ASIC Chips
Proceedings of the 37th Annual Design Automation Conference
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Smart Memories: a modular reconfigurable architecture
Proceedings of the 27th annual international symposium on Computer architecture
CHIMAERA: a high-performance architecture with a tightly-coupled reconfigurable functional unit
Proceedings of the 27th annual international symposium on Computer architecture
Clock rate versus IPC: the end of the road for conventional microarchitectures
Proceedings of the 27th annual international symposium on Computer architecture
An open graph visualization system and its applications to software engineering
Software—Practice & Experience - Special issue on discrete algorithm engineering
Attacking the semantic gap between application programming languages and configurable hardware
FPGA '01 Proceedings of the 2001 ACM/SIGDA ninth international symposium on Field programmable gate arrays
An automated process for compiling dataflow graphs into reconfigurable hardware
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special issue on low power electronics and design
Speculation techniques for high level synthesis of control intensive designs
Proceedings of the 38th annual Design Automation Conference
NanoFabrics: spatial computing using molecular electronics
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Synthesis of hardware models in C with pointers and complex data structures
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - System Level Design
Resynthesis and peephole transformations for the optimization of large-scale asynchronous systems
Proceedings of the 39th annual Design Automation Conference
Coordinated transformations for high-level synthesis of high performance microprocessor blocks
Proceedings of the 39th annual Design Automation Conference
Slack: maximizing performance under technological constraints
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
PICO-NPA: High-Level Synthesis of Nonprogrammable Hardware Accelerators
Journal of VLSI Signal Processing Systems
Synthesis of operation-centric hardware descriptions
Proceedings of the 2000 IEEE/ACM international conference on Computer-aided design
Path Analysis and Renaming for Predicated Instruction Scheduling
International Journal of Parallel Programming
First version of a data flow procedure language
Programming Symposium, Proceedings Colloque sur la Programmation
Extended SSA Numbering: Introducing SSA Properties to Language with Multi-level Pointers
CC '98 Proceedings of the 7th International Conference on Compiler Construction
Instruction-Level Parallelism for Reconfigurable Computing
FPL '98 Proceedings of the 8th International Workshop on Field-Programmable Logic and Applications, From FPGAs to Computing Paradigm
XPP-VC: A C Compiler with Temporal Partitioning for the PACT-XPP Architecture
FPL '02 Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications
Compiling Application-Specific Hardware
FPL '02 Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications
Automatic Synthesis of Parallel Programs Targeted to Dynamically Reconfigurable Logic Arrays
FPL '95 Proceedings of the 5th International Workshop on Field-Programmable Logic and Applications
Very Large Scale Spatial Computing
UMC '02 Proceedings of the Third International Conference on Unconventional Models of Computation
Effective Representation of Aliases and Indirect Memory Operations in SSA Form
CC '96 Proceedings of the 6th International Conference on Compiler Construction
An efficient static analysis algorithm to detect redundant memory operations
Proceedings of the 2002 workshop on Memory system performance
Implications of technology scaling on leakage reduction techniques
Proceedings of the 40th annual Design Automation Conference
Optimizing memory accesses for spatial computation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
The Lutonium: A Sub-Nanojoule Asynchronous 8051 Microcontroller
ASYNC '03 Proceedings of the 9th International Symposium on Asynchronous Circuits and Systems
Mapping applications to the RaPiD configurable architecture
FCCM '97 Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines
Implementing C Algorithms in Reconfigurable Hardware Using C2Verilog
FCCM '98 Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines
Parallelizing Applications into Silicon
FCCM '99 Proceedings of the Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines
A C to HDL Compiler for Pipeline Processing on FPGAs
FCCM '00 Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines
Stream-Oriented FPGA Computing in the Streams-C High Level Language
FCCM '00 Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines
Peer-to-Peer Hardware-Software Interfaces for Reconfigurable Fabrics
FCCM '02 Proceedings of the 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
A critique of multiprocessing von Neumann style
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
A critique of multiprocessing von Neumann style
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Predicated Static Single Assignment
PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
Implementation and Evaluation of the Compiler for WASMII, a Virtual Hardware System
ICPP '99 Proceedings of the 1999 International Workshops on Parallel Processing
A dynamic instruction set computer
FCCM '95 Proceedings of the IEEE Symposium on FPGA's for Custom Computing Machines
ACM SIGPLAN Notices
Spatial computation
Bridging the gap between compilation and synthesis in the DEFACTO system
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
C-based SoC design flow and EDA tools: an ASIC and system vendor perspective
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Low-power circuits using dynamic threshold devices
GLSVLSI '05 Proceedings of the 15th ACM Great Lakes symposium on VLSI
Dynamic loop pipelining in data-driven architectures
Proceedings of the 2nd conference on Computing frontiers
Bio Molecular Engine: a bio-inspired environment for models of growing and evolvable computation
GECCO '05 Proceedings of the 7th annual workshop on Genetic and evolutionary computation
Compiling for EDGE Architectures
Proceedings of the International Symposium on Code Generation and Optimization
The impact of the nanoscale on computing systems
ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
Reducing control overhead in dataflow architectures
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Tartan: evaluating spatial computation for whole program execution
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Distributed Microarchitectural Protocols in the TRIPS Prototype Processor
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
ACM Transactions on Computer Systems (TOCS)
Global critical path: a tool for system-level timing analysis
Proceedings of the 44th annual Design Automation Conference
Self-resetting latches for asynchronous micro-pipelines
Proceedings of the 44th annual Design Automation Conference
Operation chaining asynchronous pipelined circuits
Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design
Slack analysis in the system design loop
CODES+ISSS '08 Proceedings of the 6th IEEE/ACM/IFIP international conference on Hardware/Software codesign and system synthesis
Modern development methods and tools for embedded reconfigurable systems: A survey
Integration, the VLSI Journal
Impact of high-level transformations within the ROCCC framework
ACM Transactions on Architecture and Code Optimization (TACO)
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Behavioral synthesis of asynchronous circuits using syntax directed translation as backend
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Microprocessors & Microsystems
Bundled execution of recurring traces for energy-efficient general purpose processing
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Exploring many-core design templates for FPGAs and ASICs
International Journal of Reconfigurable Computing - Special issue on Selected Papers from the International Conference on Reconfigurable Computing and FPGAs (ReConFig'10)
Exposing ILP in custom hardware with a dataflow compiler IR
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Hi-index | 0.00 |
This paper describes a computer architecture, Spatial Computation (SC), which is based on the translation of high-level language programs directly into hardware structures. SC program implementations are completely distributed, with no centralized control. SC circuits are optimized for wires at the expense of computation units.In this paper we investigate a particular implementation of SC: ASH (Application-Specific Hardware). Under the assumption that computation is cheaper than communication, ASH replicates computation units to simplify interconnect, building a system which uses very simple, completely dedicated communication channels. As a consequence, communication on the datapath never requires arbitration; the only arbitration required is for accessing memory. ASH relies on very simple hardware primitives, using no associative structures, no multiported register files, no scheduling logic, no broadcast, and no clocks. As a consequence, ASH hardware is fast and extremely power efficient.In this work we demonstrate three features of ASH: (1) that such architectures can be built by automatic compilation of C programs; (2) that distributed computation is in some respects fundamentally different from monolithic superscalar processors; and (3) that ASIC implementations of ASH use three orders of magnitude less energy compared to high-end superscalar processors, while being on average only 33% slower in performance (3.5x worst-case).