Communications of the ACM
Journal of Parallel and Distributed Computing - Special issue on shared-memory multiprocessors
Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
A high-performance microarchitecture with hardware-programmable functional units
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A bandwidth-efficient architecture for media processing
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
HSRA: high-speed, hierarchical synchronous reconfigurable array
FPGA '99 Proceedings of the 1999 ACM/SIGDA seventh international symposium on Field programmable gate arrays
PipeRench: a co/processor for streaming multimedia acceleration
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Smart Memories: a modular reconfigurable architecture
Proceedings of the 27th annual international symposium on Computer architecture
CHIMAERA: a high-performance architecture with a tightly-coupled reconfigurable functional unit
Proceedings of the 27th annual international symposium on Computer architecture
Co-Synthesis to a Hybrid RISC/FPGA Architecture
Journal of VLSI Signal Processing Systems - Special issue on VLSI on custom computing technology
Adapting software pipelining for reconfigurable computing
CASES '00 Proceedings of the 2000 international conference on Compilers, architecture, and synthesis for embedded systems
Path analysis and renaming for predicted instruction scheduling
International Journal of Parallel Programming - Special issue on parallel architectures and compilation techniques
Dynamic power consumption in Virtex™-II FPGA family
FPGA '02 Proceedings of the 2002 ACM/SIGDA tenth international symposium on Field-programmable gate arrays
An FPGA for Implementing Asynchronous Circuits
IEEE Design & Test
Compiling Application-Specific Hardware
FPL '02 Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications
FPL '95 Proceedings of the 5th International Workshop on Field-Programmable Logic and Applications
Subthreshold leakage modeling and reduction techniques
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Peer-to-Peer Hardware-Software Interfaces for Reconfigurable Fabrics
FCCM '02 Proceedings of the 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture
Proceedings of the 30th annual international symposium on Computer architecture
High-Level Power Modeling of CPLDs and FPGAs
ICCD '01 Proceedings of the International Conference on Computer Design: VLSI in Computers & Processors
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Evaluation of the Raw Microprocessor: An Exposed-Wire-Delay Architecture for ILP and Streams
Proceedings of the 31st annual international symposium on Computer architecture
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
An Asynchronous Dataflow FPGA Architecture
IEEE Transactions on Computers
Robust interfaces for mixed-timing systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A Scheduling Discipline for Latency and Bandwidth Guarantees in Asynchronous Network-on-Chip
ASYNC '05 Proceedings of the 11th IEEE International Symposium on Asynchronous Circuits and Systems
FPGA Architecture for Multi-Style Asynchronous Logic
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
SOMA: a tool for synthesizing and optimizing memory accesses in ASICs
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Measuring the gap between FPGAs and ASICs
Proceedings of the 2006 ACM/SIGDA 14th international symposium on Field programmable gate arrays
The impact of the nanoscale on computing systems
ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
Dataflow: A Complement to Superscalar
ISPASS '05 Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005
Accelerating Speculative Execution in High-Level Synthesis with Cancel Tokens
ARC '08 Proceedings of the 4th international workshop on Reconfigurable Computing: Architectures, Tools and Applications
Evolution in architectures and programming methodologies of coarse-grained reconfigurable computing
Microprocessors & Microsystems
Performance and power of cache-based reconfigurable computing
Proceedings of the 36th annual international symposium on Computer architecture
Conservation cores: reducing the energy of mature computations
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs?
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
A general constraint-centric scheduling framework for spatial architectures
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
Breaking SIMD shackles with an exposed flexible microarchitecture and the access execute PDG
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Hi-index | 0.00 |
Spatial Computing (SC) has been shown to be an energy-efficient model for implementing program kernels. In this paper we explore the feasibility of using SC for more than small kernels. To this end, we evaluate the performance and energy efficiency of entire applications on Tartan, a general-purpose architecture which integrates a reconfigurable fabric (RF) with a superscalar core. Our compiler automatically partitions and compiles an application into an instruction stream for the core and a configuration for the RF. We use a detailed simulator to capture both timing and energy numbers for all parts of the system.Our results indicate that a hierarchical RF architecture, designed around a scalable interconnect, is instrumental in harnessing the benefits of spatial computation. The interconnect uses static configuration and routing at the lower levels and a packet-switched, dynamically-routed network at the top level. Tartan is most energyefficient when almost all of the application is mapped to the RF, indicating the need for the RF to support most general-purpose programming constructs. Our initial investigation reveals that such a system can provide, on average, an order of magnitude improvement in energy-delay compared to an aggressive superscalar core on single-threaded workloads.