Micro-optimization of floating-point operations
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Building and Using a Highly Parallel Programmable Logic Array
Computer - Special issue on experimental research in computer architecture
Data-parallel C on a reconfigurable logic array
The Journal of Supercomputing - Special issue on field programmable gate arrays
Programmable active memories: reconfigurable systems come of age
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Shasta: a low overhead, software-only approach for supporting fine-grain shared memory
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
An integrated compile-time/run-time software distributed shared memory system
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Space-time scheduling of instruction-level parallelism on a raw machine
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
FPL '95 Proceedings of the 5th International Workshop on Field-Programmable Logic and Applications
NAPA C: Compiling for a Hybrid RISC/FPGA Architecture
FCCM '98 Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines
Maps: a Compiler-Managed Memory System for RAW Machines
Maps: a Compiler-Managed Memory System for RAW Machines
Logic emulation with virtual wires
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Bidwidth analysis with application to silicon compilation
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Attacking the semantic gap between application programming languages and configurable hardware
FPGA '01 Proceedings of the 2001 ACM/SIGDA ninth international symposium on Field programmable gate arrays
Precision and error analysis of MATLAB applications during automated hardware synthesis for FPGAs
Proceedings of the conference on Design, automation and test in Europe
Reconfigurable computing: a survey of systems and software
ACM Computing Surveys (CSUR)
A compiler approach to fast hardware design space exploration in FPGA-based systems
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
The architecture of the DIVA processing-in-memory chip
ICS '02 Proceedings of the 16th international conference on Supercomputing
Pointer analysis for structured parallel programs
ACM Transactions on Programming Languages and Systems (TOPLAS)
Reconfigurable Computing for Digital Signal Processing: A Survey
Journal of VLSI Signal Processing Systems
Synthesis of operation-centric hardware descriptions
Proceedings of the 2000 IEEE/ACM international conference on Computer-aided design
A system for synthesizing optimized FPGA hardware from MATLAB
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
BitValue Inference: Detecting and Exploiting Narrow Bitwidth Computations
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
On Availability of Bit-Narrow Operations in General-Purpose Applications
FPL '00 Proceedings of the The Roadmap to Reconfigurable Computing, 10th International Workshop on Field-Programmable Logic and Applications
Compilation Increasing the Scheduling Scope for Multi-memory-FPGA-Based Custom Computing Machines
FPL '01 Proceedings of the 11th International Conference on Field-Programmable Logic and Applications
Compiling Application-Specific Hardware
FPL '02 Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications
FlexCache: A Framework for Flexible Compiler Generated Data Caching
IMS '00 Revised Papers from the Second International Workshop on Intelligent Memory Systems
Compilation for FPGA-Based Reconfigurable Hardware
IEEE Design & Test
Symbolic NFA scheduling of a RISC microprocessor
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Molecular electronics: devices, systems and tools for gigagate, gigabit chips
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Data communication estimation and reduction for reconfigurable systems
Proceedings of the 40th annual Design Automation Conference
Custom Data Layout for Memory Parallelism
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
An algorithm for converting floating-point computations to fixed-point in MATLAB based FPGA design
Proceedings of the 41st annual Design Automation Conference
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
An architecture and compiler for scalable on-chip communication
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Evaluating heuristics in automatically mapping multi-loop applications to FPGAs
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
A Prototype Processing-In-Memory (PIM) Chip for the Data-Intensive Architecture (DIVA) System
Journal of VLSI Signal Processing Systems
Bandwidth Management with a Reconfigurable Data Cache
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 3 - Volume 04
An Algorithm for Trading Off Quantization Error with Hardware Resources for MATLAB-Based FPGA Design
IEEE Transactions on Computers
Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Mapping streaming architectures on reconfigurable platforms
ACM SIGARCH Computer Architecture News - Special issue on the 2006 reconfigurable and adaptive architecture workshop
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Modern development methods and tools for embedded reconfigurable systems: A survey
Integration, the VLSI Journal
Compiling for reconfigurable computing: A survey
ACM Computing Surveys (CSUR)
Bridging the gap between compilation and synthesis in the DEFACTO system
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
Fast hardware compilation of behaviors into an FPGA-based dynamic reconfigurable computing system
SBCCI'99 Proceedings of the XIIth conference on Integrated circuits and systems design
Hi-index | 0.00 |
The next decade of computing will be dominated by embedded systems, information appliances and application-specific computers. In order to build these systems, designers will need high-level compilation and CAD tools that generate architectures that effectively meet the needs of each application. In this paper we present a novel compilation system that allows sequential programs, written in C or FORTRAN, to be compiled directly into custom silicon or reconfigurable architectures. This capability is also interesting because trends in computer architecture are moving towards more reconfigurable hardware-like substrates, such as FPGA based systems. Our system works by successfully combining two resource-efficient computing disciplines: Small Memories and Virtual Wires.For a given application, the compiler first analyzes the memory access patterns of pointers and arrays in the program and constructs a partitioned memory system made up of many small memories. The computation is implemented by active computing elements that are spatially distributed within the memory array. A space-time scheduler assigns instructions to the computing elements in a way that maximizes locality and minimizes physical communication distance. It also generates an efficient static schedule for the interconnect. Finally, specialized hardware for the resulting schedule of memory accesses, wires, and computation is generated as a multi-process state machine in synthesizable Verilog.With this system, implemented as a set of SUIF compiler passes, we have successfully compiled programs into hardware and achieve specialization performance enhancements by up to an order of magnitude versus a single general- purpose processor. We also achieve additional parallelization speedups similar to those obtainable using a tightly- interconnected multiprocessor.