A fast routability-driven router for FPGAs
FPGA '98 Proceedings of the 1998 ACM/SIGDA sixth international symposium on Field programmable gate arrays
Space-time scheduling of instruction-level parallelism on a raw machine
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Trading quality for compile time: ultra-fast placement for FPGAs
FPGA '99 Proceedings of the 1999 ACM/SIGDA seventh international symposium on Field programmable gate arrays
A C compiler for a processor with a reconfigurable functional unit
FPGA '00 Proceedings of the 2000 ACM/SIGDA eighth international symposium on Field programmable gate arrays
FPGA '00 Proceedings of the 2000 ACM/SIGDA eighth international symposium on Field programmable gate arrays
Tolerating operational faults in cluster-based FPGAs
FPGA '00 Proceedings of the 2000 ACM/SIGDA eighth international symposium on Field programmable gate arrays
IEEE Transactions on Computers
Design and Implementation of the MorphoSys Reconfigurable ComputingProcessor
Journal of VLSI Signal Processing Systems - Special issue on VLSI on custom computing technology
Attacking the semantic gap between application programming languages and configurable hardware
FPGA '01 Proceedings of the 2001 ACM/SIGDA ninth international symposium on Field programmable gate arrays
FPGA '02 Proceedings of the 2002 ACM/SIGDA tenth international symposium on Field-programmable gate arrays
Compiler Support for Scalable and Efficient Memory Systems
IEEE Transactions on Computers
Fast placement approaches for FPGAs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Balancing Logic Utilization and Area Efficiency in FPGAs
FPL '00 Proceedings of the The Roadmap to Reconfigurable Computing, 10th International Workshop on Field-Programmable Logic and Applications
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
FCCM '99 Proceedings of the Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Scalar Operand Networks: On-Chip Interconnect for ILP in Partitioned Architectures
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Flexible ATE Module with Reconfigurable Circuit and Its Application
ITC '99 Proceedings of the 1999 IEEE International Test Conference
Evaluation of the Raw Microprocessor: An Exposed-Wire-Delay Architecture for ILP and Streams
Proceedings of the 31st annual international symposium on Computer architecture
Efficient architectures through application clustering and architectural heterogeneity
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Liquid Metal: Object-Oriented Programming Across the Hardware/Software Boundary
ECOOP '08 Proceedings of the 22nd European conference on Object-Oriented Programming
Dynamic coprocessor management for FPGA-enhanced compute platforms
CASES '08 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
ACM Transactions on Architecture and Code Optimization (TACO)
Progress in autonomous fault recovery of field programmable gate arrays
ACM Computing Surveys (CSUR)
Kismet: parallel speedup estimates for serial programs
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
FMRPU: design of fine-grain multi-context reconfigurable processing unit
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Fine grain thread scheduling on multicore processors: cores with multiple functional units
Proceedings of the 6th ACM India Computing Convention
Hi-index | 0.01 |
The RAW benchmark suite consists of twelve programs designed to facilitate comparing, validating, and improving reconfigurable computing systems. These benchmarks run the gamut of algorithms found in general purpose computing, including sorting, matrix operations, and graph algorithms. The suite includes an architecture-independent compilation framework, Raw Computation Structures (RawCS), to express each algorithm's dependencies and to support automatic synthesis, partitioning, and mapping to a reconfigurable computer. Within this framework, each benchmark is portably designed in both C and Behavioral Verilog and scalably parameterized to consume a range of hardware resource capacities. To establish initial benchmark ratings, we have targeted a commercial logic emulation system based on virtual wires technology to automatically generate designs up to millions of gates (14 to 379 FPGAs). Because the virtual wires techniques abstract away machine-level details like FPGA capacity and interconnect, our hardware target for this system is an abstract reconfigurable logic fabric with memory-mapped host I/O. We report initial speeds in the range of 2X to 1800X faster than a 2.82 SPECint95 SparcStation 20 and encourage others in the field to run these benchmarks on other systems to provide a standard comparison.