Soft-core microprocessors mapped onto field-programmable gate arrays (FPGAs) represent an increasingly common embedded software implementation option. Modern FPGA soft-cores are parameterized to support application-specific customization, wherein pre-defined units, such as a multiplication unit or floating-point unit, may be included in the microprocessor architecture to speed up software execution at the expense of increased size. We introduce a methodology for fast application-specific customization of a parameterized FPGA soft-core, using synthesis and execution to obtain size and performance data in order to create a tool that can be used across a variety of tool platforms and FPGA devices. As synthesizing a soft-core takes tens of minutes, developing heuristics that execute in an acceptable time of an hour or two, yet find near-optimal results, is a challenge. We consider two approaches, one using a traditional CAD approach that does an initial characterization using synthesis to create an abstract problem model and then explores the solution space using a knapsack algorithm, and the other using a synthesis-in-the-loop exploration approach. We compare approaches for a variety of design constraints, on 11 EEMBC benchmarks, using an actual Xilinx soft-core processor, and for two different commercial Xilinx FPGA devices. Our results show that the approaches can generate a customized configuration exhibiting roughly 2x speedups over a base soft-core, reaching within 4% of optimal in about 1.5 hours, including complete synthesis of the soft-core onto the FPGA, compared to over 11 hours for exhaustive search. Our results also show that including synthesis-in-the-loop, compared to a traditional CAD approach, improved speedups by an average of 20% when size constraints were tight. The approaches may also be applicable to soft-core processors targeted to ASICs in addition to FPGAs.
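To make the knapsack formulation concrete, the following is a minimal sketch of how optional units might be selected under a size constraint once each unit has been characterized. It is not the paper's implementation. The function name `pick_units`, the unit names, and all size and cycle numbers are hypothetical, and the sketch assumes that each unit's size cost and cycle savings are independent and additive, which is exactly the kind of abstraction a synthesis-in-the-loop approach avoids by re-measuring each candidate configuration.

```python
from typing import List, Tuple

# Hypothetical characterization data: (unit name, size cost, cycles saved).
# In practice these would come from synthesizing and running the soft-core
# with each unit enabled; the values here are made up for illustration.
UNITS: List[Tuple[str, int, int]] = [
    ("multiplier", 3, 40),
    ("barrel_shifter", 2, 25),
    ("divider", 4, 30),
    ("fpu", 6, 60),
]


def pick_units(units: List[Tuple[str, int, int]],
               size_budget: int) -> Tuple[int, List[str]]:
    """Classic 0/1 knapsack by dynamic programming over the size budget.

    Returns (total cycles saved, names of chosen units), assuming that
    costs and savings of different units simply add up.
    """
    # best[b] holds the best (savings, chosen units) using at most b size.
    best: List[Tuple[int, List[str]]] = [(0, []) for _ in range(size_budget + 1)]
    for name, cost, saved in units:
        # Walk budgets downward so each unit is included at most once.
        for b in range(size_budget, cost - 1, -1):
            candidate = best[b - cost][0] + saved
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + [name])
    return best[size_budget]


if __name__ == "__main__":
    savings, chosen = pick_units(UNITS, size_budget=5)
    print(savings, chosen)
```

With a tight budget, a model like this can mispredict because real unit sizes and speedups interact after synthesis, which is consistent with the abstract's observation that synthesis-in-the-loop helped most when size constraints were tight.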