Computer organization and design (2nd ed.): the hardware/software interface
Computer organization and design (2nd ed.): the hardware/software interface
TTAs: missing the ILP complexity wall
Journal of Systems Architecture: the EUROMICRO Journal - Special double issue on microprocessor architecture
CHIMAERA: a high-performance architecture with a tightly-coupled reconfigurable functional unit
Proceedings of the 27th annual international symposium on Computer architecture
A decade of reconfigurable computing: a visionary retrospective
Proceedings of the conference on Design, automation and test in Europe
TRIPS: A polymorphous architecture for exploiting ILP, TLP, and DLP
ACM Transactions on Architecture and Code Optimization (TACO)
FITS: framework-based instruction-set tuning synthesis for embedded application specific processors
Proceedings of the 41st annual Design Automation Conference
Evaluation of the Raw Microprocessor: An Exposed-Wire-Delay Architecture for ILP and Streams
Proceedings of the 31st annual international symposium on Computer architecture
The Impact of Performance Asymmetry in Emerging Multicore Architectures
Proceedings of the 32nd annual international symposium on Computer Architecture
A cycle-accurate compilation algorithm for custom pipelined datapaths
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Designing a custom architecture for DCT using NISC technology
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
High-quality ISA synthesis for low-power cache designs in embedded microprocessors
IBM Journal of Research and Development
A Flexible Datapath Interconnect for Embedded Applications
ISVLSI '07 Proceedings of the IEEE Computer Society Annual Symposium on VLSI
ISPASS '05 Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005
A Flexible Code Compression Scheme Using Partitioned Look-Up Tables
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Squashing microcode stores to size in embedded systems while delivering rapid microcode accesses
CODES+ISSS '09 Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis
IEEE Transactions on Circuits and Systems Part I: Regular Papers - Special section on 2009 IEEE system-on-chip conference
Improving processor efficiency by statically pipelining instructions
Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
Hi-index | 0.00 |
We introduce FlexCore, the first exemplar of an architecture based on the FlexSoC framework. Comprising the same datapath units found in a conventional five-stage pipeline, the FlexCore has an exposed datapath control and a flexible interconnect to allow the datapath to be dynamically reconfigured as a consequence of code generation. Additionally, the FlexCore allows specialized datapath units to be inserted and utilized within the same architecture and compilation framework. This study shows that, in comparison to a conventional five-stage general-purpose processor, the FlexCore is up to 40% more efficient in terms of cycle count on a set of benchmarks from the embedded application domain. We show that both the fine-grained control and the flexible interconnect contribute to the speedup. Furthermore, according to our VLSI implementation study, the FlexCore architecture offers both time and energy savings. The exposed FlexCore datapath requires a wide control word. The conducted evaluation confirms that this increases the instruction bandwidth and memory footprint. This calls for efficient instruction decoding as proposed in the FlexSoC framework.