Performance and energy consumption improvements in microprocessor systems utilizing a coprocessor data-path

Authors:
Michalis D. Galanis;Gregory Dimitroulakos;Costas E. Goutis
Affiliations:
VLSI Design Laboratory, ECE Department, University of Patras, Patras, Achaia, Greece;VLSI Design Laboratory, ECE Department, University of Patras, Patras, Achaia, Greece;VLSI Design Laboratory, ECE Department, University of Patras, Patras, Achaia, Greece
Venue:
Journal of Signal Processing Systems - Special Issue: Embedded computing systems for DSP
Year:
2008



Abstract

This paper presents the speedups and energy reductions achieved in a generic single-chip microprocessor system by employing a high-performance data-path. The data-path acts as a coprocessor that accelerates computationally intensive kernel sections, thereby increasing overall performance. The authors previously introduced this data-path, which is composed of flexible computational components (FCCs). These components can realize any two-level sequence of primitive operations. An automated method for synthesizing the coprocessor from a high-level software description is presented, together with its integration into a design flow for executing applications on the system. The overall speedups of eleven real-life applications, relative to software-only execution on the microprocessor, are estimated using the design flow. These speedups are close to theoretical bounds and range from 1.78 to 5.84, with an average of 3.04, while the overhead in circuit area is small. The energy savings range from 41% to 74%, and the application energy-delay product is reduced by 80% on average. A comparison with another high-performance data-path showed that the proposed coprocessor achieves better performance, consumes less energy, and has smaller area-time products for the generated data-paths. Additionally, the FCC data-path achieves better performance in accelerating kernels than a VLIW DSP core.
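
The relation between kernel acceleration, overall speedup, and energy-delay product can be illustrated with a short sketch. It assumes that the "theoretical bounds" follow Amdahl's law, which the abstract does not state, and it defines the energy-delay product as energy multiplied by execution time. The function names and the input values in the usage lines are hypothetical and are not taken from the paper.

```python
def amdahl_speedup(kernel_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when `kernel_fraction` of the original run time
    is accelerated by a factor of `kernel_speedup` (Amdahl's law)."""
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must lie in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    return 1.0 / ((1.0 - kernel_fraction) + kernel_fraction / kernel_speedup)


def amdahl_bound(kernel_fraction: float) -> float:
    """Upper bound on overall speedup: kernels take zero time on the coprocessor."""
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must lie in [0, 1]")
    if kernel_fraction == 1.0:
        return float("inf")
    return 1.0 / (1.0 - kernel_fraction)


def edp_reduction(speedup: float, energy_saving: float) -> float:
    """Fractional reduction in energy-delay product (energy x time).

    New energy is (1 - energy_saving) of the original and new delay is
    1 / speedup of the original, so the EDP ratio is their product."""
    if speedup <= 0.0:
        raise ValueError("speedup must be positive")
    return 1.0 - (1.0 - energy_saving) / speedup


# Hypothetical inputs for illustration only (not figures from the paper):
# 80% of run time in kernels, kernels accelerated 10x, 50% energy saving.
overall = amdahl_speedup(0.8, 10.0)
print(f"overall speedup: {overall:.2f}, bound: {amdahl_bound(0.8):.2f}")
print(f"EDP reduction: {edp_reduction(overall, 0.5):.1%}")
```

Under these assumptions, an overall speedup approaches its bound as the kernel acceleration factor grows, and the energy-delay product falls faster than energy alone because the delay reduction compounds with the energy saving.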