Design space exploration using arithmetic-level hardware-software cosimulation for configurable multiprocessor platforms

Authors:
Jingzhao Ou; Viktor K. Prasanna
Affiliations:
Xilinx, Inc., San Jose, CA; University of Southern California, Los Angeles, CA
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Year:
2006

Abstract

Configurable multiprocessor platforms consist of multiple soft processors configured on FPGA devices. They have become an attractive choice for implementing many computing applications. In addition to the various ways of distributing software execution among the multiple soft processors, the application designer can customize soft processors and the connections between them in order to improve the performance of the applications running on the multiprocessor platform. State-of-the-art design tools rely on low-level simulation to explore the various design trade-offs offered by configurable multiprocessor platforms. These low-level simulation-based exploration techniques are too time-consuming and can be a major bottleneck to efficient design space exploration on these platforms. We propose a design space exploration technique for configurable multiprocessor platforms using arithmetic-level cycle-accurate hardware-software cosimulation. Arithmetic-level abstractions of the hardware and software execution platforms are created within the proposed cosimulation environment. The configurable multiprocessor platforms are described using these arithmetic-level abstractions. Hardware and software simulators are tightly integrated to concurrently simulate the arithmetic behavior of the multiprocessor platform. The simulations within the integrated simulators are synchronized to provide cycle-accurate simulation results for the complete multiprocessor platform. By doing so, we significantly speed up the cosimulation process for configurable multiprocessor platforms. Exploration of the various hardware-software design trade-offs provided by configurable multiprocessor platforms can be performed within the proposed cycle-accurate cosimulation environment. After the final designs are identified, the corresponding low-level implementations with the desired cycle-accurate arithmetic behavior are generated automatically.
For illustrative purposes, we provide an implementation of our approach based on MATLAB/Simulink. We show the cosimulation of two numerical computation applications and one image-processing application on a popular configurable multiprocessor platform within the MATLAB/Simulink-based cosimulation environment. For these three applications, our arithmetic-level cosimulation approach leads to speed-ups in simulation time of more than 800x in the best case, compared with the low-level simulation approaches. The designs of these applications identified using our arithmetic-level cosimulation approach achieve execution time speed-ups of up to 5.6x, compared with other designs considered in our experiments.
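
The idea of synchronizing a software simulator and an arithmetic-level hardware model on a shared clock can be illustrated with a small Python sketch. This is not the authors' MATLAB/Simulink environment: the class names `HardwareMac` and `SoftProcessor`, the function `cosimulate`, and the `latency` and `cpi` parameters are all hypothetical, chosen only to show lock-step, cycle-by-cycle advancement of two models.

```python
from collections import deque


class HardwareMac:
    """Arithmetic-level model of a hypothetical multiply-accumulate unit.

    Only the arithmetic result and its latency in clock cycles are
    modelled, not the underlying gates or wires.
    """

    def __init__(self, latency=3):
        self.latency = latency
        self.acc = 0
        self.pipeline = deque()  # entries: [cycles_remaining, product]

    def push(self, a, b):
        self.pipeline.append([self.latency, a * b])

    def tick(self):
        # Advance every in-flight operation by one clock cycle.
        for entry in self.pipeline:
            entry[0] -= 1
        # Retire finished operations in issue order.
        while self.pipeline and self.pipeline[0][0] <= 0:
            self.acc += self.pipeline.popleft()[1]

    def busy(self):
        return bool(self.pipeline)


class SoftProcessor:
    """Toy software simulator that issues one operand pair every `cpi` cycles."""

    def __init__(self, data, hw, cpi=2):
        self.data = deque(data)
        self.hw = hw
        self.cpi = cpi
        self.countdown = cpi

    def tick(self):
        if not self.data:
            return
        self.countdown -= 1
        if self.countdown == 0:
            a, b = self.data.popleft()
            self.hw.push(a, b)
            self.countdown = self.cpi

    def done(self):
        return not self.data


def cosimulate(data, latency=3, cpi=2):
    """Step both models in lock-step; return (result, total_cycles)."""
    hw = HardwareMac(latency)
    cpu = SoftProcessor(data, hw, cpi)
    cycles = 0
    while not cpu.done() or hw.busy():
        cpu.tick()  # software side advances one cycle
        hw.tick()   # hardware side advances the same cycle
        cycles += 1
    return hw.acc, cycles
```

With these assumptions, `cosimulate([(1, 2), (3, 4), (5, 6)])` returns `(44, 8)`: the dot product 44, reached after 8 simulated cycles. Changing `latency` or `cpi` changes the cycle count but not the arithmetic result, which is the kind of trade-off such an exploration would compare.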