A case study of hardware/software partitioning of traffic simulation on the Cray XD1

Authors:
Justin L. Tripp;Maya B. Gokhale;Anders Å. Hansson
Affiliations:
Los Alamos National Laboratory, Los Alamos, NM;Lawrence Livermore National Laboratory, Livermore, CA;Los Alamos National Laboratory, Los Alamos, NM
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2008

Abstract

Scientific application kernels mapped to reconfigurable hardware have been reported to achieve 10× to 100× speedup over equivalent software. These promising results suggest that reconfigurable logic might offer significant speedup on applications in science and engineering. To accurately assess the benefit of hardware acceleration on scientific applications, however, it is necessary to consider the entire application, including the software components as well as the accelerated kernels. Aspects to be considered include alternative methods of hardware/software partitioning, communication costs, and opportunities for concurrent computation between software and hardware. Analysis of these factors is beyond the scope of current automatic parallelizing compilers. In this paper, a case study is presented in which a simulation of metropolitan road traffic networks is mapped onto a reconfigurable supercomputer, the Cray XD1. Five different methods are presented for mapping the application onto the combined hardware/software system. An approach for approximating the performance of each method is derived through analytic equations. Our results, both analytical and empirical, show that the key predictors of performance, which are often not considered in reported speedups of kernel operations, are not necessarily maximum parallelism; they must also account for the fraction of the problem that runs on the reconfigurable logic and the amount of data flow between software and hardware.
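
The abstract's central point, that whole-application speedup depends on the accelerated fraction and on software/hardware data flow rather than on kernel speedup alone, can be illustrated with a simple Amdahl-style estimate. The sketch below is not the paper's analytic model; the function name `overall_speedup` and all of its parameters (`hw_fraction`, `kernel_speedup`, `t_comm`, `overlap`) are assumptions introduced here for illustration only.

```python
def overall_speedup(t_total, hw_fraction, kernel_speedup, t_comm, overlap=0.0):
    """Estimate whole-application speedup under an Amdahl-style model.

    All parameters are hypothetical, chosen for illustration:
      t_total        -- runtime of the all-software version (seconds)
      hw_fraction    -- fraction of t_total moved to reconfigurable logic (0..1)
      kernel_speedup -- speedup of that fraction when run in hardware (> 0)
      t_comm         -- time spent moving data between software and hardware
      overlap        -- fraction (0..1) of the shorter phase hidden by running
                        software and hardware concurrently
    """
    if not 0.0 <= hw_fraction <= 1.0:
        raise ValueError("hw_fraction must be in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be in [0, 1]")

    # Time for the part that stays in software.
    t_sw = (1.0 - hw_fraction) * t_total
    # Time for the accelerated part, plus the cost of data transfer.
    t_hw = hw_fraction * t_total / kernel_speedup + t_comm
    # Concurrency can hide at most the shorter of the two phases.
    hidden = overlap * min(t_sw, t_hw)

    return t_total / (t_sw + t_hw - hidden)


if __name__ == "__main__":
    # A 100x kernel covering half the runtime yields under 2x overall.
    print(overall_speedup(100.0, 0.5, 100.0, t_comm=0.0))
    # Adding transfer cost can erase the gain entirely.
    print(overall_speedup(100.0, 0.5, 100.0, t_comm=49.5))
```

Under these assumptions, a 100× kernel that covers only half of the runtime gives less than 2× overall, and a large enough transfer cost removes the benefit altogether unless software and hardware work can overlap.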