High-throughput Bayesian computing machine with reconfigurable hardware

Authors:
Mingjie Lin;Ilia Lebedev;John Wawrzynek
Affiliations:
UC Berkeley, Berkeley, CA, USA;UC Berkeley, Berkeley, CA, USA;UC Berkeley, Berkeley, CA, USA
Venue:
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays
Year:
2010

Abstract

We use reconfigurable hardware to construct a high-throughput Bayesian computing machine (BCM) capable of evaluating probabilistic networks with arbitrary directed acyclic graph (DAG) topology. Our BCM achieves high throughput by exploiting the FPGA's distributed memories and abundant hardware structures (such as long carry chains and registers), which enables us to 1) develop an innovative memory allocation scheme, based on a maximal matching algorithm, that completely avoids memory stalls, 2) optimize and deeply pipeline the logic design of each processing node, and 3) optimally schedule them. The BCM architecture we present can not only be applied to many important algorithms in artificial intelligence, signal processing, and digital communications, but is also highly reusable: a new application need not change a BCM's hardware design; only new task-graph processing and code compilation are necessary. Moreover, the throughput of a BCM scales almost linearly with the size of the FPGA on which it is implemented. A prototype BCM with 16 processing nodes was implemented on a Virtex-5 FPGA (XCV5LX155T-2) on a BEE3 (Berkeley Emulation Engine) platform. Across a wide variety of sample Bayesian problems, the BCM demonstrates 80x and 15x speedups over the same network evaluation algorithm running on a 2.4 GHz Intel Core 2 Duo processor and on a GeForce 9400m using the CUDA software package, respectively, with a peak throughput of 20.4 GFLOPS (giga floating-point operations per second).
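
The abstract does not spell out the memory allocation scheme, so the following is only a minimal sketch of the general idea behind matching-based allocation, not the authors' method. It assumes that each operand read in the same cycle has a list of memory banks that could hold it, and it uses a standard augmenting-path bipartite matching to give every operand a bank of its own, so that no two simultaneous reads collide. The function name `assign_banks`, the input format, and the operand and bank labels are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def assign_banks(candidates: Dict[str, List[int]]) -> Optional[Dict[str, int]]:
    """Assign each operand a distinct memory bank, or return None if impossible.

    `candidates` maps every operand read in the same cycle to the banks that
    could hold it (a hypothetical input format, not taken from the paper).
    """
    bank_owner: Dict[int, str] = {}  # bank index -> operand currently placed there

    def try_place(operand: str, visited: Set[int]) -> bool:
        # Augmenting-path search: look for a bank for `operand`, relocating
        # earlier operands to one of their other candidate banks if needed.
        for bank in candidates[operand]:
            if bank in visited:
                continue
            visited.add(bank)
            # Take a free bank, or evict the owner if it can move elsewhere.
            if bank not in bank_owner or try_place(bank_owner[bank], visited):
                bank_owner[bank] = operand
                return True
        return False

    for operand in candidates:
        if not try_place(operand, set()):
            return None  # some operand cannot get a conflict-free bank
    return {operand: bank for bank, operand in bank_owner.items()}
```

Calling `assign_banks({"a": [0, 1], "b": [0], "c": [1, 2]})` yields an assignment in which all three operands sit in different banks, while `assign_banks({"a": [0], "b": [0]})` returns `None` because both operands compete for a single bank. In a hardware setting, a failed assignment of this kind would correspond to a potential memory stall.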