The junction tree approach, with applications in artificial intelligence, computer vision, machine learning, and statistics, is often used for computing posterior distributions in probabilistic graphical models. A key challenge associated with junction trees is their computational cost, and several parallel computing technologies, including many-core processors, have been investigated to meet this challenge. Many-core processors (including GPUs) are now programmable; unfortunately, their complexity makes it hard to tune their parameters manually in order to optimize software performance. In this paper, we investigate a machine learning approach to minimizing the execution time of parallel junction tree algorithms implemented on a GPU. By carefully allocating a GPU's threads to the different parallel computing opportunities in a junction tree, and treating this thread-allocation problem as a machine learning problem, we find experimentally that regression, specifically support vector regression, can substantially outperform manual optimization.
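To make the regression idea concrete, here is a minimal sketch of the approach the abstract describes: fit a support vector regression model (scikit-learn's `SVR` is used here; the paper does not specify an implementation) that maps a GPU thread-allocation configuration to predicted kernel execution time, then pick the candidate configuration with the lowest prediction. The feature names and all timing numbers below are invented for illustration, not taken from the paper.

```python
# Hypothetical sketch: predict junction-tree kernel execution time from a
# thread-allocation configuration with SVR, then choose the configuration
# with the lowest predicted time. All data below is synthetic.
import numpy as np
from sklearn.svm import SVR

# Each row is an (assumed) configuration: [threads assigned per clique-
# potential update, thread-block size]; y holds measured times in ms.
X = np.array([[32, 64], [64, 64], [128, 128], [256, 128],
              [64, 256], [128, 256], [256, 256], [512, 256]], dtype=float)
y = np.array([9.1, 7.4, 5.2, 4.8, 6.0, 4.5, 4.1, 5.3])

model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(X, y)

# Score unseen candidate allocations and keep the fastest predicted one.
candidates = np.array([[96, 128], [192, 256], [384, 256]], dtype=float)
pred = model.predict(candidates)
best = candidates[int(np.argmin(pred))]
print("best candidate:", best, "predicted ms:", pred.min())
```

In practice one would measure many (configuration, runtime) pairs per junction tree, and hyperparameters such as `C` and `epsilon` would themselves need tuning; the point of the sketch is only the predict-then-select loop.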