Parallelization of module network structure learning and performance tuning on SMP

  • Authors and affiliations:
  • Hongshan Jiang (Tsinghua University, Dept. of Computer Science, Beijing, China); Chunrong Lai (Intel China Research Center Ltd., Beijing, China); Wenguang Chen (Tsinghua University, Dept. of Computer Science, Beijing, China); Yurong Chen (Intel China Research Center Ltd., Beijing, China); Wei Hu (Intel China Research Center Ltd., Beijing, China); Weimin Zheng (Tsinghua University, Dept. of Computer Science, Beijing, China); Yimin Zhang (Intel China Research Center Ltd., Beijing, China)

  • Venue:
  • IPDPS'06: Proceedings of the 20th International Conference on Parallel and Distributed Processing
  • Year:
  • 2006


Abstract

As an extension of the Bayesian network, the module network is an appropriate model for inferring a causal network over a large number of variables from insufficient evidence. However, learning such a model is still a time-consuming process. In this paper, we propose a parallel implementation of a module network learning algorithm using OpenMP. We propose a static task-partitioning strategy that distributes sub-search-spaces over worker threads to trade off load balance against software-cache contention. To overcome performance penalties arising from shared-memory contention, we adopt several optimization techniques, such as memory pre-allocation, memory alignment, and the use of static functions. These optimizations influence sequential performance and parallel speedup in different ways. Experiments validate their effectiveness: for a 2,200-node dataset, they improve parallel speedup by up to 88%, together with a 2X sequential performance improvement. With resource contention reduced, workload imbalance becomes the main hurdle to parallel scalability, and the program behaves more consistently across platforms.