Analytical Modeling of Pipeline Parallelism

Authors:
Angeles Navarro;Rafael Asenjo;Siham Tabik;Calin Cascaval
Venue:
PACT '09 Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques
Year:
2009

Citing 0
Cited 13

Feedback-directed pipeline parallelism
Proceedings of the 19th international conference on Parallel architectures and compilation techniques

Partitioning streaming parallelism for multi-cores: a machine learning based approach
Proceedings of the 19th international conference on Parallel architectures and compilation techniques

Skewed pipelining for parallel simulink simulations
Proceedings of the Conference on Design, Automation and Test in Europe

Parallelism orchestration using DoPE: the degree of parallelism executive
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation

Expressing pipeline parallelism using TBB constructs: a case study on what works and what doesn't
Proceedings of the compilation of the co-located workshops on DSM'11, TMC'11, AGERE!'11, AOOPES'11, NEAT'11, & VMIL'11

Work stealing strategies for parallel stream processing in soft real-time systems
ARCS'12 Proceedings of the 25th international conference on Architecture of Computing Systems

A template library to integrate thread scheduling and locality management for NUMA multiprocessors
HotPar'12 Proceedings of the 4th USENIX conference on Hot Topics in Parallelism

Pipelining for cyclic control systems
Proceedings of the 16th international conference on Hybrid systems: computation and control

On-the-fly pipeline parallelism
Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures

Deterministic scale-free pipeline parallelism with hyperqueues
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis

Load-balanced pipeline parallelism
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis

Using machine learning to partition streaming programs
ACM Transactions on Architecture and Code Optimization (TACO)

PAIS: Parallelism-aware interconnect scheduling in multicores
ACM Transactions on Embedded Computing Systems (TECS) - Special Issue on Design Challenges for Many-Core Processors, Special Section on ESTIMedia'13 and Regular Papers

Abstract

Parallel programming is a requirement in the multi-core era. One of the most promising techniques for making parallel programming accessible to general users is the use of parallel programming patterns. Functional pipeline parallelism is a pattern well suited to many emerging applications, such as streaming and "Recognition, Mining and Synthesis" (RMS) workloads. In this paper we develop an analytical model for pipeline parallelism based on queueing theory. The model is useful both for characterizing the performance and efficiency of existing implementations and for guiding the design of new pipeline algorithms. We demonstrate its usefulness by characterizing and optimizing two of the PARSEC benchmarks, ferret and dedup. We identify two issues with these codes: load imbalance and I/O bottlenecks. We address load imbalance using two techniques: (i) parallel pipeline stage collapsing and (ii) dynamic scheduling. We implement these optimizations using the Pthreads and Threading Building Blocks (TBB) libraries. We compare the performance of the alternatives and find that the TBB implementation based on work stealing outperforms all other variants.
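
To make the load-imbalance point concrete, the following is a minimal sketch of a bottleneck-style throughput estimate. It is not the model developed in the paper. It assumes each stage behaves as an independent station with a fixed number of identical threads and a fixed mean service time, and it ignores queueing delay, I/O, and serial stages. The function names and the service times are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def pipeline_throughput(service_times: Sequence[float],
                        threads: Sequence[int]) -> float:
    """Upper bound on items/second for a pipeline with per-stage thread pools.

    Assumption: stage i has threads[i] identical servers, each needing
    service_times[i] seconds per item, so its capacity is
    threads[i] / service_times[i]. The slowest stage limits the pipeline.
    """
    if len(service_times) != len(threads):
        raise ValueError("one thread count is required per stage")
    return min(c / s for s, c in zip(service_times, threads))


def collapsed_throughput(service_times: Sequence[float],
                         total_threads: int) -> float:
    """Upper bound when the parallel stages are collapsed into one.

    Assumption: every thread carries an item through all stages, so each
    item costs sum(service_times) seconds on a single thread.
    """
    return total_threads / sum(service_times)


# Hypothetical per-item service times (seconds) for four stages.
stage_times = [1.0, 4.0, 2.0, 1.0]

# Two threads per stage: the 4-second stage caps throughput at 2 / 4.0.
per_stage = pipeline_throughput(stage_times, [2, 2, 2, 2])

# The same eight threads with the stages collapsed: 8 / 8.0.
collapsed = collapsed_throughput(stage_times, 8)

print(per_stage, collapsed)  # prints "0.5 1.0"
```

Under these simplifying assumptions, an even split of threads leaves the fast stages idle while the slowest stage saturates. Collapsing the stages removes that imbalance, which is the intuition behind the first technique named in the abstract.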