In this paper, we investigate how to compute the throughput of probabilistic and replicated streaming applications. We are given (i) a streaming application whose dependence graph is a linear chain; (ii) a one-to-many mapping of the application onto a fully heterogeneous target, where a processor is assigned at most one application stage, but where a stage can be replicated onto a set of processors; and (iii) a set of IID (Independent and Identically Distributed) random variables modeling each computation and communication time in the mapping. How can we compute the throughput of the application, i.e., the rate at which data sets can be processed? We consider two execution models: the STRICT model, where the actions of each processor are sequentialized, and the OVERLAP model, where a processor can compute and communicate in parallel. The problem is easy when application stages are not replicated, i.e., each assigned to a single processor: in that case the throughput is dictated by the critical hardware resource. However, when stages are replicated, i.e., assigned to several processors, the problem becomes surprisingly complicated: even in the deterministic case, the optimal throughput may be lower than the smallest internal resource throughput. To the best of our knowledge, the problem has never been considered in the probabilistic case. The first main contribution of the paper is a general method (although of exponential cost) to compute the throughput when mapping parameters follow IID exponential laws. This general method is based upon the analysis of timed Petri nets deduced from the application mapping; it turns out that these Petri nets exhibit a regular structure in the OVERLAP model, which enables us to reduce the cost and derive a polynomial algorithm.
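To make the easy, non-replicated case concrete, here is a minimal sketch (not taken from the paper; function name and interface are illustrative) of the deterministic throughput computation for a linear chain mapped one stage per processor. In the OVERLAP model a processor computes and communicates in parallel, so its cycle time is the maximum of its compute time and its inbound/outbound communication times, and the throughput is the inverse of the slowest (critical) resource's cycle time:

```python
def overlap_throughput(compute, comm):
    """Deterministic throughput, one stage per processor, OVERLAP model.

    compute[k]: computation time of stage k on its processor.
    comm[k]: communication time from stage k to stage k+1
             (len(comm) == len(compute) - 1).
    """
    cycles = []
    for k, w in enumerate(compute):
        inbound = comm[k - 1] if k > 0 else 0.0
        outbound = comm[k] if k < len(comm) else 0.0
        # A processor overlaps its compute with both communications,
        # so its steady-state cycle time is the max of the three.
        cycles.append(max(w, inbound, outbound))
    # Throughput is dictated by the critical hardware resource.
    return 1.0 / max(cycles)

# Three stages; the bottleneck is stage 1 with compute time 4.
print(overlap_throughput([2.0, 4.0, 1.0], [1.0, 3.0]))  # 0.25
```

In the STRICT model the same skeleton applies, except a processor's cycle time becomes the sum (rather than the max) of its compute and communication times, since its actions are sequentialized.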
The second main contribution of the paper is to provide bounds on the throughput when stage parameters are arbitrary IID and NBUE (New Better than Used in Expectation) random variables: the throughput is bounded from below by the exponential case and from above by the deterministic case.