Parallel partitioning for distributed systems using sequential assignment

Authors:
Simon Spacey; Wayne Luk; Daniel Kuhn; Paul H. J. Kelly
Venue:
Journal of Parallel and Distributed Computing
Year:
2013

Abstract

This paper introduces a method that combines the advantages of task parallelism and fine-grained co-design specialisation to achieve faster execution times on distributed heterogeneous architectures than either approach achieves alone. The method uses a novel mixed integer linear programming formulation to assign code sections from parallel tasks to shared computational components, choosing the optimal trade-off between the acceleration gained from component specialisation and the serialisation delay incurred by sharing. The paper presents results for software benchmarks partitioned with the method and with formal implementations of previous alternatives, demonstrating both the practical tractability of the linear programming approach and the additional program acceleration it can deliver.
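
The trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the paper's mixed integer linear programming formulation: it replaces the solver with exhaustive search, and it assumes a simplified cost model in which the finish time is the larger of the busiest component's total load (serialisation from sharing) and the longest task's total section time. The section names, component names, and timings are all hypothetical.

```python
from itertools import product

# Hypothetical execution times (ms) of each code section on each component.
# Sections a1, a2 belong to task A; b1, b2 belong to task B.
EXEC_TIME = {
    "a1": {"cpu0": 8, "cpu1": 8, "fpga": 2},
    "a2": {"cpu0": 4, "cpu1": 4, "fpga": 3},
    "b1": {"cpu0": 6, "cpu1": 6, "fpga": 2},
    "b2": {"cpu0": 5, "cpu1": 5, "fpga": 4},
}
TASK_OF = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
COMPONENTS = ["cpu0", "cpu1", "fpga"]


def makespan(assignment):
    """Simplified finish time for a section -> component assignment.

    Assumption: sections sharing a component run one after another, and
    the sections of one task run one after another, so the finish time is
    bounded below by the busiest component and by the longest task.
    """
    component_load = {c: 0 for c in COMPONENTS}
    task_time = {}
    for section, component in assignment.items():
        t = EXEC_TIME[section][component]
        component_load[component] += t
        task = TASK_OF[section]
        task_time[task] = task_time.get(task, 0) + t
    return max(max(component_load.values()), max(task_time.values()))


def best_assignment():
    """Exhaustively search all assignments (stand-in for a MILP solver)."""
    sections = sorted(EXEC_TIME)
    best = None
    for choice in product(COMPONENTS, repeat=len(sections)):
        candidate = dict(zip(sections, choice))
        cost = makespan(candidate)
        if best is None or cost < best[0]:
            best = (cost, candidate)
    return best


# Pure task parallelism: each task gets its own general-purpose processor.
task_parallel = {"a1": "cpu0", "a2": "cpu0", "b1": "cpu1", "b2": "cpu1"}
# Pure specialisation: every section is sent to the specialised component.
all_specialised = {s: "fpga" for s in EXEC_TIME}

if __name__ == "__main__":
    print("task parallel only:", makespan(task_parallel))
    print("specialised only:  ", makespan(all_specialised))
    print("mixed optimum:     ", best_assignment())
```

With these made-up numbers, the mixed assignment finishes in 7 ms under the simplified model, compared with 12 ms for pure task parallelism and 11 ms when every section queues on the specialised component. Exhaustive search is only workable at this size, which is why a solver-based formulation matters for real benchmarks.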