Trebuchet: exploring TLP with dataflow virtualisation

Authors:
Tiago A. O. Alves;Leandro A. J. Marzulo;Felipe M. G. Franca;Vitor Santos Costa
Affiliations:
Programa de Engenharia de Sistemas e Computacao, COPPE, Universidade Federal do Rio de Janeiro, Cidade Universitaria, Centro de Tecnologia, Bloco H, Sala 319, Rio de Janeiro, RJ, 21941-972, Br ...;Programa de Engenharia de Sistemas e Computacao, COPPE, Universidade Federal do Rio de Janeiro, Cidade Universitaria, Centro de Tecnologia, Bloco H, Sala 319, Rio de Janeiro, RJ, 21941-972, Br ...;Programa de Engenharia de Sistemas e Computacao, COPPE, Universidade Federal do Rio de Janeiro, Cidade Universitaria, Centro de Tecnologia, Bloco H, Sala 319, Rio de Janeiro, RJ, 21941-972, Br ...;CRACS and INESC-Porto LA, Faculdade de Ciencias, Universidade do Porto, Rua do Campo Alegre, 1021, 4169-007 Porto, Portugal
Venue:
International Journal of High Performance Systems Architecture
Year:
2011

Abstract

Parallel programming has become mandatory to fully exploit the potential of multi-core CPUs. The dataflow model provides a natural way to exploit parallelism. However, specifying dependences and control with fine-grained instructions in dataflow programs can be complex and can introduce unwanted overhead. To address this issue, we have designed TALM, a coarse-grained dataflow execution model to be used on top of widespread architectures. We implemented TALM as the Trebuchet virtual machine for multi-cores. The programmer identifies code blocks that can run in parallel and connects them to form a dataflow graph, which brings the benefits of parallel dataflow execution to a von Neumann machine with little programming effort. We parallelised a set of seven applications using our approach and compared them with OpenMP implementations. Results show that Trebuchet can be competitive with state-of-the-art technology, while providing the benefits of dataflow execution.
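
The coarse-grained idea in the abstract (code blocks connected into a dataflow graph, each block running once its inputs are available) can be illustrated with a minimal Python sketch. This is not the TALM instruction set or the Trebuchet API, neither of which is described here; the `DataflowGraph` class, its `add` and `run` methods, and the example blocks are all hypothetical names chosen for illustration, and a standard thread pool stands in for the virtual machine's processing elements.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


class DataflowGraph:
    """Hypothetical toy graph: a block fires once all of its inputs have arrived."""

    def __init__(self):
        self.nodes = {}  # block name -> (function, list of producer names)

    def add(self, name, func, deps=()):
        self.nodes[name] = (func, list(deps))

    def run(self, workers=4):
        results = {}
        # Number of inputs each block is still waiting for.
        missing = {name: len(deps) for name, (_, deps) in self.nodes.items()}
        # Reverse edges: which blocks consume each block's output.
        consumers = {name: [] for name in self.nodes}
        for name, (_, deps) in self.nodes.items():
            for dep in deps:
                consumers[dep].append(name)

        with ThreadPoolExecutor(max_workers=workers) as pool:
            def fire(name):
                func, deps = self.nodes[name]
                return pool.submit(func, *(results[d] for d in deps))

            # Blocks without inputs are ready immediately.
            running = {fire(n): n for n, k in missing.items() if k == 0}
            while running:
                finished, _ = wait(running, return_when=FIRST_COMPLETED)
                for fut in finished:
                    name = running.pop(fut)
                    results[name] = fut.result()
                    # Deliver the result; fire consumers whose inputs are complete.
                    for consumer in consumers[name]:
                        missing[consumer] -= 1
                        if missing[consumer] == 0:
                            running[fire(consumer)] = consumer
        return results


# Usage: "a" and "b" are independent and may run in parallel;
# "total" waits for both, and "square" waits for "total".
graph = DataflowGraph()
graph.add("a", lambda: 2)
graph.add("b", lambda: 3)
graph.add("total", lambda x, y: x + y, deps=["a", "b"])
graph.add("square", lambda s: s * s, deps=["total"])
results = graph.run()
print(results["square"])
```

In this sketch, scheduling is driven only by data availability: the programmer states which blocks feed which, and no explicit locks or barriers appear in the block bodies. Note that CPython threads do not run Python bytecode in parallel, so the example shows the firing rule rather than any speedup.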