Static scheduling of synchronous data flow programs for digital signal processing
IEEE Transactions on Computers
Multiprocessor scheduling to account for interprocessor communication
Multiprocessor scheduling to account for interprocessor communication
LogP: towards a realistic model of parallel computation
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Bounded scheduling of process networks
Bounded scheduling of process networks
LoGPC: Modeling Network Contention in Message-Passing Programs
IEEE Transactions on Parallel and Distributed Systems
SimpleFit: A Framework for Analyzing Design Trade-Offs in Raw Architectures
IEEE Transactions on Parallel and Distributed Systems
Partitioning and Scheduling Parallel Programs for Multiprocessors
Partitioning and Scheduling Parallel Programs for Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Throughput Analysis of Synchronous Data Flow Graphs
ACSD '06 Proceedings of the Sixth International Conference on Application of Concurrency to System Design
Efficient Techniques for Clustering and Scheduling onto Embedded Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Multiprocessor resource allocation for throughput-constrained synchronous dataflow graphs
Proceedings of the 44th annual Design Automation Conference
3G Evolution, Second Edition: HSPA and LTE for Mobile Broadband
3G Evolution, Second Edition: HSPA and LTE for Mobile Broadband
Hi-index | 0.00 |
The programming complexity of increasingly parallel processors calls for new tools to assist programmers in utilising the parallel hardware resources. In this paper we present a set of models that we have developed to form part of a tool which is intended for iteratively tuning the mapping of dataflow graphs onto manycores. One of the models is used for capturing the essentials of manycores that are identified as suitable for signal processing and which we use as target architectures. Another model is the intermediate representation in the form of a timed configuration graph, describing the mapping of a dataflow graph onto a machine model. Moreover, this IR can be used for performance evaluation using abstract interpretation. We demonstrate how the models can be configured and applied in order to map applications on the Raw processor. Furthermore, we report promising results on the accuracy of performance predictions produced by our tool. It is also demonstrated that the tool can be used to rank different mappings with respect to optimisation on throughput and end-to-end latency.