IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
For communication-intensive applications on distributed-memory systems, performance is bounded by remote memory accesses. Task migration is a candidate for reducing network traffic in such applications, thereby improving performance. We seek to answer the question: can a run-time profitably predict when it is better to move the task to the data than to move the data to the task? Using a simple model in which local work is free and data transferred over the network is costly, we show that a best-case task migration schedule can transfer up to 3.5× less total data than execution with no migration for some benchmarks. Given this observation, we develop and evaluate two online task migration policies: Stream Predictor, which uses only immediate remote access history, and Hindsight Migrate, which tracks instruction addresses where task migration is predicted to be beneficial. These predictor policies provide a benefit over execution with no migration for small or moderate-size tasks on our tested applications.
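The cost model described above (local work is free, only data sent over the network counts) can be illustrated with a small sketch. It is not the paper's implementation. The trace format, the fixed `task_size` charged per migration, and both function names are assumptions made for illustration. The sketch compares the bytes transferred with no migration against the minimum achievable when the whole access trace is known in advance, computed by dynamic programming over the task's possible locations.

```python
from typing import Dict, List, Tuple

# Hypothetical trace entry: (node that holds the data, bytes accessed).
Access = Tuple[int, int]


def no_migration_cost(trace: List[Access], home: int) -> int:
    """Bytes transferred if the task never leaves its home node."""
    return sum(size for node, size in trace if node != home)


def best_case_migration_cost(trace: List[Access], home: int, task_size: int) -> float:
    """Minimum bytes transferred when the full trace is known in advance.

    Assumption: each migration costs a fixed `task_size` bytes, and an
    access costs its size only when the data is on a different node.
    """
    nodes = {home} | {node for node, _ in trace}
    inf = float("inf")
    # best[loc] = cheapest way to have the task at `loc` after the accesses so far.
    best: Dict[int, float] = {n: (0 if n == home else inf) for n in nodes}
    for node, size in trace:
        cheapest = min(best.values())
        best = {
            # Either stay at `loc`, or migrate there from the cheapest location.
            loc: min(best[loc], cheapest + task_size) + (0 if loc == node else size)
            for loc in nodes
        }
    return min(best.values())


if __name__ == "__main__":
    # Five 10-byte accesses to node 1, then two to node 0; the task starts on node 0.
    trace = [(1, 10)] * 5 + [(0, 10)] * 2
    print(no_migration_cost(trace, home=0))
    print(best_case_migration_cost(trace, home=0, task_size=8))
```

In this toy trace, staying at home transfers 50 bytes, while migrating to node 1 and back transfers 16 bytes. As `task_size` grows, the best-case schedule falls back to never migrating, which is consistent with the abstract's observation that the benefit is limited to small or moderate-size tasks.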