Time skewing made simple

Authors:
Robert Strzodka;Mohammed Shaheen;Dawid Pajak
Affiliations:
Max Planck Institut Informatik, Saarbrücken, Germany;Max Planck Institut Informatik, Saarbrücken, Germany;Pomeranian University of Technology, Szczecin, Poland
Venue:
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Year:
2011

Abstract

Time skewing and loop tiling have long been known to be highly beneficial acceleration techniques for nested loops, especially on bandwidth-hungry multi-core processors. They are nevertheless little used in practice: efficient implementations rely on complicated code, while simple or abstract ones show much smaller gains over naive nested loops. We break this dilemma with an essential time skewing scheme that is both compact and fast.
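
For readers unfamiliar with the general idea, the following is a minimal sketch of time skewing on a 1D three-point averaging stencil with fixed boundary values. It is not the scheme proposed in the paper. The stencil, the function names `stencil_naive` and `stencil_time_skewed`, and the `tile` parameter are all assumptions chosen for illustration. The sketch only shows the reordered iteration space: tiles are formed over the skewed coordinate `x + s`, so that each tile advances through all time steps while its data is still in cache. In pure Python no speedup should be expected.

```python
def stencil_naive(u0, steps):
    """Reference version: sweep the whole array once per time step."""
    cur, nxt = list(u0), list(u0)
    n = len(cur)
    for _ in range(steps):
        for x in range(1, n - 1):
            nxt[x] = (cur[x - 1] + cur[x] + cur[x + 1]) / 3.0
        cur, nxt = nxt, cur
    return cur


def stencil_time_skewed(u0, steps, tile=8):
    """Illustrative time-skewed version (hypothetical, not the paper's code).

    Tiles are taken over the skewed coordinate xs = x + s, where s is the
    time step being produced. Every value read by update (s, x) has a
    skewed coordinate <= x + s and an earlier time step, so processing
    tiles left to right, and time steps in order inside a tile, respects
    all dependencies, including those of the two-buffer scheme.
    """
    n = len(u0)
    buf = [list(u0), list(u0)]  # buf[s % 2] holds values of time step s
    lo = 1 + 1                  # smallest xs: x = 1, s = 1
    hi = (n - 2) + steps        # largest xs: x = n - 2, s = steps
    for start in range(lo, hi + 1, tile):
        end = min(start + tile, hi + 1)
        for s in range(1, steps + 1):
            src, dst = buf[(s - 1) % 2], buf[s % 2]
            # Map the skewed tile back to array indices, clipped to the interior.
            x_begin = max(1, start - s)
            x_end = min(n - 1, end - s)
            for x in range(x_begin, x_end):
                dst[x] = (src[x - 1] + src[x] + src[x + 1]) / 3.0
    return buf[steps % 2]


if __name__ == "__main__":
    data = [float(i * i % 7) for i in range(37)]
    assert stencil_time_skewed(data, 5, tile=4) == stencil_naive(data, 5)
```

Both functions perform the same arithmetic per grid point in the same operand order, so their results should agree exactly; only the traversal order differs.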