Compilation for explicitly managed memory hierarchies
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
CellSs: making it easier to program the cell broadband engine processor
IBM Journal of Research and Development
Larrabee: a many-core x86 architecture for visual computing
ACM SIGGRAPH 2008 papers
International Journal of Parallel Programming
Evaluation of memory performance on the cell BE with the SARC programming model
Proceedings of the 9th workshop on MEmory performance: DEaling with Applications, systems and architecture
Optimizing the use of static buffers for DMA on a CELL chip
LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Extending the OpenMP tasking model to allow dependent tasks
IWOMP'08 Proceedings of the 4th international conference on OpenMP in a new era of parallelism
Hi-index | 0.00 |
Current heterogeneous multicore architectures, including the Cell/B.E., GPUs, and future developments, like Larrabee, require enormous programming efforts to efficiently run current parallel applications, achieving high performance. In this paper, we want to present the results we obtain from the coding with the SARC Programming Model, of two benchmarks, matrix multiply and conjugate gradient (NAS CG), with respect memory bandwidth. We show some sample loops annotated and the experience that we got trying to have our system executing them efficienly. Results indicate that the programming model is able to achieve up to 85% of the peak memory bandwidth on the Cell/B.E. processor.