Reflections on the memory wall
Proceedings of the 1st conference on Computing frontiers
Power Efficient Processor Architecture and The Cell Processor
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Sequoia: programming the memory hierarchy
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Sequoia: programming the memory hierarchy
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Efficient computation of sum-products on GPUs through software-managed cache
Proceedings of the 22nd annual international conference on Supercomputing
A Novel Asynchronous Software Cache Implementation for the Cell-BE Processor
Languages and Compilers for Parallel Computing
BCPL: a tool for compiler writing and system programming
AFIPS '69 (Spring) Proceedings of the May 14-16, 1969, spring joint computer conference
Hera-JVM: a runtime system for heterogeneous multi-core architectures
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
The 48-core SCC Processor: the Programmer's View
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Programming heterogeneous multicore systems using threading building blocks
Euro-Par 2010 Proceedings of the 2010 conference on Parallel processing
Offload – automating code migration to heterogeneous multicore systems
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Automatic analysis of scratch-pad memory code for heterogeneous multicore processors
TACAS'10 Proceedings of the 16th international conference on Tools and Algorithms for the Construction and Analysis of Systems
Hi-index | 0.00 |
Memory architectures need to adapt in order for performance and scalability to be achieved in software for multicore systems. In this paper, we discuss the impact of techniques for scalable memory architectures, especially the use of multiple, non-cache-coherent memory spaces, on the implementation and performance of consumer software. Primarily, we report extensive real-world experience in this area gained by Codeplay Software Ltd., a software tools company working in the area of compilers for video games and GPU software. We discuss the solutions we use to handle variations in memory architecture in consumer software, and the impact such variations have on software development effort and, consequently, development cost. This paper introduces preliminary findings regarding impact on software, in advance of a larger-scale analysis planned over the next few years. The techniques discussed have been employed successfully in the development and optimisation of a shipping AAA cross-platform video game.