Processor speed is increasing exponentially, while memory speed is improving comparatively slowly. As a result, the overall performance of a computing system is increasingly constrained by memory performance. This paper describes an approach for improving the cache hit ratio and thereby the efficiency of the memory system. The approach is based on a set of performance tools that present the cache problems, their causes, and ways to resolve them.