Data locality is one of the most important factors affecting the performance of shared memory applications on NUMA architectures. One way to improve data locality is to specify a suitable data layout in the source code. This kind of optimization, however, requires in-depth knowledge of a program's run-time memory access behavior. Acquiring this knowledge without the probe overhead that software instrumentation would cause requires a hardware performance monitor that provides detailed information about memory transactions. Because the monitored information is usually very low-level and not user-readable, a visualization tool is necessary as well. This paper presents such a visualization tool, which displays the monitored data in a user-understandable way and thereby shows the memory access behavior of shared memory applications. In addition, it projects the physical addresses in the memory transactions back to the data structures within the source code. This improves a programmer's ability to understand, develop, and optimize programs effectively.
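
As an illustration of the projection step, mapping monitored addresses back to source-level data structures can be pictured as a range lookup followed by a per-node count. The following is only a minimal sketch and not the tool's actual implementation: the symbol table, the addresses, the `(node, address)` transaction format, and all function names are hypothetical, and the sketch assumes the monitored addresses have already been translated into the same address space as the symbol table.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical symbol table: (start address, size in bytes, name).
# A real tool would obtain this from debug information or allocation records.
SYMBOLS = [
    (0x1000, 0x400, "grid"),
    (0x2000, 0x100, "boundary"),
]


def build_index(symbols):
    """Sort the address ranges by start address so they can be binary searched."""
    ordered = sorted(symbols)
    starts = [start for start, _, _ in ordered]
    return starts, ordered


def resolve(addr, index):
    """Return the name of the data structure containing addr, or None."""
    starts, ordered = index
    pos = bisect_right(starts, addr) - 1  # last range starting at or before addr
    if pos < 0:
        return None
    start, size, name = ordered[pos]
    return name if addr < start + size else None


def access_histogram(transactions, index):
    """Count accesses per (data structure, accessing node).

    transactions is an iterable of (node, address) pairs, an assumed
    stand-in for the records a hardware monitor would deliver.
    """
    hist = Counter()
    for node, addr in transactions:
        hist[(resolve(addr, index) or "<unknown>", node)] += 1
    return hist
```

A histogram of this kind shows which nodes access which data structure most often, which is the sort of information a programmer would need when choosing a data layout.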