The SGI Origin: a ccNUMA highly scalable server
Proceedings of the 24th annual international symposium on Computer architecture
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Accurate and efficient regression modeling for microarchitectural performance and power prediction
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Pinpointing and Exploiting Opportunities for Enhancing Data Reuse
ISPASS '08 Proceedings of the ISPASS 2008 - IEEE International Symposium on Performance Analysis of Systems and software
Evaluation techniques for storage hierarchies
IBM Systems Journal
Accelerating multicore reuse distance analysis with sampling and parallelization
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Discovery of locality-improving refactorings by reuse path analysis
HPCC'06 Proceedings of the Second international conference on High Performance Computing and Communications
RDVIS: a tool that visualizes the causes of low locality and hints program optimizations
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part II
Is reuse distance applicable to data locality analysis on chip multiprocessors?
CC'10/ETAPS'10 Proceedings of the 19th joint European conference on Theory and Practice of Software, international conference on Compiler Construction
Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Hi-index | 0.00 |
The prevalence of multicore architectures has made the performance analysis of multithreaded applications an intriguing area of inquiry. An understanding of locality effects and communication behavior can provide programmers with valuable information about performance bottlenecks and opportunities for optimization. Unfortunately, most performance analyses are architecture dependent, and hence insights gleaned from an application's behavior on one platform may not apply when the application is run on another. In this position paper, we argue that what is needed are architecture independent metrics that characterize the behavior of an application in a system-agnostic manner. Such metrics will allow a program's performance to be analyzed across a range of architectures without incurring the overhead of repeated profiling and analysis. We propose two specific analyses: multicore-aware reuse distance, which captures the locality properties of an application and communication analysis, which exposes the structure of communication in an application. We also discuss a number of applications of these analyses, in the domains of optimization, code restructuring and performance modeling.