Estimating interlock and improving balance for pipelined architectures
Journal of Parallel and Distributed Computing
EEL: machine-independent executable editing
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
An integrated compilation and performance analysis environment for data parallel programs
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Integrating performance monitoring and communication in parallel computers
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Waiting time analysis and performance visualization in Carnival
SPDT '96 Proceedings of the SIGMETRICS symposium on Parallel and distributed tools
Nesting of reducible and irreducible loops
ACM Transactions on Programming Languages and Systems (TOPLAS)
Continuous profiling: where have all the cycles gone?
Proceedings of the sixteenth ACM symposium on Operating systems principles
ProfileMe: hardware support for instruction-level profiling on out-of-order processors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Performance analysis using the MIPS R10000 performance counters
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Tools for application-oriented performance tuning
ICS '01 Proceedings of the 15th international conference on Supercomputing
On providing useful information for analyzing and tuning applications
Proceedings of the 2001 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Parallel performance prediction using lost cycles analysis
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Instruction-Level Microprocessor Modeling of Scientific Applications
ISHPC '99 Proceedings of the Second International Symposium on High Performance Computing
An evaluation of global address space languages: co-array fortran and unified parallel C
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Low-overhead call path profiling of unmodified, optimized code
Proceedings of the 19th annual international conference on Supercomputing
PerfExplorer: A Performance Data Mining Framework For Large-Scale Parallel Computing
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
The Tau Parallel Performance System
International Journal of High Performance Computing Applications
Intermediately executed code is the key to find refactorings that improve temporal data locality
Proceedings of the 3rd conference on Computing frontiers
Assessing the potential of hybrid hpc systems for scientific applications: a case study
Proceedings of the 4th international conference on Computing frontiers
Compensation of Measurement Overhead in Parallel Performance Profiling
International Journal of High Performance Computing Applications
Scalability analysis of SPMD codes using expectations
Proceedings of the 21st annual international conference on Supercomputing
Scaling Properties of Common Statistical Operators for Gridded Datasets
International Journal of High Performance Computing Applications
A productivity centered application performance tuning framework
Proceedings of the 2nd international conference on Performance evaluation methodologies and tools
Knowledge support and automation for performance analysis with PerfExplorer 2.0
Scientific Programming - Large-Scale Programming Tools and Environments
Open | SpeedShop: An open source infrastructure for parallel performance analysis
Scientific Programming - Large-Scale Programming Tools and Environments
A component infrastructure for performance and power modeling of parallel scientific applications
Proceedings of the 2008 compFrame/HPC-GECO workshop on Component based high performance
Binary analysis for measurement and attribution of program performance
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Performance Analysis Framework for High-Level Language Applications in Reconfigurable Computing
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
A compiler approach to performance prediction using empirical-based modeling
ICCS'03 Proceedings of the 2003 international conference on Computational science: PartIII
International Journal of High Performance Computing Applications
Automatic performance debugging of SPMD-style parallel programs
Journal of Parallel and Distributed Computing
Reducing the overhead of direct application instrumentation using prior static analysis
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
RDVIS: a tool that visualizes the causes of low locality and hints program optimizations
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part II
Understanding and detecting real-world performance bugs
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Cache Conscious Task Regrouping on Multicore Processors
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Novel views of performance data to analyze large-scale adaptive applications
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Detecting application load imbalance on high end massively parallel systems
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Hi-index | 0.00 |
It is increasingly difficult for complex scientific programs to attain a significant fraction of peak performance on systems that are based on microprocessors with substantial instruction-level parallelism and deep memory hierarchies. Despite this trend, performance analysis and tuning tools are still not used regularly by algorithm and application designers. To a large extent, existing performance tools fail to meet many user needs and are cumbersome to use. To address these issues, we developed HPCVIEW—a toolkit for combining multiple sets of program profile data, correlating the data with source code, and generating a database that can be analyzed anywhere with a commodity Web browser. We argue that HPCVIEW addresses many of the issues that have limited the usability and the utility of most existing tools. We originally built HPCVIEW to facilitate our own work on data layout and optimizing compilers. Now, in addition to daily use within our group, HPCVIEW is being used by several code development teams in DoD and DoE laboratories as well as at NCSA.