Supporting source-level performance analysis of programs written in data-parallel languages requires a unique degree of integration between compilers and performance analysis tools. Compilers for languages such as High Performance Fortran infer parallelism and communication from data distribution directives; as a result, performance tools cannot meaningfully relate measurements of these key aspects of execution performance to source-level constructs without substantial compiler support. This paper describes an integrated system for performance analysis of data-parallel programs based on the Rice Fortran 77D compiler and the Illinois Pablo performance analysis toolkit. During code generation, the Fortran D compiler records mapping information and semantic analysis results describing the relationship between performance instrumentation and the original source program. An integrated performance analysis system based on the Pablo toolkit uses this information to correlate the program's dynamic behavior with the data-parallel source code. The integrated system provides detailed source-level performance feedback to programmers via a pair of graphical interfaces. Our strategy serves as a model for integration of data-parallel compilers and performance tools.
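The correlation step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the compiler emits a table mapping each instrumentation probe to a source line and construct, and that the run-time trace is a list of (probe id, seconds) pairs. The names `SourceSite`, `SITE_MAP`, and `attribute_to_source`, the probe ids, and the construct labels are all hypothetical; none of this is the Fortran D or Pablo interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSite:
    """A source-level construct that a probe was generated for (hypothetical)."""
    line: int
    construct: str


# Hypothetical compiler-recorded mapping: probe id -> originating source site.
# Two probes may map to the same site, e.g. a send and a receive that the
# compiler generated for one data-parallel loop.
SITE_MAP = {
    101: SourceSite(12, "FORALL over A(i)"),
    102: SourceSite(12, "FORALL over A(i)"),
    103: SourceSite(27, "reduction SUM(B)"),
}

UNMAPPED = SourceSite(0, "<unmapped>")


def attribute_to_source(events, site_map):
    """Sum measured time per source site using the compiler's mapping.

    events: iterable of (probe_id, seconds) pairs from an instrumented run.
    Events whose probe id is not in the mapping are collected under UNMAPPED.
    """
    totals = defaultdict(float)
    for probe_id, seconds in events:
        totals[site_map.get(probe_id, UNMAPPED)] += seconds
    return dict(totals)


# Made-up trace data, for illustration only.
events = [(101, 0.5), (102, 0.25), (103, 1.0), (101, 0.25), (999, 0.125)]
report = attribute_to_source(events, SITE_MAP)

for site, seconds in sorted(report.items(), key=lambda kv: -kv[1]):
    print(f"line {site.line:>3}  {site.construct:<20}  {seconds:.3f} s")
```

The point of the sketch is that the trace alone identifies only probes; the compiler-supplied mapping is what lets measurements be reported against source constructs.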