Current trends in high performance computing suggest that users will soon have widespread access to clusters of multiprocessors with hundreds, if not thousands, of processors. This unprecedented degree of parallelism will undoubtedly expose scalability limitations in existing applications, where scalability is the ability of a parallel algorithm on a parallel architecture to effectively utilize an increasing number of processors. Users will need precise and automated techniques for detecting the cause of limited scalability. This paper addresses that need. First, we argue that users face numerous challenges in understanding application scalability: managing substantial amounts of experiment data, extracting useful trends from this data, and reconciling performance information with their application's design. Second, we propose automating this data analysis by applying fundamental statistical techniques to scalability experiment data. Finally, we evaluate our operational prototype on several applications and show that statistical techniques offer an effective strategy for assessing application scalability. In particular, we find that the non-parametric correlation between the number of tasks and the ratio of the time spent in a given communication operation to the overall communication time provides a reliable measure for identifying communication operations that scale poorly.
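The correlation measure described above can be illustrated with a minimal sketch. The abstract does not say which non-parametric statistic is used, so Spearman's rank correlation is assumed here. The function names, the data layout, and the timing values are all hypothetical and are not taken from the paper's prototype. For each communication operation, the sketch computes the operation's share of total communication time at every task count and then rank-correlates that share with the task count.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def scaling_scores(task_counts, op_times):
    """Correlate task count with each operation's share of communication time.

    op_times maps an operation name to its measured time at each task count
    (same order as task_counts). This layout is an assumption for the sketch.
    """
    totals = [
        sum(times[i] for times in op_times.values())
        for i in range(len(task_counts))
    ]
    return {
        op: spearman(task_counts, [t / total for t, total in zip(times, totals)])
        for op, times in op_times.items()
    }


# Hypothetical timings (seconds) from five runs at increasing task counts.
task_counts = [16, 32, 64, 128, 256]
op_times = {
    "MPI_Allreduce": [1.0, 2.0, 4.5, 9.0, 20.0],
    "MPI_Send": [4.0, 4.1, 4.2, 4.0, 4.1],
}

for op, rho in scaling_scores(task_counts, op_times).items():
    print(f"{op}: rho = {rho:+.2f}")
```

In this made-up data set the share of communication time taken by `MPI_Allreduce` rises with every increase in task count, so its coefficient comes out close to +1 and the operation would be flagged as scaling poorly, while the share taken by `MPI_Send` falls steadily and its coefficient comes out close to -1. A rank-based statistic is a natural fit here because it only assumes a monotonic trend, not a linear one.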