Obtaining useful tracefiles of message-passing applications for performance analysis on supercomputers is a long and tedious task. When hundreds or thousands of processors are used, a tracefile can grow to 10 or 20 GB. Analyzing, or even storing, traces of this size is clearly a problem. The methodology we have developed and implemented performs an automatic analysis that can be applied to huge tracefiles: it extracts the internal structure of the trace and selects its meaningful parts. The paper presents the methodology and the results we have obtained from real applications.
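
As a rough illustration of what extracting the structure of a trace and selecting a meaningful part could look like, the sketch below assumes the trace has already been reduced to a one-dimensional signal (for example, the number of communication events per time bin) and that the application is iterative. It uses plain autocorrelation to estimate the iteration period and then picks one full iteration as the representative region. Autocorrelation is a stand-in chosen for this example; the abstract does not state which analysis technique the methodology uses, and the names `autocorrelation`, `dominant_period`, and `select_representative` are hypothetical.

```python
def autocorrelation(signal, lag):
    """Biased, normalized autocorrelation of `signal` at a given lag."""
    n = len(signal)
    mean = sum(signal) / n
    den = sum((x - mean) ** 2 for x in signal)
    if den == 0:
        return 0.0  # constant signal: no structure to detect
    num = sum((signal[i] - mean) * (signal[i + lag] - mean)
              for i in range(n - lag))
    return num / den


def dominant_period(signal, min_lag=2):
    """Return the lag with the highest positive autocorrelation, or None.

    Because the biased estimator shrinks as the lag grows, the
    fundamental period scores higher than its multiples.
    """
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, len(signal) // 2 + 1):
        score = autocorrelation(signal, lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def select_representative(signal, period, skip=1):
    """Return (start, end) bin indices of one iteration.

    `skip` iterations are passed over first, on the assumption
    (made for this sketch) that start-up behavior is atypical.
    """
    start = skip * period
    end = start + period
    if end > len(signal):
        raise ValueError("signal too short for the requested iteration")
    return start, end


if __name__ == "__main__":
    # Synthetic signal: a burst of 5 events followed by three quiet bins, repeated.
    events_per_bin = [5, 1, 1, 1] * 6
    period = dominant_period(events_per_bin)
    print(period, select_representative(events_per_bin, period))
```

Only the bins inside the returned range would then need to be kept for detailed analysis, which is the sense in which a small, meaningful part of a very large tracefile is selected.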