Characterizing the communication behavior of parallel programs through tracing can help in understanding an application's characteristics, modeling its performance, and predicting its behavior on future systems. However, lossless communication traces can become prohibitively large, causing programmers to resort to a variety of other techniques. In this paper, we present a novel approach to lossless communication trace compression. We augment the Sequitur compression algorithm so that it can be applied to the communication traces of parallel programs. We present optimizations that reduce the memory overhead, reduce the size of the generated trace files, and enable compression across multiple processes in a parallel program. The evaluation shows improved compression and reduced overhead compared with other approaches, with up to 3 orders of magnitude improvement for the NAS MG benchmark. We also observe that, unlike with existing schemes, the trace file sizes and the memory overhead incurred are less sensitive to, if not independent of, the problem size for the NAS benchmarks.
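To illustrate why grammar-based compression suits repetitive communication traces, the sketch below compresses a sequence of event names by repeatedly replacing the most frequent adjacent pair with a new grammar rule, then expands the grammar to recover the original sequence exactly. This is a simplified offline pair-substitution scheme (in the style of Re-Pair) used as a stand-in; it is not Sequitur's online algorithm and not the implementation evaluated in the paper. The names `Rule`, `compress`, and `expand`, and the sample event strings, are hypothetical.

```python
from collections import Counter


class Rule:
    """Non-terminal symbol; a distinct type so it never collides with an event."""

    def __init__(self, idx):
        self.idx = idx

    def __repr__(self):
        return f"R{self.idx}"


def compress(events):
    """Replace the most frequent adjacent pair with a rule until no pair repeats.

    Returns (top-level sequence, rules), where rules maps Rule -> (left, right).
    Assumption: an offline, simplified stand-in for grammar inference.
    """
    seq = list(events)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        rule = Rule(len(rules))
        rules[rule] = pair
        # Scan left to right, substituting non-overlapping occurrences.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(rule)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Expand every rule back into events; the result equals the original trace."""
    out = []
    stack = list(reversed(seq))
    while stack:
        sym = stack.pop()
        if isinstance(sym, Rule):
            left, right = rules[sym]
            stack.append(right)  # pushed first so that left is popped first
            stack.append(left)
        else:
            out.append(sym)
    return out


# Hypothetical trace: a loop body of three communication calls, repeated.
trace = ["MPI_Send", "MPI_Recv", "MPI_Barrier"] * 100
top, rules = compress(trace)
restored = expand(top, rules)
```

Because each rule stores only two symbols, the grammar for a loop-dominated trace grows far more slowly than the trace itself, while expansion still reproduces every event in order.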