To study the performance of scheduling algorithms, simulators of parallel and distributed applications need accurate models of the application's behavior during execution. Building such models requires traces of low-level events collected during the actual execution of real applications. Collecting these traces is difficult because of timing issues, the interference of instrumentation code, and the storage and transfer of the collected data. To address this problem, we propose a comprehensive software architecture that instruments the application's executables, gathers the traces hierarchically, and post-processes them to feed simulation models. We designed it to be scalable, modular, and extensible.
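As a rough illustration of what gathering traces hierarchically could look like, the sketch below merges per-process event traces level by level through a tree of aggregators. It is not the architecture described above: the `Event` record, the `fan_in` parameter, and both function names are hypothetical, and the sketch assumes each per-process trace is already sorted by timestamp and that clocks are synchronized across processes.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    """Hypothetical low-level trace event (fields are assumptions)."""
    timestamp: float  # time of the event, assumed comparable across processes
    rank: int         # id of the process that emitted the event
    name: str         # event kind, e.g. "send", "recv", "compute"


def merge_traces(traces: Iterable[List[Event]]) -> List[Event]:
    """Merge several timestamp-sorted traces into one sorted trace."""
    # heapq.merge performs a k-way merge of already-sorted inputs.
    return list(heapq.merge(*traces))


def gather_hierarchically(traces: List[List[Event]], fan_in: int = 2) -> List[Event]:
    """Merge traces level by level; each aggregator handles up to `fan_in` inputs."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    if not traces:
        return []
    level = traces
    while len(level) > 1:
        # One pass of the loop corresponds to one level of the aggregation tree.
        level = [merge_traces(level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
    return level[0]


if __name__ == "__main__":
    per_process = [
        [Event(0.1, 0, "send"), Event(0.4, 0, "compute")],
        [Event(0.2, 1, "recv")],
        [Event(0.3, 2, "compute")],
    ]
    for event in gather_hierarchically(per_process):
        print(event)
```

With a tree of aggregators, no single node has to receive every process's trace directly, which is one plausible reason to gather hierarchically when scalability is a design goal. In a real deployment each aggregator would run on a separate node and stream data rather than hold whole traces in memory.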