High computational demands are one of the main reasons for using parallel architectures such as clusters of PCs. Many parallel programs, however, suffer from severe inefficiencies when executed on such loosely coupled architectures. One of the most important causes is frequent access to remote memory. In this article, we present a hybrid event-driven monitoring system that uses a hardware monitor to observe all underlying transactions on the network and to deliver information about the run-time behavior of parallel programs to tools for performance analysis and debugging. The monitoring system targets cluster architectures with NUMA characteristics.