The system interconnect is often the performance bottleneck in SMP computers. Although modern SMPs include event counters on processors and interconnects, these provide limited information about the interaction of processors vying for shared resources. Additionally, transaction sources and addresses are not readily available, making analysis of access patterns and data locality difficult. Enhanced system interconnect instrumentation is required to extract this information.

This paper describes instrumentation implemented for monitoring the system interconnect on Sun Fire™ servers. The instrumentation supports sophisticated programmable filtering of event counters, allowing us to construct histograms of system interconnect activity, and a FIFO to capture trace sequences. Our implementation results in a very small hardware footprint, making it appropriate for inclusion in commodity hardware.

We also describe a sampling of software tools and results based on this infrastructure. Applications have included performance profiling, architectural studies, and hardware bring-up and debugging.
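
To make the idea of filtered counters feeding a histogram and a trace FIFO concrete, here is a minimal software sketch. It is not the Sun Fire hardware design, which the abstract does not detail. The `Transaction` fields, the address mask/match filter, the bucketing by address shift, and the drop-oldest FIFO policy are all assumptions chosen for illustration.

```python
from collections import Counter, deque
from typing import NamedTuple


class Transaction(NamedTuple):
    """Hypothetical interconnect transaction; field names are illustrative."""
    source: int   # ID of the requesting agent
    address: int  # physical address of the request


class FilteredMonitor:
    """Software model of a programmable filter feeding counters and a trace FIFO."""

    def __init__(self, mask, match, bucket_shift, fifo_depth):
        self.mask = mask                  # address bits the filter compares (assumed scheme)
        self.match = match                # value those bits must equal
        self.bucket_shift = bucket_shift  # low address bits dropped to form a bucket
        self.histogram = Counter()        # bucket -> number of matching transactions
        # Assumption: a full FIFO discards its oldest entry; real hardware
        # might instead stop capturing when full.
        self.trace = deque(maxlen=fifo_depth)

    def observe(self, txn):
        """Count and record txn if it passes the filter; return whether it did."""
        if (txn.address & self.mask) != self.match:
            return False
        self.histogram[txn.address >> self.bucket_shift] += 1
        self.trace.append(txn)
        return True


# Count only addresses in the 0x2000-0x2FFF range, in 256-byte buckets.
monitor = FilteredMonitor(mask=0xF000, match=0x2000, bucket_shift=8, fifo_depth=2)
for txn in [
    Transaction(source=0, address=0x2010),
    Transaction(source=1, address=0x20F0),
    Transaction(source=0, address=0x2100),
    Transaction(source=2, address=0x3000),  # rejected by the filter
]:
    monitor.observe(txn)
```

In this sketch, sweeping `match` across address ranges (or filtering on `source` instead) would yield one histogram per range or per requester, which is the kind of access-pattern and locality view the abstract says plain event counters cannot provide.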