Behavioral characterization of multiprocessor memory systems: a case study
SIGMETRICS '89 Proceedings of the 1989 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Computer Architecture and Parallel Processing
Computer Architecture and Parallel Processing
Behavioral characterization of decoupled access/execute architecture
ICS '91 Proceedings of the 5th international conference on Supercomputing
Dynamic and static load scheduling performance on a NUMA shared memory multiprocessor
ICS '91 Proceedings of the 5th international conference on Supercomputing
Performance Prediction and Evaluation of Parallel Processing on a NUMA Multiprocessor
IEEE Transactions on Software Engineering
Characterizing memory performance in vector multiprocessors
ICS '92 Proceedings of the 6th international conference on Supercomputing
ICS '93 Proceedings of the 7th international conference on Supercomputing
Precise compile-time performance prediction for superscalar-based computers
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Impact of Memory Contention on Dynamic Scheduling on NUMA Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Compiler optimization-space exploration
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
System-level design space exploration for security processor prototyping in analytical approaches
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
The Journal of Supercomputing
Microprocessors & Microsystems
Power-Aware scheduling for parallel security processors with analytical models
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
The technique of “load/store” memory reference modeling derives the performance characteristics of a computer's memory architecture from the behavior of simple sequences of load, store, and nop (null operation) instructions. The resulting database can be used to match load/store templates against algorithm kernels to predict performance, or as a source of data for testing analytical models of the architecture. In this paper we study the BBN GP1000 parallel processing system. We show how to build a subset of the load/store kernels needed to characterize the machine and illustrate the behavior of a simple model based on the data.
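The idea of matching load/store templates against an algorithm kernel can be illustrated with a small sketch. This is not the paper's method or data: the `TEMPLATE_COST` table, its cost values, the greedy longest-match rule, and the function name `predict_cost` are all assumptions made for illustration, and the numbers are not GP1000 measurements.

```python
from typing import Dict, List, Tuple

Template = Tuple[str, ...]

# Hypothetical per-template costs in arbitrary time units.
# These are illustrative placeholders, NOT measured GP1000 data.
TEMPLATE_COST: Dict[Template, float] = {
    ("load",): 4.0,
    ("store",): 3.0,
    ("nop",): 1.0,
    ("load", "load"): 7.0,
    ("load", "store"): 6.0,
    ("load", "load", "store"): 9.0,
}


def predict_cost(ops: List[str], db: Dict[Template, float] = TEMPLATE_COST) -> float:
    """Cover `ops` with templates from `db` and sum their costs.

    Uses a greedy longest-match rule (an assumption of this sketch):
    at each position, the longest template that matches is consumed.
    """
    longest = max(len(t) for t in db)
    total = 0.0
    i = 0
    while i < len(ops):
        # Try the widest window first, then shrink.
        for width in range(min(longest, len(ops) - i), 0, -1):
            window = tuple(ops[i:i + width])
            if window in db:
                total += db[window]
                i += width
                break
        else:
            raise ValueError(f"no template matches op {ops[i]!r} at position {i}")
    return total


# One iteration of a vector-add-like kernel: c[i] = a[i] + b[i]
vector_add = ["load", "load", "store"]
print(predict_cost(vector_add))
print(predict_cost(["load", "nop", "store"]))
```

In a real study the table would be filled from timings of the measured load/store kernels, and the matching rule would need to reflect how the memory system overlaps or serializes neighboring references; the greedy cover here only shows the shape of the lookup.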