Page table management in local/remote architectures
ICS '88 Proceedings of the 2nd international conference on Supercomputing
Memory-reference characteristics of multiprocessor applications under MACH
SIGMETRICS '88 Proceedings of the 1988 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Characterizing the synchronization behavior of parallel programs
PPEALS '88 Proceedings of the ACM/SIGPLAN conference on Parallel programming: experience with applications, languages and systems
Recoverable Distributed Shared Virtual Memory
IEEE Transactions on Computers
Parallel program behavioral study on a shared-memory multiprocessor
ICS '91 Proceedings of the 5th international conference on Supercomputing
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Traffic studies of unbuffered Delta networks
IBM Journal of Research and Development
Architectural requirements of parallel scientific applications with explicit communication
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Accuracy of Memory Reference Traces of Parallel Computations in Trace-Drive Simulation
IEEE Transactions on Parallel and Distributed Systems
Quantifying Locality In The Memory Access Patterns of HPC Applications
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Effectively Utilizing Global Cluster Memory for Large Data-Intensive Parallel Programs
IEEE Transactions on Parallel and Distributed Systems
Predicting locality phases for dynamic memory optimization
Journal of Parallel and Distributed Computing
Characteristics of workloads used in high performance and technical computing
Proceedings of the 21st annual international conference on Supercomputing
Analysis of input-dependent program behavior using active profiling
Proceedings of the 2007 workshop on Experimental computer science
Analysis of input-dependent program behavior using active profiling
ecs'07 Experimental computer science on Experimental computer science
Program phase detection and exploitation
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Hi-index | 0.00 |
A parallel simulator, PSIMUL, has been used to collect information on the memory access patterns and synchronization overheads of several scientific applications. The parallel simulation method we use is very efficient and it allows us to simulate execution of an entire application program, amounting to hundreds of millions of instructions. We present our measurements on the memory access characteristics of these applications; particularly our observations on shared and private data, their frequency of access and locality. We have found that, even though the shared data comprise the largest portion of the data in the application program, on the average a small fraction of the memory references are to shared data. The low averages do not preclude bursts of traffic to shared memory nor does it rule out positive benefits from caching shared data. We also discuss issues of synchronization overheads and their effect on performance.