We used memory reference traces from a DEC Ultrix system running the X11 window system from MIT Project Athena and several freely available X11 applications to measure different aspects of memory system behavior and performance. Our measurements show that the memory behavior of X11 workloads differs in several important ways from that of the workloads traditionally used in cache performance studies. User instruction cache behavior is a major component of overall memory system delays, with significant competition both within and between address spaces. User TLB miss rates are up to a factor of two higher than those of other ill-behaved integer workloads. Write-buffer stalls, data cache behavior, and uncached memory reads can be problematic for microbenchmarks, but they are not an issue for the realistic applications we tested.