Parallel algorithms and architectures for rule-based systems
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
ATUM: a new technique for capturing address traces using microcode
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Analysis of cache performance for operating systems and multiprogramming
Analysis of cache performance for operating systems and multiprogramming
ACM Computing Surveys (CSUR)
Using cache memory to reduce processor-memory traffic
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Cache performance of operating system and multiprogramming workloads
ACM Transactions on Computer Systems (TOCS)
An evaluation of directory schemes for cache coherence
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Evaluating the performance of software cache coherence
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Analysis of cache invalidation patterns in multiprocessors
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
The effect of sharing on the cache and bus performance of parallel programs
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Organization and performance of a two-level virtual-real cache hierarchy
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
TRAPEDS: producing traces for multicomputers via execution driven simulation
SIGMETRICS '89 Proceedings of the 1989 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
C2MP: a cache-coherent, distributed memory multiprocessor-system
Proceedings of the 1989 ACM/IEEE conference on Supercomputing
Efficient trace-driven simulation method for cache performance analysis
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Techniques for efficient inline tracing on a shared-memory multiprocessor
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Munin: distributed shared memory based on type-specific memory coherence
PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming
Address Tracing for Parallel Machines
Computer - Special issue on experimental research in computer architecture
On-line caching as cache size varies
SODA '91 Proceedings of the second annual ACM-SIAM symposium on Discrete algorithms
Efficient trace-driven simulation methods for cache performance analysis
ACM Transactions on Computer Systems (TOCS)
Page placement algorithms for large real-indexed caches
ACM Transactions on Computer Systems (TOCS)
The effect of page allocation on caches
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Column-associative caches: a technique for reducing the miss rate of direct-mapped caches
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Exploring the design space for a shared-cache multiprocessor
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Trap-driven simulation with Tapeworm II
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Instruction fetching: coping with code bloat
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Trap-driven memory simulation with Tapeworm II
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Trace-driven memory simulation: a survey
ACM Computing Surveys (CSUR)
The selection of optimal cache lines for microprocessor-based controllers
MICRO 23 Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture
An evaluation of directory schemes for cache coherence
25 years of the international symposia on Computer architecture (selected papers)
Adaptive software cache management for distributed shared memory architectures
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
IEEE Transactions on Computers
Trace Factory: Generating Workloads for Trace-Driven Simulation of Shared-Bus Multiprocessors
IEEE Parallel & Distributed Technology: Systems & Technology
An Analysis of Cache Performance for a Hypercube Multicomputer
IEEE Transactions on Parallel and Distributed Systems
Performance Tradeoffs in Multithreaded Processors
IEEE Transactions on Parallel and Distributed Systems
Benchmarks and Standards for the Evaluation of Parallel Job Schedulers
IPPS/SPDP '99/JSSPP '99 Proceedings of the Job Scheduling Strategies for Parallel Processing
On adequate performance measures for paging
Proceedings of the thirty-eighth annual ACM symposium on Theory of computing
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
On the relative dominance of paging algorithms
Theoretical Computer Science
On the relative dominance of paging algorithms
ISAAC'07 Proceedings of the 18th international conference on Algorithms and computation
Hi-index | 0.00 |
The design of high-performance multiprocessor systems necessitates a careful analysis of the memory system performance of parallel programs. Lacking multiprocessor address traces, previous multiprocessor performance studies using analytical models had to make an inordinate number of assumptions about the underlying memory reference patterns. We previously developed a scheme called ATUM - Address Tracing Using Microcode - to get reliable operating system and multiprogramming traces on single processors. This paper briefly describes the multiprocessor extension of ATUM and its implementation on a VAX 8350 multiprocessor. We also report on our use of the resulting traces to analyze physical versus virtual addressing of large caches, process-identifier hashing in virtual caches, cache interference between multiple processes, cache interference between multiple CPUs, process affinity, and semaphore usage in writeback caches.