The authors evaluate design choices for multiprocessors built around a single shared-bus interconnect. The target is a throughput-oriented environment: each task executes on a single processor, and the performance of the multiprocessor is measured by its overall throughput. To evaluate the design choices, they develop analytical models based on mean value analysis (MVA) and validate them by comparing their results against those of a trace-driven simulation for 5376 multiprocessor configurations. The trace-driven simulation uses actual programs and simulates their execution in a throughput-oriented environment. The findings are: (1) the cache block sizes that yield the best multiprocessor performance differ from the block sizes that yield the best uniprocessor performance metrics, (2) a larger cache set associativity might be warranted in a multiprocessor even though it might not be warranted in a uniprocessor, (3) a split-transaction, pipelined bus yields much higher multiprocessor throughput than a circuit-switched bus, especially for larger main memory latencies, and (4) increasing the bus width appears to be an effective way of improving multiprocessor throughput.
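To make the modeling approach concrete, the following is a minimal sketch of textbook exact mean value analysis for a closed network with one delay center (processors computing between bus requests) and one queueing center (the shared bus). It is not the authors' model, which is not given here. The function name `mva_shared_bus`, its parameters, and the cycle counts in the usage lines are all hypothetical, chosen only for illustration.

```python
def mva_shared_bus(n_processors, think_time, bus_service_time):
    """Exact MVA for a closed network: one delay center plus one bus queue.

    Hypothetical parameters (not taken from the paper):
      n_processors     -- number of processors sharing the bus (>= 1)
      think_time       -- mean cycles a processor computes between bus requests
      bus_service_time -- mean cycles the bus is held per request

    Returns (throughput, bus_response_time, bus_utilization), where
    throughput is bus requests completed per cycle across all processors.
    """
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")

    queue_len = 0.0  # mean number of requests at the bus with n - 1 customers
    throughput = 0.0
    response = 0.0
    for n in range(1, n_processors + 1):
        # Arrival theorem: an arriving request sees the mean queue length
        # of the same network with one customer fewer.
        response = bus_service_time * (1.0 + queue_len)
        # Little's law applied to the whole cycle (compute, then bus access).
        throughput = n / (think_time + response)
        # Little's law applied to the bus alone.
        queue_len = throughput * response

    utilization = throughput * bus_service_time
    return throughput, response, utilization


if __name__ == "__main__":
    # Illustrative comparison: halving the bus service time stands in for a
    # wider bus that moves a cache block in half as many cycles.
    for service in (10.0, 5.0):
        x, r, u = mva_shared_bus(16, think_time=90.0, bus_service_time=service)
        print(f"service={service:4.1f}  throughput={x:.4f}  "
              f"response={r:.1f}  utilization={u:.2f}")
```

In this sketch, throughput can never exceed `1 / bus_service_time`, so once the bus saturates, adding processors stops helping while shortening the bus service time still does. That is one simple way to see why bus width and bus protocol can matter so much in a shared-bus design.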