This paper presents a new large-scale multiprocessor architecture built from hierarchies of shared buses and caches. Extended versions of shared-bus multicache coherency protocols maintain coherency among all caches in the system. After explaining the basic operation of the strict hierarchical approach, the paper introduces a clustered system that distributes the memory among groups of processors. Simulation results demonstrate that the additional coherency protocol overhead introduced by the clustered approach is small. The simulations also show that a 128-processor multiprocessor built with this architecture can achieve a substantial fraction of its peak performance. Finally, an analytic model is used to explore systems too large to simulate with the available hardware. The model indicates that a system delivering over 1000 usable MIPS can be constructed using high-performance microprocessors.
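As a rough illustration of how a hierarchy of caches might keep copies coherent, the following Python sketch models a tree of caches with a write-invalidate policy. It is not the protocol from the paper: the `Cache` class, the inclusion assumption (each higher-level cache tracks every line held beneath it), and the invalidation count are all hypothetical choices made for this example.

```python
class Cache:
    """One cache in a tree; leaves stand in for per-processor caches (assumed model)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.lines = set()  # addresses currently held at this level
        if parent is not None:
            parent.children.append(self)

    def read(self, addr):
        # Assumed inclusion: a line held by a leaf is also recorded in every ancestor.
        node = self
        while node is not None:
            node.lines.add(addr)
            node = node.parent

    def _invalidate_subtree(self, addr):
        # A subtree whose top cache lacks the line cannot hold it below, so skip it.
        if addr not in self.lines:
            return 0
        self.lines.discard(addr)
        return 1 + sum(child._invalidate_subtree(addr) for child in self.children)

    def write(self, addr):
        # Take a copy along the path to the root, then invalidate all other copies.
        self.read(addr)
        invalidated = 0
        node = self
        while node.parent is not None:
            for sibling in node.parent.children:
                if sibling is not node:
                    invalidated += sibling._invalidate_subtree(addr)
            node = node.parent
        return invalidated  # number of cache copies removed


# Hypothetical two-cluster system with two processors per cluster.
root = Cache("top")
c0, c1 = Cache("cluster0", root), Cache("cluster1", root)
p0, p1 = Cache("p0", c0), Cache("p1", c0)
p2, p3 = Cache("p2", c1), Cache("p3", c1)

for reader in (p0, p2, p3):
    reader.read(0x40)
first_write = p1.write(0x40)   # removes copies in p0, cluster1, p2, p3
second_write = p1.write(0x40)  # no other copies remain
```

Under these assumptions, the second write causes no invalidations at all, because each higher-level cache filters traffic for lines that no cache beneath it holds.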