In multiprocessor systems, the overheads of interprocessor communication and synchronization have been among the largest obstacles to efficient execution of parallel programs. To reduce these overheads in shared-memory, shared-bus multiprocessors, we have proposed two hardware mechanisms: the Inter-Cache Snoop Control Mechanism (ICSCM), which dynamically switches snoop protocols to improve shared-bus utilization, and the Mechanism for Integrated Synchronization and Communication (MISC), which extends ICSCM to support producer-consumer synchronization efficiently. We have developed an execution-driven multiprocessor simulator to evaluate performance with these mechanisms. Simulation experiments on doacross loops show remarkable speed-ups with the ICSCM/MISC mechanisms. Although the proposed mechanisms were originally implemented on a single shared-bus system, they are easily applicable to clustered multiprocessing systems. The methods used in a clustered system are discussed.
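For readers unfamiliar with the workload, the following is a minimal software sketch of a doacross loop with producer-consumer synchronization between iterations. It is not the ICSCM/MISC hardware and not the authors' simulator; the function name `doacross`, the recurrence `a[i] = a[i-1] + b[i]*b[i]`, and the per-iteration event flags are assumptions chosen purely for illustration.

```python
import threading


def doacross(b, num_workers=4):
    """Compute a[i] = a[i-1] + b[i]*b[i] with iterations spread over threads.

    Hypothetical illustration: each iteration consumes the value produced
    by the previous iteration, which is the cross-iteration dependence
    that characterizes a doacross loop.
    """
    n = len(b)
    a = [0] * n
    # One flag per iteration; setting it "produces" a[i] for iteration i+1.
    ready = [threading.Event() for _ in range(n)]

    def worker(wid):
        # Iterations are assigned to workers cyclically, in increasing order.
        for i in range(wid, n, num_workers):
            local = b[i] * b[i]          # independent part, can overlap
            if i > 0:
                ready[i - 1].wait()      # consumer side: wait for a[i-1]
                a[i] = a[i - 1] + local
            else:
                a[i] = local
            ready[i].set()               # producer side: publish a[i]

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a


if __name__ == "__main__":
    print(doacross([1, 2, 3, 4]))
```

In this sketch every wait/set pair is a communication between two processors. This is the kind of producer-consumer traffic whose cost the abstract says MISC is intended to reduce.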