This paper proposes SC++lite, a sequentially consistent system that relaxes memory order speculatively to bridge the performance gap among memory consistency models. Prior proposals to speculatively relax memory order require large custom on-chip storage to maintain a history of speculative processor and memory state while memory order is relaxed. SC++lite uses the memory hierarchy to store the speculative history, providing a scalable path for speculative SC systems across a wide range of applications and system latencies. We use cycle-accurate simulation of shared-memory multiprocessors to show that SC++lite can fully relax memory order while virtually obviating the need for custom on-chip storage. Moreover, while demand for storage increases significantly with larger memory latencies, SC++lite's ability to relax memory order remains insensitive to memory latency. An SC++lite system can improve performance over a base SC system by 28% with only 2KB of custom storage in a system with 16 processors. In contrast, speculative SC systems with custom storage require 51KB of storage to improve performance by 31% over a base SC system.