Processors with write-through caches typically require a write buffer to hide the write latency to the next level of the memory hierarchy and to reduce write traffic. A write buffer can cause processor stalls in three situations: when it is full, when it contends with a cache miss for access to the next level of the hierarchy, and when it contains the freshest copy of data needed by a load. This paper uses instruction-level simulation of SPEC92 benchmarks to investigate how different write buffer depths, retirement policies, and load-hazard policies affect these three types of write-buffer stalls. Deeper buffers with adequate headroom, lazier retirement policies, and the ability to read data directly from the write buffer combine to substantially reduce write-buffer-induced stalls.
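The three stall types can be made concrete with a small cycle-counting model. The sketch below is a toy illustration only, not the paper's simulator: the class `WriteBufferModel`, the helper `run`, the parameter names (`depth`, `retire_at`, `read_from_wb`, `l2_latency`), and the trace format are all hypothetical. It assumes a FIFO buffer without write merging, a fixed next-level latency, a flush-everything policy when a load hazard is not served from the buffer, and priority for load misses once an in-flight write completes.

```python
from collections import deque


class WriteBufferModel:
    """Toy FIFO write buffer (hypothetical model, no write merging)."""

    def __init__(self, depth, retire_at, read_from_wb, l2_latency=4):
        assert 1 <= retire_at <= depth and l2_latency >= 1
        self.depth = depth                # number of buffer entries
        self.retire_at = retire_at        # occupancy that triggers retirement
        self.read_from_wb = read_from_wb  # True: loads may read buffered data
        self.l2_latency = l2_latency      # cycles per next-level access
        self.entries = deque()            # buffered line addresses, oldest first
        self.retiring = 0                 # cycles left on the in-flight write
        self.cycles = 0
        self.stalls = {"full": 0, "load_service": 0, "load_hazard": 0}

    def tick(self, force=False):
        """Advance one cycle, starting or progressing a retirement."""
        if self.retiring == 0 and self.entries and (
            force or len(self.entries) >= self.retire_at
        ):
            self.retiring = self.l2_latency
        if self.retiring > 0:
            self.retiring -= 1
            if self.retiring == 0:
                self.entries.popleft()    # oldest entry reached the next level
        self.cycles += 1

    def store(self, line):
        while len(self.entries) >= self.depth:
            self.stalls["full"] += 1      # buffer-full stall
            self.tick()
        self.entries.append(line)
        self.tick()

    def load(self, line, l1_hit):
        if l1_hit:
            self.tick()
            return
        if line in self.entries:
            if self.read_from_wb:
                self.tick()               # serviced directly from the buffer
                return
            while self.entries:           # flush the whole buffer first
                self.stalls["load_hazard"] += 1
                self.tick(force=True)
        while self.retiring > 0:          # next-level port busy with a write
            self.stalls["load_service"] += 1
            self.tick()
        self.cycles += self.l2_latency    # the miss itself; no retirement here


def run(trace, **config):
    """Replay ("st", line) and ("ld", line, l1_hit) operations; return stalls."""
    wb = WriteBufferModel(**config)
    for op in trace:
        if op[0] == "st":
            wb.store(op[1])
        else:
            wb.load(op[1], op[2])
    return wb.stalls


if __name__ == "__main__":
    burst = [("st", i) for i in range(8)]
    print(run(burst, depth=2, retire_at=1, read_from_wb=True))
    print(run(burst, depth=8, retire_at=1, read_from_wb=True))
```

Under these assumptions, raising `depth` removes buffer-full stalls on a store burst, raising `retire_at` (lazier retirement) keeps the next-level port free for a following load miss, and setting `read_from_wb=True` eliminates load-hazard stalls. The trends mirror the abstract's qualitative conclusions, but the numbers come from this toy model and say nothing about the SPEC92 results.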