Evaluating stream buffers as a secondary cache replacement

  • Authors:
  • S. Palacharla;R. E. Kessler

  • Affiliations:
  • Computer Sciences Department, University of Wisconsin-Madison, Madison, WI;Cray Research, Inc., 900 Lowater Rd., Chippewa Falls, WI

  • Venue:
  • ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
  • Year:
  • 1994

Quantified Score

Hi-index 0.02

Visualization

Abstract

Today's commodity microprocessors require a low latency memory system to achieve high sustained performance. The conventional high-performance memory system provides fast data access via a large secondary cache. But large secondary caches can be expensive, particularly in large-scale parallel systems with many processors (and thus many caches).We evaluate a memory system design that can be both cost-effective as well as provide better performance, particularly for scientific workloads: a single level of (on-chip) cache backed up only by Jouppi's stream buffers [10] and a main memory. This memory system requires very little hardware compared to a large secondary cache and doesn't require modifications to commodity processors. We use trace-driven simulation of fifteen scientific applications from the NAS and PERFECT suites in our evaluation. We present two techniques to enhance the effectiveness of Jouppi's original stream buffers: filtering schemes to reduce their memory bandwidth requirement and a scheme that enables stream buffers to prefetch data being accessed in large strides. Our results show that, for the majority of our benchmarks, stream buffers can attain hit rates that are comparable to typical hit rates of secondary caches. Also, we find that as the data-set size of the scientific workload increases the performance of streams typically improves relative to secondary cache performance, showing that streams are more scalable to large data-set sizes.