An Analytical Model for Designing Memory Hierarchies

Authors:
Bruce L. Jacob; Peter M. Chen; Seth R. Silverman; Trevor N. Mudge
Venue:
IEEE Transactions on Computers
Year:
1996

Abstract

Memory hierarchies have long been studied by many means: system building, trace-driven simulation, and mathematical analysis. Yet little help is available for the system designer wishing to quickly size the different levels in a memory hierarchy to a first-order approximation. In this paper, we present a simple analysis for providing this practical help and some unexpected results and intuition that come out of the analysis. By applying a specific, parameterized model of workload locality, we are able to derive a closed-form solution for the optimal size of each hierarchy level. We verify the accuracy of this solution against exhaustive simulation with two case studies: a three-level I/O storage hierarchy and a three-level processor-cache hierarchy. In all but one case, the configuration recommended by the model performs within 5% of optimal. One result of our analysis is that the first place to spend money is the cheapest (rather than the fastest) cache level, particularly with small system budgets. Another is that money spent on an n-level hierarchy is spent in a fixed proportion until another level is added.
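The budget-splitting question in the abstract can be illustrated with a small brute-force search. This is only a sketch under stated assumptions, not the paper's model or its closed-form solution: the power-law miss-ratio curve, the parameter names `alpha` and `beta`, the per-unit costs, and the access times are all hypothetical values chosen for illustration.

```python
def miss_ratio(size, alpha=0.5, beta=1.0):
    """Assumed power-law locality curve (hypothetical, not from the paper):
    fraction of references that miss in a level holding `size` units."""
    return (size / beta + 1.0) ** (-alpha)


def avg_access_time(sizes, hit_times, backing_time, alpha=0.5, beta=1.0):
    """Average access time of a hierarchy, fastest level listed first.
    Every reference that reaches a level pays that level's access time."""
    total = 0.0
    reach = 1.0  # fraction of references that reach the current level
    for size, hit_time in zip(sizes, hit_times):
        total += reach * hit_time
        # A lower level smaller than the one above it filters nothing extra.
        reach = min(reach, miss_ratio(size, alpha, beta))
    return total + reach * backing_time


def best_split(budget, unit_costs, hit_times, backing_time, steps=100):
    """Exhaustively try two-level budget splits in `steps` equal increments
    and return (time, spend, sizes) for the fastest configuration found."""
    best = None
    for i in range(steps + 1):
        spend = (budget * i / steps, budget * (steps - i) / steps)
        sizes = [s / c for s, c in zip(spend, unit_costs)]
        t = avg_access_time(sizes, hit_times, backing_time)
        if best is None or t < best[0]:
            best = (t, spend, sizes)
    return best


if __name__ == "__main__":
    # Hypothetical numbers: level 1 is fast and costly, level 2 slow and cheap.
    t, spend, sizes = best_split(
        budget=1000.0, unit_costs=(10.0, 1.0),
        hit_times=(1.0, 10.0), backing_time=1000.0,
    )
    print(f"time={t:.2f} spend={spend} sizes={sizes}")
```

Rerunning `best_split` over a range of budgets shows how the preferred split shifts as spending grows under these assumed parameters. An exhaustive sweep like this is the kind of baseline that a closed-form answer is meant to replace.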