Analytical analysis of finite cache penalty and cycles per instruction of a multiprocessor memory hierarchy using miss rates and queuing theory

Authors:
R. E. Matick; T. J. Heller; M. Ignatowski
Affiliations:
IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, New York; IBM Server Group, Poughkeepsie, New York; IBM Server Group, Poughkeepsie, New York
Venue:
IBM Journal of Research and Development
Year:
2001

Abstract

Advances in technology have steadily improved processor speed and the capacity of attached main memory. The widening gap between main-memory and processor cycle times has required ever more levels of caching to prevent performance degradation. As a result, the inherent delay of the memory hierarchy is becoming the major performance-determining factor in any computing system, and it has inspired many kinds of analysis methods. Although an accurate performance-evaluation tool requires trace-driven simulation, analytical models can give good approximations and significant insight: they evaluate finite cache penalties from miss rates (or miss ratios) and queuing theory, combined with empirical relations between the levels of a memory hierarchy. Such tools make it easy to determine how performance trends with changes in input parameters. This paper describes one such analysis approach, which has been implemented in a spreadsheet and used successfully to make early engineering tradeoffs for many uniprocessor and multiprocessor memory hierarchies.
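
To make the idea of a finite cache penalty concrete, the following is a minimal sketch of how miss rates and a queuing delay can be combined into a cycles-per-instruction (CPI) estimate. It is not the authors' spreadsheet model. The names `Level`, `mm1_wait`, and `cpi` are hypothetical, the M/M/1 (open-queue) server is an assumed choice, and every numeric input is illustrative only. The sketch treats CPI as an infinite-cache CPI plus, for each level, misses per instruction times the unloaded service time plus queuing wait. Because the request arrival rate itself depends on CPI, it solves for CPI by fixed-point iteration.

```python
from dataclasses import dataclass


@dataclass
class Level:
    """One shared level of the hierarchy (hypothetical structure)."""
    name: str
    misses_per_instr: float  # requests reaching this level, per instruction
    service_cycles: float    # unloaded service time, in processor cycles


def mm1_wait(arrival_rate: float, service_time: float) -> float:
    """Mean time spent waiting in queue (excluding service) at an M/M/1 server."""
    rho = arrival_rate * service_time  # server utilization
    if rho >= 1.0:
        raise ValueError("utilization >= 1: the server is saturated")
    return rho * service_time / (1.0 - rho)


def cpi(cpi_infinite: float, levels: list, n_processors: int = 1,
        max_iter: int = 100, tol: float = 1e-9) -> float:
    """Estimate CPI = infinite-cache CPI + finite cache penalty.

    The arrival rate at each level depends on CPI, and CPI depends on the
    queuing delay, so the estimate is refined until it stops changing.
    """
    estimate = cpi_infinite
    for _ in range(max_iter):
        penalty = 0.0
        for lv in levels:
            # Requests per cycle arriving from all processors (assumed identical).
            lam = n_processors * lv.misses_per_instr / estimate
            delay = lv.service_cycles + mm1_wait(lam, lv.service_cycles)
            penalty += lv.misses_per_instr * delay
        new_estimate = cpi_infinite + penalty
        if abs(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate


if __name__ == "__main__":
    # Illustrative inputs only: 0.02 misses/instruction, 20-cycle service time.
    hierarchy = [Level("L2", 0.02, 20.0)]
    for n in (1, 2):
        print(n, "processor(s): CPI =", round(cpi(1.0, hierarchy, n_processors=n), 4))
```

Sweeping one input at a time, such as the processor count or a service time, gives the kind of trend curve the abstract refers to. Simple iteration can report saturation on its first pass under heavy load even when a stable solution exists, so a more careful sketch would bracket the solution instead.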