Quantitative system performance: computer system analysis using queueing network models
Quantitative system performance: computer system analysis using queueing network models
ATUM: a new technique for capturing address traces using microcode
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Software-controlled caches in the VMP multiprocessor
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
SIGMETRICS '86/PERFORMANCE '86 Proceedings of the 1986 ACM SIGMETRICS joint international conference on Computer performance modelling, measurement and evaluation
Principles of Optimal Page Replacement
Journal of the ACM (JACM)
Performance Analysis of Cache Memories
Journal of the ACM (JACM)
ACM Computing Surveys (CSUR)
Cache Performance in the VAX-11/780
ACM Transactions on Computer Systems (TOCS)
Transient behavior of cache memories
ACM Transactions on Computer Systems (TOCS)
Cold-start vs. warm-start miss ratios
Communications of the ACM
A simple linear model of demand paging performance
Communications of the ACM
The working set model for program behavior
Communications of the ACM
Program Behavior: Models and Measurements
Program Behavior: Models and Measurements
Using cache memory to reduce processor-memory traffic
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
A study of instruction cache organizations and replacement policies
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Experimental evaluation of on-chip microprocessor cache memories
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
Cache hit ratios with geometric task switch intervals
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
Trace compaction using cache filtering with blocking
Trace compaction using cache filtering with blocking
Sequential prefetch strategies for instructions and data
Sequential prefetch strategies for instructions and data
Evaluating Associativity in CPU Caches
IEEE Transactions on Computers
Analysis of multithreaded architectures for parallel computing
SPAA '90 Proceedings of the second annual ACM symposium on Parallel algorithms and architectures
The effect of context switches on cache performance
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Evaluating Design Choices for Shared Bus Multiprocessors in a Throughput-Oriented Environment
IEEE Transactions on Computers
Page placement algorithms for large real-indexed caches
ACM Transactions on Computer Systems (TOCS)
A Model of Workloads and its Use in Miss-Rate Prediction for Fully Associative Caches
IEEE Transactions on Computers
ACM SIGMETRICS Performance Evaluation Review
Column-associative caches: a technique for reducing the miss rate of direct-mapped caches
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Analyzing multiprocessor cache behavior through data reference modeling
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Expected I-cache miss rates via the gap model
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The influence of caches on the performance of heaps
Journal of Experimental Algorithmics (JEA)
Modeling cost/performance of a parallel computer simulator
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Trace-driven memory simulation: a survey
ACM Computing Surveys (CSUR)
Architectural exploration and optimization of local memory in embedded systems
ISSS '97 Proceedings of the 10th international symposium on System synthesis
Combining Trace Sampling with Single Pass Methods for Efficient Cache Simulation
IEEE Transactions on Computers
Performance counters and state sharing annotations: a unified approach to thread locality
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Cache-conscious structure layout
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Cache performance analysis of traversals and random accesses
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Automatic and efficient evaluation of memory hierarchies for embedded systems
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Analytical Modeling of Set-Associative Cache Behavior
IEEE Transactions on Computers
Architectural requirements and scalability of the NAS parallel benchmarks
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Towards a theory of cache-efficient algorithms
SODA '00 Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms
An analytical model of the working-set sizes in decision-support systems
Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
The effect of seance communication on multiprocessing systems
ACM Transactions on Computer Systems (TOCS)
Analytical cache models with applications to cache partitioning
ICS '01 Proceedings of the 15th international conference on Supercomputing
Exact analysis of the cache behavior of nested loops
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Performance prediction for random write reductions: a case study in modeling shared memory programs
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Towards a theory of cache-efficient algorithms
Journal of the ACM (JACM)
Modeling Live and Dead Lines in Cache Memory Systems
IEEE Transactions on Computers
Performance Implications of Tolerating Cache Faults
IEEE Transactions on Computers
Performance Tradeoffs in Multithreaded Processors
IEEE Transactions on Parallel and Distributed Systems
Using Processor-Cache Affinity Information in Shared-Memory Multiprocessor Scheduling
IEEE Transactions on Parallel and Distributed Systems
Analytic Evaluation of Shared-Memory Architectures
IEEE Transactions on Parallel and Distributed Systems
A Blocked All-Pairs Shortest-Path Algorithm
SWAT '00 Proceedings of the 7th Scandinavian Workshop on Algorithm Theory
Trace-Driven Memory Simulation: A Survey
Performance Evaluation: Origins and Directions
On cache memory hierarchy for Chip-Multiprocessor
ACM SIGARCH Computer Architecture News
The impact of extrinsic cache performance on predictability of real-time systems
RTCSA '95 Proceedings of the 2nd International Workshop on Real-Time Computing Systems and Applications
Efficient trace-sampling simulation techniques for cache performance analysis
SS '96 Proceedings of the 29th Annual Simulation Symposium (SS '96)
Highly accurate and efficient evaluation of randomising set index functions
Journal of Systems Architecture: the EUROMICRO Journal
A self-tuning cache architecture for embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
A blocked all-pairs shortest-paths algorithm
Journal of Experimental Algorithmics (JEA)
High level cache simulation for heterogeneous multiprocessors
Proceedings of the 41st annual Design Automation Conference
Measuring the cache interference cost in preemptive real-time systems
Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Adaptive code unloading for resource-constrained JVMs
Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Effectively sharing a cache among threads
Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Balancing design options with Sherpa
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Predicting Cache Space Contention in Utility Computing Servers
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 10 - Volume 11
Comprehensive multiprocessor cache miss rate generation using multivariate models
ACM Transactions on Computer Systems (TOCS)
Architecture based analysis of performance, reliability and security of software systems
Proceedings of the 5th international workshop on Software and performance
A methodology for detailed performance modeling of reduction computations on SMP machines
Performance Evaluation - Performance modelling and evaluation of high-performance parallel and distributed systems
Analysis of scratch-pad and data-cache performance using statistical methods
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
An analytical model for cache replacement policy performance
SIGMETRICS '06/Performance '06 Proceedings of the joint international conference on Measurement and modeling of computer systems
A page fault equation for modeling the effect of memory size
Performance Evaluation
SMA: a self-monitored adaptive cache warm-up scheme for microprocessor simulation
International Journal of Parallel Programming
Comprehensive multivariate extrapolation modeling of multiprocessor cache miss rates
ACM Transactions on Computer Systems (TOCS)
Determining output uncertainty of computer system models
Performance Evaluation
Quantifying software performance, reliability and security: An architecture-based approach
Journal of Systems and Software
CMP cache performance projection: accessibility vs. capacity
ACM SIGARCH Computer Architecture News
Scheduling threads for constructive cache sharing on CMPs
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
Ubiquitous memory introspection
Proceedings of the International Symposium on Code Generation and Optimization
Estimating the output cardinality of partial preaggregation with a measure of clusteredness
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Accurate memory signatures and synthetic address traces for HPC applications
Proceedings of the 22nd annual international conference on Supercomputing
Characterizing and modeling the behavior of context switch misses
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Resource sharing in performance models
EPEW'07 Proceedings of the 4th European performance engineering conference on Formal methods and stochastic models for performance evaluation
Understanding the behavior and implications of context switch misses
ACM Transactions on Architecture and Code Optimization (TACO)
An efficient simulation algorithm for cache of random replacement policy
NPC'10 Proceedings of the 2010 IFIP international conference on Network and parallel computing
Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs
ACM Transactions on Architecture and Code Optimization (TACO)
Performance modeling for systematic performance tuning
State of the Practice Reports
PSnAP: accurate synthetic address streams through memory profiles
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
System level multi-bank main memory configuration for energy reduction
PATMOS'06 Proceedings of the 16th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Toward predictable performance in software packet-processing platforms
NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
Performance Modeling and Comparative Analysis of the MILC Lattice QCD Application su3_rmd
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Modeling communication in cache-coherent SMP systems: a case-study with Xeon Phi
Proceedings of the 22nd international symposium on High-performance parallel and distributed computing
Reuse-based online models for caches
Proceedings of the ACM SIGMETRICS/international conference on Measurement and modeling of computer systems
TSV: A novel energy efficient Memory Integrity Verification scheme for embedded systems
Journal of Systems Architecture: the EUROMICRO Journal
Modeling the impact of permanent faults in caches
ACM Transactions on Architecture and Code Optimization (TACO)
Ubik: efficient cache sharing with strict qos for latency-critical workloads
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
On modeling contention for shared caches in multi-core processors with techniques from ecology
Natural Computing: an international journal
Computer performance analysis and the Pi Theorem
Computer Science - Research and Development
Hi-index | 0.02 |
Trace-driven simulation and hardware measurement are the techniques most often used to obtain accurate performance figures for caches. The former requires a large amount of simulation time to evaluate each cache configuration while the latter is restricted to measurements of existing caches. An analytical cache model that uses parameters extracted from address traces of programs can efficiently provide estimates of cache performance and show the effects of varying cache parameters. By representing the factors that affect cache performance, we develop an analytical model that gives miss rates for a given trace as a function of cache size, degree of associativity, block size, subblock size, multiprogramming level, task switch interval, and observation interval. The predicted values closely approximate the results of trace-driven simulations, while requiring only a small fraction of the computation cost.