Predicting Inter-Thread Cache Contention on a Chip Multi-Processor Architecture
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
The need to provide performance guarantees in high-performance servers has long been neglected. Providing such guarantees in current and future servers is difficult because fine-grain resources, such as on-chip caches, are shared by multiple processors or thread contexts. Although inter-thread cache sharing generally improves the overall throughput of the system, the impact of cache contention on the threads that share the cache is highly non-uniform: some threads may be slowed down significantly, while others are barely affected. This can cause severe performance problems, such as sub-optimal throughput, cache thrashing, and starvation of threads that fail to occupy sufficient cache space to make good progress. Clearly, this situation is undesirable when performance guarantees must be provided, such as in utility computing servers. Unfortunately, no existing model allows extensive investigation of the impact of cache sharing. To enable such a study, we propose an inductive probability model that predicts the impact of cache sharing on co-scheduled threads. The input to the model is the isolated L2 circular-sequence profile of each thread, which can be easily obtained on-line or off-line. The output of the model is the number of extra L2 cache misses each thread incurs due to cache sharing. We validate the model against a cycle-accurate simulation of a dual-core Chip Multi-Processor (CMP) architecture on fourteen pairs of mostly SPEC benchmarks. The model achieves an average error of only 3.9%.
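The circular-sequence profile the abstract names as the model's input is closely related to the classic per-set LRU stack-distance profile: a thread's reuse of a cache line hits in an A-way set exactly when fewer than A distinct lines were accessed in that set since the previous access to the same line. The sketch below (with hypothetical helper names; it is not the paper's implementation, and the cache parameters are made-up defaults) shows how such an isolated profile could be gathered from an address trace and how isolated misses follow from it under LRU:

```python
from collections import defaultdict

def stack_distance_profile(addresses, num_sets=16, line_size=64):
    """Per-set LRU stack-distance histogram from an address trace.

    Illustrative sketch only: the paper's circular-sequence profile
    records, for each (d, n), same-set access sequences touching d
    distinct lines whose first and last access hit the same line.
    Here we record the simpler per-reuse stack distance: the number
    of distinct same-set lines touched since the last access to the
    reused line, counting the line itself. A distance of None marks
    a cold (first-touch) access.
    """
    profile = defaultdict(int)   # stack distance -> access count
    stacks = defaultdict(list)   # set index -> LRU stack of line tags
    for addr in addresses:
        line = addr // line_size
        stack = stacks[line % num_sets]
        if line in stack:
            d = stack.index(line) + 1   # 1-based position = stack distance
            stack.remove(line)
        else:
            d = None                    # cold miss: never seen in this set
        stack.insert(0, line)           # promote to MRU position
        profile[d] += 1
    return dict(profile)

def isolated_misses(profile, assoc=8):
    """Under LRU, a reuse misses iff its stack distance exceeds assoc;
    cold accesses (distance None) always miss."""
    return sum(c for d, c in profile.items() if d is None or d > assoc)
```

In the paper's model, the interesting step beyond this is estimating how many accesses from a co-scheduled thread interleave into each circular sequence, effectively inflating d past the associativity and converting isolated hits into extra shared-cache misses; that inductive probability computation is not reproduced here.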