A non-uniform memory architecture (NUMA) system consists of multiple nodes, each with a shared last-level cache (LLC). Sharing the LLC improves cache utilization. However, the LLC can be seriously polluted by tasks that generate heavy I/O traffic over long periods, because an inclusive LLC replaces valid cache lines through back-invalidation. Much prior research has addressed this cache pollution with page coloring, partitioning, and pollute buffer mechanisms, but no scheduling approach considers I/O-intensive tasks in NUMA systems. OS scheduling that reduces cache pollution is therefore needed in NUMA systems. In this paper, we propose a software-based mechanism that reduces shared LLC misses in NUMA systems. Our mechanism consists of I/O traffic measurement and devil-conscious scheduling. Experimental results show that the LLC miss rate can be reduced by up to 37.6% and that execution time improves by up to 1.48%.