Effective use of the shared last-level cache (LLC) is important for the efficiency of multi-core processors. Uncontrolled sharing leads to cache pollution: weak-locality data (single-use data that is never re-referenced) continuously evicts strong-locality data (frequently reused data) from the LLC, both within the processing of a single query and across co-running programs. In analytical main-memory database (MMDB) applications with the skewed star schema of a data warehouse (DW), more than 95% of memory capacity is occupied by the memory-resident fact table, which has weak locality, while only the small dimension tables have strong locality. Cache partitioning must therefore manage data with different localities inside query processing so that the weak-locality fact table does not pollute the cache. Static OS-based cache partitioning suffers from insufficient memory address capacity because of the large fact table, and dynamic OS-based cache partitioning suffers from data movement overhead during cache re-allocation. To provide a practical and effective cache partitioning policy, we propose an application-software-based W-order scan policy for real analytical MMDB applications. The W-order policy, based on consecutive physical addresses, reduces cache misses while keeping memory utilization high by controlling the order in which physical pages are accessed within large, consecutive physical pages. A second approach is the page-color index: we extract the page-color bits from the pages of weak-locality data and sort the page addresses by those bits. A page-color index scan then also controls the physical page access order, without requiring OS support for allocating large consecutive physical pages. We measure the L2 cache miss rate by simulating a typical hash join operation. The experimental results show that DBMSs can improve cache performance by controlling the access pattern of weak-locality data themselves, as opposed to depending on hardware or OS support.
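The page-color index idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the physical page addresses of the weak-locality data are already known to the application, that pages are 4 KiB, and that the cache has 64 page colors (for example, a 4 MiB, 16-way cache: 4 MiB / (16 × 4 KiB) = 64). The names `page_color` and `build_page_color_index` are hypothetical.

```python
from typing import Iterable, List

PAGE_SHIFT = 12   # assumption: 4 KiB pages
COLOR_BITS = 6    # assumption: 64 page colors; depends on cache size and associativity


def page_color(phys_addr: int,
               page_shift: int = PAGE_SHIFT,
               color_bits: int = COLOR_BITS) -> int:
    """Return the page color: the low bits of the physical page number."""
    return (phys_addr >> page_shift) & ((1 << color_bits) - 1)


def build_page_color_index(page_addrs: Iterable[int]) -> List[int]:
    """Sort physical page addresses by color, then by address within a color.

    Scanning pages in this order visits all pages of one color before
    moving to the next color.
    """
    return sorted(page_addrs, key=lambda addr: (page_color(addr), addr))


if __name__ == "__main__":
    # Hypothetical physical page addresses of fact-table pages.
    pages = [0x3000, 0x41000, 0x1000, 0x40000]
    for addr in build_page_color_index(pages):
        print(f"color={page_color(addr):2d} page=0x{addr:x}")
```

Under these assumptions, a scan that follows the sorted order touches pages of only one color at a time, which is one way to read the abstract's claim that the access order of weak-locality pages can be controlled by the application alone.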