Cache performance of operating system and multiprogramming workloads
ACM Transactions on Computer Systems (TOCS)
Multiprocessor cache analysis using ATUM
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
A Case for Direct-Mapped Caches
Computer
Translation lookaside buffer consistency: a software approach
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Organization and performance of a two-level virtual-real cache hierarchy
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
MIPS RISC architectures
Alpha architecture reference manual
Alpha architecture reference manual
Page placement algorithms for large real-indexed caches
ACM Transactions on Computer Systems (TOCS)
Eliminating the address translation bottleneck for physical address cache
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
The effect of page allocation on caches
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
A case for two-way skewed-associative caches
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Column-associative caches: a technique for reducing the miss rate of direct-mapped caches
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Protection traps and alternatives for memory management of an object-oriented language
SOSP '93 Proceedings of the fourteenth ACM symposium on Operating systems principles
Avoiding conflict misses dynamically in large direct-mapped caches
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
The Rio file cache: surviving operating system crashes
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
The design and performance of a conflict-avoiding cache
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Randomized Cache Placement for Eliminating Conflicts
IEEE Transactions on Computers - Special issue on cache memory and related problems
The TLB slice—a low-cost high-speed address translation mechanism
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Reconfigurable caches and their application to media processing
Proceedings of the 27th annual international symposium on Computer architecture
ACM Computing Surveys (CSUR)
Parallel Computer Architecture: A Hardware/Software Approach
Parallel Computer Architecture: A Hardware/Software Approach
Multi-column implementations for cache associativity
ICCD '97 Proceedings of the 1997 International Conference on Computer Design (ICCD '97)
Page allocation to reduce access time of physical caches
Page allocation to reduce access time of physical caches
Code Placement with Selective Cache Activity Minimization for Embedded Real-time Software Design
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Enabling software management for multicore caches with a lightweight hardware support
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Micro-pages: increasing DRAM efficiency with locality-aware data placement
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Handling the problems and opportunities posed by multiple on-chip memory controllers
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Cache index-aware memory allocation
Proceedings of the international symposium on Memory management
Hi-index | 14.98 |
Modern CPUs often use large physically indexed caches that are direct-mapped or have low associativities. Such caches do not interact well with virtual memory systems. An improperly placed physical page will end up in a wrong place in the cache, causing excessive conflicts with other cached pages. Page coloring has been proposed to reduce the conflict misses by carefully placing pages in the physical memory. While page coloring works well for some applications, many factors limit its performance. Page coloring limits the freedom of the page placement system and may increase swapping traffic. In this paper, we propose a novel and simple architecture, called color-indexed, physically tagged caches, which can significantly reduce the conflict misses. With some simple modifications to the TLB (Translation Look-aside Buffer), the new architecture decouples the addresses of the cache from the addresses of the main memory. Since the cache addresses do not depend on the the physical memory addresses anymore, the system can freely place data in any cache page to minimize the conflict misses, without affecting the paging system. Extensive trace-driven simulation results show that our design performs much better than traditional page coloring techniques. The new scheme enables a direct-mapped cache to achieve hit ratios very close to or better than those of a two-way set associative cache. Moreover, the architecture does not increase cache access latency, which is a drawback of set associative caches. The hardware overhead is minimal. We show that our scheme can reduce the cache size by 50 percent without sacrificing performance. A two-way set-associative cache that uses this strategy can perform very close to a fully associative cache.