Principles of CMOS VLSI design: a systems perspective
Principles of CMOS VLSI design: a systems perspective
A Case for Direct-Mapped Caches
Computer
The effect of sharing on the cache and bus performance of parallel programs
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Data prefetching in multiprocessor vector cache memories
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Dynamic base register caching: a technique for reducing address bus width
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Adjustable block size coherent caches
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
A novel cache design for vector processing
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Cache write policies and performance
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
A study of single-chip processor/cache organizations for large numbers of transistors
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Optimal allocation of on-chip memory for multiple-API operating systems
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Decoupled sectored caches: conciliating low tag implementation cost
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Introduction to VLSI Systems
False Sharing and Spatial Locality in Multiprocessor Caches
IEEE Transactions on Computers
Using cache memory to reduce processor-memory traffic
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Information content of CPU memory referencing behavior
ISCA '77 Proceedings of the 4th annual symposium on Computer architecture
Don't use the page number, but a pointer to it
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Options for dynamic address translation in COMAs
Proceedings of the 25th annual international symposium on Computer architecture
Functional Implementation Techniques for CPU Cache Memories
IEEE Transactions on Computers - Special issue on cache memory and related problems
The pool of subsectors cache design
ICS '99 Proceedings of the 13th international conference on Supercomputing
Table size reduction for data value predictors by exploiting narrow width values
Proceedings of the 14th international conference on Supercomputing
IEEE Transactions on Parallel and Distributed Systems
Moving Address Translation Closer to Memory in Distributed Shared-Memory Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
RECAST: Boosting Tag Line Buffer Coverage in Low-Power High-Level Caches "for Free"
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Replicating tag entries for reliability enhancement in cache tag arrays
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
This paper presents a technique for minimizing chip-area cost of implementing an on-chip cache memory of microprocessors. The main idea of the technique Caching Address Tags, or CAT cache for short. The CAT cache exploits locality property that exists among addresses of memory references for the purpose of minimizing chip area-cost of address tags. By keeping only a limited number of distinct tags of cached data rather than having as many tags as cache lines, the CAT cache can reduce the cost of implementing tag memory by an order of magnitude without noticeable performance difference from ordinary caches. Therefore, CAT represents another level of caching for cache memories. Simulation experiments are carried out to evaluate performance of CAT cache as compared to existing caches. Performance results of SPEC92 programs show that the CAT cache with only a few tag entries performs as well as ordinary caches while chip-area saving is significant. Such area saving will increase as the address space of a processor increases. By allocating the saved chip area for larger cache capacity, or more powerful functional units, CAT is expected to have a great impact on overall system performance.