This paper presents a technique for minimizing the chip-area cost of implementing on-chip cache memory in microprocessors. The main idea of the technique is Caching Address Tags, or the CAT cache for short. The CAT cache exploits the locality that exists among the addresses of memory references. By keeping only a limited number of distinct tags of cached data, rather than as many tags as there are cache lines, the CAT cache can reduce the cost of implementing tag memory by an order of magnitude without a noticeable performance difference from ordinary caches. CAT therefore represents another level of caching for cache memories. Simulation experiments are carried out to evaluate the performance of the CAT cache against existing caches. Results for SPEC92 programs show that the CAT cache, with only a few tag entries, performs as well as ordinary caches while saving a significant amount of chip area. This area saving grows as the address space of a processor increases. By allocating the saved chip area to larger cache capacity or more powerful functional units, CAT is expected to have a great impact on overall system performance.
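To make the tag-storage argument concrete, the following is a rough sketch of how the bit counts might compare. It is not the paper's cost model. It assumes a direct-mapped cache with power-of-two sizes, and it assumes that each cache line in a CAT-style design holds a short pointer into a small shared table of full tags. The function names, the pointer-per-line organization, and the example sizes (8 KB cache, 32-byte lines, 16 tag entries) are all hypothetical, chosen only for illustration.

```python
def tag_width(address_bits, num_lines, line_bytes):
    """Tag width for a direct-mapped cache (power-of-two sizes assumed)."""
    index_bits = (num_lines - 1).bit_length()
    offset_bits = (line_bytes - 1).bit_length()
    return address_bits - index_bits - offset_bits


def conventional_tag_bits(num_lines, tag_bits):
    """Ordinary cache: one full tag stored per cache line."""
    return num_lines * tag_bits


def cat_tag_bits(num_lines, tag_bits, num_tag_entries):
    """Assumed CAT-style layout: a small table of full tags,
    plus one pointer per line that selects a table entry."""
    pointer_bits = max(1, (num_tag_entries - 1).bit_length())
    return num_tag_entries * tag_bits + num_lines * pointer_bits


# Hypothetical configuration: 8 KB direct-mapped cache, 32-byte lines.
NUM_LINES = 256
LINE_BYTES = 32
TAG_ENTRIES = 16

for address_bits in (32, 64):
    t = tag_width(address_bits, NUM_LINES, LINE_BYTES)
    conv = conventional_tag_bits(NUM_LINES, t)
    cat = cat_tag_bits(NUM_LINES, t, TAG_ENTRIES)
    print(f"{address_bits}-bit addresses: tag={t} bits, "
          f"conventional={conv} bits, CAT-style={cat} bits, "
          f"ratio={conv / cat:.1f}x")
```

Under these assumptions, the per-line pointer cost does not depend on the address width, so only the small tag table grows with wider addresses. That is consistent with the abstract's observation that the area saving increases with the address space, although the exact ratios here depend entirely on the assumed parameters.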