Cache design of a sub-micron CMOS system/370
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Cache Operations by MRU Change
IEEE Transactions on Computers
Cache performance of operating system and multiprogramming workloads
ACM Transactions on Computer Systems (TOCS)
A Case for Direct-Mapped Caches
Computer
Program optimization for instruction caches
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Inexpensive implementations of set-associativity
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Achieving high instruction cache performance with an optimizing compiler
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Branch history table prediction of moving target branches due to subroutine returns
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
A case for two-way skewed-associative caches
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Column-associative caches: a technique for reducing the miss rate of direct-mapped caches
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Cache write policies and performance
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
ATOM: a system for building customized program analysis tools
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Skewed associativity enhances performance predictability
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Eliminating cache conflict misses through XOR-based placement functions
ICS '97 Proceedings of the 11th international conference on Supercomputing
A language for describing predictors and its application to automatic synthesis
Proceedings of the 24th annual international symposium on Computer architecture
Capturing dynamic memory reference behavior with adaptive cache topology
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Functional Implementation Techniques for CPU Cache Memories
IEEE Transactions on Computers - Special issue on cache memory and related problems
Reducing cache misses using hardware and software page placement
ICS '99 Proceedings of the 13th international conference on Supercomputing
Way-predicting set-associative cache for high performance and low energy consumption
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Energy-driven integrated hardware-software optimizations using SimplePower
Proceedings of the 27th annual international symposium on Computer architecture
L1 data cache decomposition for energy efficiency
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Energy-efficient instruction cache using page-based placement
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
Bloom filtering cache misses for accurate data speculation and prefetching
ICS '02 Proceedings of the 16th international conference on Supercomputing
Implementing optimizations at decode time
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Reducing set-associative cache energy via way-prediction and selective direct-mapping
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
An adaptive serial-parallel CAM architecture for low-power cache blocks
Proceedings of the 2002 international symposium on Low power electronics and design
Evaluating Integrated Hardware-Software Optimizations Using a Unified Energy Estimation Framework
IEEE Transactions on Computers
Partitioned instruction cache architecture for energy efficiency
ACM Transactions on Embedded Computing Systems (TECS)
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Just Say No: Benefits of Early Cache Miss Determination
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Highly accurate and efficient evaluation of randomising set index functions
Journal of Systems Architecture: the EUROMICRO Journal
A highly configurable cache architecture for embedded systems
Proceedings of the 30th annual international symposium on Computer architecture
Power-Aware Branch Prediction: Characterization and Design
IEEE Transactions on Computers
An energy efficient cache memory architecture for embedded systems
Proceedings of the 2004 ACM symposium on Applied computing
Location cache: a low-power L2 cache system
Proceedings of the 2004 international symposium on Low power electronics and design
Using a serial cache for energy efficient instruction fetching
Journal of Systems Architecture: the EUROMICRO Journal
Hierarchical Binary Set Partitioning in Cache Memories
The Journal of Supercomputing
A way-halting cache for low-energy high-performance systems
ACM Transactions on Architecture and Code Optimization (TACO)
IATAC: a smart predictor to turn-off L2 cache lines
ACM Transactions on Architecture and Code Optimization (TACO)
A highly configurable cache for low energy embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
The V-Way Cache: Demand Based Associativity via Global Replacement
Proceedings of the 32nd annual international symposium on Computer Architecture
Reducing latencies of pipelined cache accesses through set prediction
Proceedings of the 19th annual international conference on Supercomputing
Balanced Cache: Reducing Conflict Misses of Direct-Mapped Caches
Proceedings of the 33rd annual international symposium on Computer Architecture
A low energy cache design for multimedia applications exploiting set access locality
Journal of Systems Architecture: the EUROMICRO Journal
Dynamic feature selection for hardware prediction
Journal of Systems Architecture: the EUROMICRO Journal
Making a case for split data caches for embedded applications
MEDEA '05 Proceedings of the 2005 workshop on MEmory performance: DEaling with Applications , systems and architecture
Scientific applications vs. SPEC-FP: a comparison of program behavior
Proceedings of the 20th annual international conference on Supercomputing
Proceedings of the 20th annual international conference on Supercomputing
Reducing non-deterministic loads in low-power caches via early cache set resolution
Microprocessors & Microsystems
Compiler-managed partitioned data caches for low power
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Reducing cache misses through programmable decoders
ACM Transactions on Architecture and Code Optimization (TACO)
Tiny split data-caches make big performance impact for embedded applications
Journal of Embedded Computing - Issues in embedded single-chip multicore architectures
Word-interleaved cache: an energy efficient data cache architecture
Proceedings of the 13th international symposium on Low power electronics and design
Recruiting Decay for Dynamic Power Reduction in Set-Associative Caches
Transactions on High-Performance Embedded Architectures and Compilers II
Data Cache Techniques to Save Power and Deliver High Performance in Embedded Systems
Transactions on High-Performance Embedded Architectures and Compilers II
Way guard: a segmented counting bloom filter approach to reducing energy for set-associative caches
Proceedings of the 14th ACM/IEEE international symposium on Low power electronics and design
An hybrid eDRAM/SRAM macrocell to implement first-level data caches
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Adaptive line placement with the set balancing cache
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
On reducing load/store latencies of cache accesses
Journal of Systems Architecture: the EUROMICRO Journal
Leveraging high performance data cache techniques to save power in embedded systems
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Applying decay to reduce dynamic power in set-associative caches
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Customized placement for high performance embedded processor caches
ARCS'07 Proceedings of the 20th international conference on Architecture of computing systems
Dynamic, non-linear cache architecture for power-sensitive mobile processors
CODES/ISSS '10 Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
The ZCache: Decoupling Ways and Associativity
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Location cache design and performance analysis for chip multiprocessors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Selective word reading for high performance and low power processor
Proceedings of the 2011 ACM Symposium on Research in Applied Computation
Embedded Systems Design
Soft error mitigation in cache memories of embedded systems by means of a protected scheme
LADC'05 Proceedings of the Second Latin-American conference on Dependable Computing
Reducing L1 caches power by exploiting software semantics
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
An integrated pseudo-associativity and relaxed-order approach to hardware transactional memory
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
An energy-efficient L2 cache architecture using way tag information under write-through policy
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Application-aware adaptive cache architecture for power-sensitive mobile processors
ACM Transactions on Embedded Computing Systems (TECS)
TLC: a tag-less cache for reducing dynamic first level cache energy
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Virtually split cache: An efficient mechanism to distribute instructions and data
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.01 |
In this paper we propose a cache design that provides the same miss rate as a two-way set associative cache, but with an access time closer to a direct-mapped cache. As with other designs, a traditional direct-mapped cache is conceptually partitioned into multiple banks, and the blocks in each set are probed, or examined, sequentially. Other designs either probe the set in a fixed order or add extra delay in the access path for all accesses. We use prediction sources to guide the cache examination, reducing the amount of searching and thus the average access latency. A variety of accurate prediction sources are considered, with some being available in early pipeline stages. We feel that our design offers the same or better performance and is easier to implement than previous designs.