We report on a trace-driven simulation study of the effect of a two-level cache hierarchy in uniprocessors. A simulation model of a multiple-cycle-per-instruction processor was constructed to estimate the total cycles required to execute a synthetic benchmark. Results show that a second-level cache can increase system performance when main memory access times are large relative to CPU cycle time. For example, adding a 4-cycle, 64K second-level cache behind a 1-cycle, 8K first-level cache increases performance by 15 percent in a system with a 15-cycle primary memory. Second-level caches are shown to be particularly effective behind small on-chip caches: adding an 8K second-level cache to a 1K first-level cache increases performance by 26 percent, assuming similar parameters. We also evaluate the performance impact of different write strategies and of separate instruction (I) and data (D) caches.
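As a rough illustration of why a second-level cache pays off when memory is slow relative to the CPU, the sketch below uses the textbook average-memory-access-time model rather than the trace-driven simulator described above. The latencies (1, 4, and 15 cycles) come from the example in the abstract. The miss rates are hypothetical placeholders, not results from the study, and the model assumes that a second-level miss pays the second-level lookup time before going to memory.

```python
# Minimal sketch of a textbook average-memory-access-time (AMAT) model.
# This is NOT the paper's trace-driven simulator. The miss rates used
# below are hypothetical and do not come from the study.

def amat_one_level(l1_cycles, l1_miss_rate, mem_cycles):
    """Average cycles per access with a single cache in front of memory."""
    return l1_cycles + l1_miss_rate * mem_cycles


def amat_two_level(l1_cycles, l1_miss_rate, l2_cycles, l2_local_miss_rate,
                   mem_cycles):
    """Average cycles per access with a second-level cache behind the first.

    l2_local_miss_rate is the fraction of first-level misses that also
    miss in the second level. Assumes the second-level lookup completes
    before the memory access starts.
    """
    l1_miss_penalty = l2_cycles + l2_local_miss_rate * mem_cycles
    return l1_cycles + l1_miss_rate * l1_miss_penalty


def l2_break_even_miss_rate(l2_cycles, mem_cycles):
    """Largest second-level local miss rate at which the second level
    still shortens the first-level miss penalty under this model."""
    return 1.0 - l2_cycles / mem_cycles


if __name__ == "__main__":
    # Latencies taken from the abstract's example configuration.
    L1_CYCLES, L2_CYCLES, MEM_CYCLES = 1, 4, 15
    # Hypothetical miss rates, chosen only for illustration.
    L1_MISS, L2_LOCAL_MISS = 0.10, 0.30

    one = amat_one_level(L1_CYCLES, L1_MISS, MEM_CYCLES)
    two = amat_two_level(L1_CYCLES, L1_MISS, L2_CYCLES, L2_LOCAL_MISS,
                         MEM_CYCLES)
    print(f"one-level AMAT: {one:.2f} cycles")
    print(f"two-level AMAT: {two:.2f} cycles")
    print(f"break-even L2 local miss rate: "
          f"{l2_break_even_miss_rate(L2_CYCLES, MEM_CYCLES):.3f}")
```

Under this simplified model the break-even miss rate rises toward 1 as memory latency grows relative to the second-level latency, which matches the qualitative claim that second-level caches help most when main memory is slow. The model ignores write strategy and split I and D caches, both of which the study evaluates.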