Performance evaluation of on-chip register and cache organizations

Authors:
R. J. Eickemeyer;J. H. Patel
Affiliations:
Univ. of Illinois, Urbana;Univ. of Illinois, Urbana
Venue:
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer Architecture
Year:
1988

Abstract

Chip area is a critical resource in the design of VLSI processors, and many alternative designs could fill it. This paper compares several local memory organizations applicable to single-chip processors. Several cache types (instruction, data, split, unified, stack, and top-of-stack) are considered. These are compared to multiple register set architectures, to which various caches can also be added. Because a wide variety of register and cache organizations are used, the performance metric of interest is effective access time. A model for access time and a model for the chip area required by each organization form the basis for comparison. Extensive simulations of several register-memory organizations are presented. Address traces from a VAX-11/780 running systems programs were used in the simulations.
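
The abstract names effective access time as the comparison metric but does not give the paper's model. As a rough illustration only, the sketch below uses the common textbook form (a hit-ratio-weighted average of hit and miss times), which is an assumption and not necessarily the authors' formulation. The function name, the organization labels, and all numbers are hypothetical.

```python
def effective_access_time(hit_ratio, t_hit, t_miss):
    """Hit-ratio-weighted average access time for one level of local memory.

    Textbook formulation, assumed here for illustration; the paper's own
    access-time model is not given in the abstract.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be in [0, 1]")
    return hit_ratio * t_hit + (1.0 - hit_ratio) * t_miss


# Hypothetical numbers (arbitrary time units), not taken from the paper.
candidates = {
    # Faster on a hit, but misses more often.
    "organization A": effective_access_time(0.90, 1.0, 10.0),
    # Slower on a hit, but misses less often.
    "organization B": effective_access_time(0.98, 1.5, 10.0),
}

# The organization with the lowest effective access time is preferred.
best = min(candidates, key=candidates.get)
```

With these made-up inputs, the organization with the slower hit time still comes out ahead because its higher hit ratio avoids more of the expensive misses. A single combined metric is useful for exposing this kind of trade-off across dissimilar register and cache designs.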