A study of single-chip processor/cache organizations for large numbers of transistors

Authors:
M. Farrens;G. Tyson;A. R. Pleszkun
Affiliations:
Computer Science Department, University of California, Davis, Davis, CA;Computer Science Department, University of California, Davis, Davis, CA;Department of Electrical and Computer Engineering, University of Colorado-Boulder, Boulder, CO
Venue:
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Year:
1994

Abstract

This paper presents a trace-driven simulation study of a wide range of cache configurations and processor counts. The study was undertaken to help answer the question of how best to allocate large numbers of transistors, a question that is rapidly increasing in importance as transistor densities continue to climb. At what point does increasing the size of the on-chip first-level cache cease to provide sufficient increases in hit rate and make the cache prohibitively difficult to access in a single cycle? To compare different configurations, the concept of an Equivalent Cache Transistor is presented. Results indicate that the access time of the first-level data cache is more important than its size. In addition, it appears that once approximately 15 million transistors become available, a two-processor configuration is preferable to a single processor with correspondingly larger caches.
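
To illustrate what trace-driven cache simulation means in general, here is a minimal Python sketch. It is not the authors' simulator: the direct-mapped organization, the 32-byte line size, the function name simulate_direct_mapped, and the synthetic address trace are all assumptions chosen for illustration. It replays a list of addresses against a cache model and reports the hit rate for several cache sizes.

```python
def simulate_direct_mapped(trace, num_lines, line_bytes=32):
    """Return the hit rate of a direct-mapped cache over an address trace.

    Hypothetical model: num_lines lines of line_bytes bytes each.
    """
    tags = [None] * num_lines          # one stored tag per cache line
    hits = 0
    for addr in trace:
        block = addr // line_bytes     # memory block containing this address
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1                  # hit: block is already resident
        else:
            tags[index] = tag          # miss: replace the resident block
    return hits / len(trace) if trace else 0.0


# Synthetic trace (an assumption, not real workload data):
# two sequential sweeps over 64 distinct 32-byte blocks.
trace = [i * 32 for i in range(64)] * 2

for lines in (16, 32, 64, 128):
    print(lines, simulate_direct_mapped(trace, lines))
```

In this toy trace the hit rate stops improving once the cache holds all 64 blocks, which is the kind of diminishing return the abstract asks about. A study like the one described would also need to model access time and multiple processors, which this sketch omits.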