Limited Bandwidth to Affect Processor Design

Authors:
Doug Burger; James R. Goodman; Alain Kägi
Venue:
IEEE Micro
Year:
1997


Abstract

In this article, we discuss how the effects of long memory latencies and increased memory bandwidth requirements may affect the design of modern microprocessors and their memory systems. In particular, we examine the subtle trade-offs between memory latency and bandwidth. Through execution-driven simulation, we measure the fraction of time that several SPEC95 benchmarks spend computing, stalled for memory latency, and stalled for limited memory bandwidth. Our results show that as processors implement more aggressive latency tolerance techniques, limited memory bandwidth negatively impacts programs much more than do long memory latencies. Finally, we survey a range of strategies for mitigating bandwidth limitations and discuss the relative merits and disadvantages of each.
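The three-way split of execution time described in the abstract can be illustrated with a small sketch. It assumes one plausible way to obtain the split: simulate the same program three times, once with a perfect memory system, once with realistic latency but unlimited bandwidth, and once with the full memory system, then attribute the differences to latency and bandwidth. The abstract does not spell out this procedure, so the function name `decompose`, its parameters, and the cycle counts in the usage example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TimeBreakdown:
    """Fractions of total execution time; the three fields sum to 1."""
    compute: float
    latency_stall: float
    bandwidth_stall: float


def decompose(t_perfect: float, t_infinite_bw: float, t_real: float) -> TimeBreakdown:
    """Split execution time into compute, latency-stall, and bandwidth-stall fractions.

    Assumed inputs (hypothetical, not taken from the article):
      t_perfect     -- cycles with a perfect memory system (no stalls)
      t_infinite_bw -- cycles with real latencies but unlimited bandwidth
      t_real        -- cycles with the full, bandwidth-limited memory system
    """
    if not (0 < t_perfect <= t_infinite_bw <= t_real):
        raise ValueError("expected 0 < t_perfect <= t_infinite_bw <= t_real")
    return TimeBreakdown(
        compute=t_perfect / t_real,
        latency_stall=(t_infinite_bw - t_perfect) / t_real,
        bandwidth_stall=(t_real - t_infinite_bw) / t_real,
    )


if __name__ == "__main__":
    # Made-up cycle counts, chosen only to show the arithmetic.
    b = decompose(t_perfect=50.0, t_infinite_bw=70.0, t_real=100.0)
    print(f"compute={b.compute:.2f} "
          f"latency={b.latency_stall:.2f} "
          f"bandwidth={b.bandwidth_stall:.2f}")
```

Under this accounting, a latency tolerance technique such as prefetching shrinks the gap between the first two runs, but any extra traffic it generates widens the gap between the last two. That is the trade-off between latency and bandwidth the abstract refers to.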