Design and evaluation of dynamic access ordering hardware

Authors:
Sally A. McKee; Assaji Aluwihare; Benjamin H. Clark; Robert H. Klenke; Trevor C. Landon; Christopher W. Oliver; Maximo H. Salinas; Adam E. Szymkowiak; Kenneth L. Wright; Wm. A. Wulf; James H. Aylor
Affiliations:
University of Virginia, Charlottesville, VA (all authors)
Venue:
ICS '96 Proceedings of the 10th international conference on Supercomputing
Year:
1996

References: 10
Cited by: 12

The ZS-1 central processor

ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
A set of level 3 basic linear algebra subprograms

ACM Transactions on Mathematical Software (TOMS)
Code generation for streaming: an access/execute mechanism

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Evaluation of the WM architecture

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Evaluating stream buffers as a secondary cache replacement

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Access ordering and effective memory bandwidth
Maximizing memory bandwidth for streamed computations
Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
PIPE: a VLSI decoupled architecture

ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Access ordering and memory-conscious cache utilization

HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture

A performance comparison of contemporary DRAM architectures

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Hardware-only stream prefetching and dynamic access ordering

Proceedings of the 14th international conference on Supercomputing
Algorithmic foundations for a parallel vector access memory system

Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures
Concurrency, latency, or system overhead: which has the largest impact on uniprocessor DRAM-system performance?

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
High-Performance DRAMs in Workstation Environments

IEEE Transactions on Computers
Smarter Memory: Improving Bandwidth for Streamed References

Computer
Performance of the Complex Streamed Instruction Set on Image Processing Kernels

Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
Efficient orchestration of sub-word parallelism in media processors

Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Efficient address remapping in distributed shared-memory systems

ACM Transactions on Architecture and Code Optimization (TACO)
ALP: Efficient support for all levels of parallelism for complex media applications

ACM Transactions on Architecture and Code Optimization (TACO)
Impulse: Memory system support for scientific applications

Scientific Programming
Scalable barrier synchronisation for large-scale shared-memory multiprocessors

International Journal of High Performance Computing and Networking
