Data forwarding in scalable shared-memory multiprocessors

Authors:
D. A. Koufaty;X. Chen;D. K. Poulsen;J. Torrellas
Affiliations:
Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, IL;Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, IL;Kuck and Associates, Inc., Champaign, IL and Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, IL;Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, IL
Venue:
ICS '95 Proceedings of the 9th international conference on Supercomputing
Year:
1995

Citing 12

The cache performance and optimizations of blocked algorithms
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems

Tolerating latency through software-controlled prefetching in shared-memory multiprocessors
Journal of Parallel and Distributed Computing - Special issue on shared-memory multiprocessors

A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation

The Stanford Dash Multiprocessor
Computer

Design and evaluation of a compiler algorithm for prefetching
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems

Execution-driven tools for parallel simulation of parallel architectures and applications
Proceedings of the 1993 ACM/IEEE conference on Supercomputing

Memory latency reduction via data prefetching and data forwarding in shared memory multiprocessors

Weak ordering—a new definition
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture

Memory consistency and event ordering in scalable shared-memory multiprocessors
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture

Dependence Analysis for Supercomputing

Performance Tradeoffs in Multithreaded Processors
IEEE Transactions on Parallel and Distributed Systems

Update-Based Cache Coherence Protocols for Scalable Shared-Memory Multiprocessors

Cited 12

Temporal notions of synchronization and consistency in Beehive
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures

Optimizing communication in HPF programs on fine-grain distributed shared memory
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming

The Illinois Aggressive Coma Multiprocessor project (I-ACOMA)
FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation

Token coherence: decoupling performance and correctness
Proceedings of the 30th annual international symposium on Computer architecture

Temporal Streaming of Shared Memory
Proceedings of the 32nd annual international symposium on Computer Architecture

Store-Ordered Streaming of Shared Memory
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques

CUBA: an architecture for efficient CPU/co-processor data communication
Proceedings of the 22nd annual international conference on Supercomputing

Extending CC-NUMA systems to support write update optimizations
Proceedings of the 2008 ACM/IEEE conference on Supercomputing

An adaptive cache coherence protocol for chip multiprocessors
Proceedings of the Second International Forum on Next-Generation Multicore/Manycore Technologies

Architectural support for thread communications in multi-core processors
Parallel Computing

Concerning with on-chip network features to improve cache coherence protocols for CMPs
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture

Predicting Coherence Communication by Tracking Synchronization Points at Run Time
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
