Data Parallel Address Architecture

Authors:
Jung Ho Ahn; William J. Dally
Venue:
IEEE Computer Architecture Letters
Year:
2006



Abstract

Data parallel memory systems must maintain a large number of outstanding memory references to fully use increasing DRAM bandwidth in the presence of increasing latency. At the same time, the throughput of modern DRAMs is very sensitive to access patterns because of the time required to precharge and activate banks and to switch between read and write accesses. To achieve memory reference parallelism, a system may simultaneously issue references from multiple reference threads. Alternatively, multiple references from a single thread can be issued in parallel. In this paper, we examine this tradeoff and show that allowing only a single thread to access DRAM at any given time significantly improves performance by increasing the locality of the reference stream, and hence reducing precharge/activate operations and read/write turnaround. Simulations of scientific and multimedia applications show that generating multiple references from a single thread gives, on average, 17% better performance than generating references from two parallel threads.
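The locality argument in the abstract can be illustrated with a toy model. The sketch below is not the authors' simulator: the row size, bank count, address mapping, and the two example streams are all hypothetical, chosen to make the effect easy to see. It counts row activations (each implying a precharge/activate pair) and read/write turnarounds for two orderings of the same references: one thread at a time, versus two threads interleaved reference by reference.

```python
from itertools import zip_longest

ROW_SIZE = 8    # words per DRAM row (hypothetical value)
NUM_BANKS = 2   # number of banks (hypothetical value)


def make_stream(base, length, is_write):
    """One thread's unit-stride reference stream: (address, is_write) pairs."""
    return [(base + i, is_write) for i in range(length)]


def interleave(a, b):
    """Alternate references from two threads, one reference at a time."""
    out = []
    for x, y in zip_longest(a, b):
        if x is not None:
            out.append(x)
        if y is not None:
            out.append(y)
    return out


def count_penalties(refs):
    """Return (row activations, read/write turnarounds) for a reference order.

    Assumes an open-row policy and a simple mapping in which consecutive
    rows alternate between banks. Both are assumptions of this sketch.
    """
    open_row = {}          # bank -> currently open row
    activates = 0
    turnarounds = 0
    last_is_write = None
    for addr, is_write in refs:
        row_index = addr // ROW_SIZE
        bank = row_index % NUM_BANKS
        row = row_index // NUM_BANKS
        if open_row.get(bank) != row:
            activates += 1             # row miss: precharge + activate
            open_row[bank] = row
        if last_is_write is not None and last_is_write != is_write:
            turnarounds += 1           # bus switches between read and write
        last_is_write = is_write
    return activates, turnarounds


# Thread A reads 32 consecutive words; thread B writes 32 consecutive words.
thread_a = make_stream(0, 32, is_write=False)
thread_b = make_stream(64, 32, is_write=True)

single_thread_order = count_penalties(thread_a + thread_b)
two_thread_order = count_penalties(interleave(thread_a, thread_b))

print("one thread at a time:", single_thread_order)
print("two threads interleaved:", two_thread_order)
```

The example streams are deliberately adversarial: both threads map to the same banks but different rows, so every interleaved reference is a row miss. Real workloads fall between the two extremes, and the counts here say nothing about the 17% figure reported in the abstract; they only show the mechanism by which interleaving threads reduces locality.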