The Characterization of Data Intensive Memory Workloads on Distributed PIM Systems

Authors:
Richard C. Murphy;Peter M. Kogge;Arun Rodrigues
Venue:
IMS '00 Revised Papers from the Second International Workshop on Intelligent Memory Systems
Year:
2000

Abstract

Processing-In-Memory (PIM) circumvents the von Neumann bottleneck by combining logic and memory (typically DRAM) on a single die. This work examines the memory system parameters needed to construct PIM-based parallel computers capable of meeting the memory access demands of complex programs that exhibit low reuse and nonuniform-stride accesses. The analysis uses the Data Intensive Systems (DIS) benchmark suite to examine these demanding memory access patterns, and the characteristics of such applications are discussed in detail. Simulations demonstrate that PIMs can support enough data to serve as multicomputer nodes. Additionally, the results show that even data-intensive code exhibits a large amount of internal spatial locality. A mobile thread execution model is presented that takes advantage of both the tremendous internal bandwidth available on a given PIM node and the locality exhibited by the application.