Automatically Mapping Code on an Intelligent Memory Architecture

Authors:
Jaejin Lee
Affiliations:
-
Venue:
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Year:
2001

Abstract

This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a host processor and a simpler memory processor. To achieve high performance with this type of architecture, code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we show average speedups of 1.7 for numerical applications and 1.2 for non-numerical applications over a single host with plain memory. The speedups are very close to, and often higher than, the ideal speedups on a more expensive multiprocessor system composed of two identical host processors. Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on intelligent memory systems.
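
The partitioning idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it is a simple greedy assignment that assumes each code section already has an estimated run time on both processors, that sections are independent, and that the two processors overlap perfectly. The names `Section`, `map_sections`, and `estimated_speedup`, and all timing values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """A code section with hypothetical run-time estimates (arbitrary units)."""
    name: str
    host_time: float    # estimated time on the host processor
    memory_time: float  # estimated time on the simpler memory processor


def map_sections(sections):
    """Assign each section to the processor with the lower estimated time."""
    return {
        s.name: "host" if s.host_time <= s.memory_time else "memory"
        for s in sections
    }


def estimated_speedup(sections, mapping):
    """Speedup over host-only execution.

    Assumes independent sections and perfect overlap, so the mapped run
    time is the busier processor's total time.
    """
    host_only = sum(s.host_time for s in sections)
    host_busy = sum(s.host_time for s in sections if mapping[s.name] == "host")
    memory_busy = sum(
        s.memory_time for s in sections if mapping[s.name] == "memory"
    )
    return host_only / max(host_busy, memory_busy)


# Made-up estimates for three sections of an application.
sections = [
    Section("compute_loop", host_time=4.0, memory_time=10.0),
    Section("pointer_chase", host_time=6.0, memory_time=3.0),
    Section("reduction", host_time=2.0, memory_time=5.0),
]
mapping = map_sections(sections)
speedup = estimated_speedup(sections, mapping)
print(mapping)
print(speedup)
```

In a real system, dependences between sections limit how much the two processors can overlap, so an estimate like this is an upper bound. The abstract notes that the actual algorithm combines static and dynamic information to make these decisions.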