Configurable emulated shared memory architecture for general purpose MP-SOCs and NOC regions

Authors:
Martti Forsell
Affiliations:
VTT, Platform Architectures, Box 1100, FI-90571 Oulu, Finland
Venue:
NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
Year:
2009

Abstract

Emulated shared memory (ESM) multiprocessor systems on chip (MP-SOC) and network on chip (NOC) regions are efficient general purpose computing engines for future computers and embedded systems running applications unknown at the design phase. While they provide the programmer with a synchronous, unified shared memory accessible in constant time, existing ESM architectures have been shown to be inefficient for workloads with low parallelism. In this paper we outline a configurable emulated shared memory (CESM) architecture that retains the advantages of ESM architectures for sufficiently parallel code but can also execute applications with low parallelism efficiently. This is achieved by allowing multiple threads to join into a single nonuniform memory access (NUMA) bunch and by organizing the memory system to support NUMA-like behavior for thread-local data when parallelism is limited. Performance simulations as well as silicon area and power consumption estimates of CESM MP-SOC/NOC regions are provided.
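The effect of bunching on low-parallelism code can be illustrated with a toy cycle-count model. This is only a sketch under simplifying assumptions that are not taken from the paper: each processor has a fixed number of interleaved thread slots, every slot issues one instruction per step, a step lasts as many cycles as there are slots, and a bunch of slots executes consecutive instructions of one stream with no memory stalls. The function names and parameters are hypothetical.

```python
import math


def esm_cycles(slots: int, threads: int, instrs: int) -> int:
    """Cycles for `threads` independent streams of `instrs` instructions
    in plain interleaved mode (assumed: one instruction per stream per step)."""
    if not 1 <= threads <= slots:
        raise ValueError("threads must be between 1 and slots")
    # Each step costs `slots` cycles regardless of how many slots are in use.
    return instrs * slots


def bunched_cycles(slots: int, threads: int, instrs: int) -> int:
    """Cycles when the slots are split evenly into one bunch per stream
    (assumed: a bunch of b slots issues b consecutive instructions per step)."""
    if not 1 <= threads <= slots:
        raise ValueError("threads must be between 1 and slots")
    bunch = slots // threads              # slots joined into each bunch
    steps = math.ceil(instrs / bunch)     # steps needed per stream
    return steps * slots


def utilization(slots: int, threads: int, instrs: int, cycles: int) -> float:
    """Fraction of issue cycles that did useful work."""
    return threads * instrs / cycles


if __name__ == "__main__":
    slots, instrs = 16, 1000
    for threads in (1, 4, 16):
        plain = esm_cycles(slots, threads, instrs)
        joined = bunched_cycles(slots, threads, instrs)
        print(threads, plain, joined,
              round(utilization(slots, threads, instrs, plain), 3),
              round(utilization(slots, threads, instrs, joined), 3))
```

In this model a single sequential stream on a 16-slot processor uses only one slot in sixteen in plain interleaved mode, while joining all slots into one bunch brings utilization close to one. With as many streams as slots, both modes cost the same, which matches the abstract's claim that the advantages for sufficiently parallel code are retained.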