A unified model for multicore architectures

Authors:
John E. Savage; Mohammad Zubair
Affiliations:
Brown University, Providence, Rhode Island; Old Dominion University, Norfolk, Virginia
Venue:
IFMT '08 Proceedings of the 1st international forum on Next-generation multicore/manycore technologies
Year:
2008

Abstract

With the advent of multicore and manycore architectures, we are facing a problem that is new to parallel computing, namely, the management of hierarchical parallel caches. One major limitation of all earlier models is their inability to model multicore processors with varying degrees of cache sharing at different levels. We propose a unified memory hierarchy model that addresses these limitations and extends the MHG model, which was developed for a single processor with a multilevel memory hierarchy. We demonstrate that our unified framework can be applied to a number of multicore architectures for a variety of applications. In particular, we derive lower bounds on memory traffic between different levels in the hierarchy for financial and scientific computations. We also give a multicore algorithm for a financial application that exhibits a constant-factor optimal amount of memory traffic between different cache levels. We implemented the algorithm on a multicore system with two quad-core Intel Xeon 5310 1.6 GHz processors, for a total of 8 cores. Our algorithm outperforms compiler-optimized and auto-parallelized code by a factor of up to 7.3.
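To make "varying degrees of sharing of caches at different levels" concrete, the following is a minimal sketch of how such a hierarchy might be described in code. It is not the paper's formal model. The `CacheLevel` type, the helper functions, and the cache sizes and sharing degrees are all hypothetical, chosen only to illustrate an 8-core machine with private first-level caches and second-level caches shared by pairs of cores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheLevel:
    """One level of a cache hierarchy (hypothetical illustration type)."""
    name: str
    size_bytes: int       # capacity of a single cache at this level
    cores_per_cache: int  # number of cores sharing one cache at this level


def caches_per_level(total_cores, levels):
    """Return {level name: number of cache instances} for the machine."""
    counts = {}
    for level in levels:
        if total_cores % level.cores_per_cache != 0:
            raise ValueError(f"{level.name}: sharing degree must divide core count")
        counts[level.name] = total_cores // level.cores_per_cache
    return counts


def sharing_groups(total_cores, level):
    """List the groups of core ids that share each cache at the given level."""
    k = level.cores_per_cache
    return [list(range(start, start + k)) for start in range(0, total_cores, k)]


# Assumed example values, not taken from the abstract:
# private L1 caches and L2 caches shared by pairs of cores.
levels = [
    CacheLevel("L1", 32 * 1024, 1),
    CacheLevel("L2", 4 * 1024 * 1024, 2),
]

print(caches_per_level(8, levels))   # {'L1': 8, 'L2': 4}
print(sharing_groups(8, levels[1]))  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

Describing each level by its sharing degree, rather than fixing one topology, is what lets a single description cover machines whose caches are private at one level and shared at another.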