κNUMA: A Model for Clusters of SMP-Machines

Authors:
Martin Schmollinger;Michael Kaufmann
Affiliations:
-;-
Venue:
PPAM '01 Proceedings of the th International Conference on Parallel Processing and Applied Mathematics-Revised Papers
Year:
2001

Citing 14
Cited 1

A bridging model for parallel computation

Communications of the ACM
LogP: towards a realistic model of parallel computation

PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Scalable parallel geometric algorithms for coarse grained multicomputers

SCG '93 Proceedings of the ninth annual symposium on Computational geometry
Performance benefits and limitations of large NUMA multiprocessors

Performance '93 Proceedings of the 16th IFIP Working Group 7.3 international symposium on Computer performance modeling measurement and evaluation
LogGP: incorporating long messages into the LogP model—one step closer towards a realistic model for parallel computation

Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
C3: a parallel model for coarse-grained machines

Journal of Parallel and Distributed Computing
Can shared-memory model serve as a bridging model for parallel computation?

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Design and implementation of the NUMAchine multiprocessor

DAC '98 Proceedings of the 35th annual Design Automation Conference
Broadcast and Associative Operations on Fat-Trees

Euro-Par '97 Proceedings of the Third International Euro-Par Conference on Parallel Processing
Submachine Locality in the Bulk Synchronous Setting (Extended Abstract)

Euro-Par '96 Proceedings of the Second International Euro-Par Conference on Parallel Processing-Volume II
The E-BSP Model: Incorporating General Locality and Unbalanced Communication into the BSP Model

Euro-Par '96 Proceedings of the Second International Euro-Par Conference on Parallel Processing-Volume II
Truly Efficient Parallel Algorithms: c-Optimal Multisearch for an Extension of the BSP Model (Extended Abstract)

ESA '95 Proceedings of the Third Annual European Symposium on Algorithms
The NUMAchine Multiprocessor

ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing
The Scalable Coherent Interface (SCI)

IEEE Communications Magazine

Algorithms for SMP-Clusters Dense Matrix-Vector Multiplication

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium

Abstract

κNUMA is a new model of parallel computation intended for developing and analysing algorithms for clusters of SMP blocks (symmetric multiprocessing). SMP blocks are shared-memory parallel computers whose few processors have uniform memory access (UMA). The model reflects current architectural trends such as hierarchical interconnection, intranode communication (threads and shared memory), and internode communication (message passing and remote data access). κNUMA is built on top of the widely accepted BSP (bulk-synchronous parallel) model. In this paper, we present an exemplary analysis of the personalized one-to-all broadcast. We show that when optimal BSP algorithms are transferred directly to such clusters, they lack information about the machine and therefore lose performance.
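
As a rough illustration of the baseline the abstract argues against, the sketch below computes the standard flat BSP cost of a one-superstep personalized one-to-all broadcast, where the root sends a distinct message to every other processor. It uses only the usual BSP superstep cost w + h·g + L; it is not the paper's own analysis and does not use the parameters of the new model, which this record does not give. The function names and the example values are assumptions chosen for illustration.

```python
def bsp_superstep_cost(w, h, g, L):
    """Standard BSP cost of one superstep.

    w: maximum local work, h: maximum words sent or received by any
    processor, g: communication cost per word, L: barrier latency.
    """
    return w + h * g + L


def direct_scatter_cost(p, m, g, L):
    """Flat BSP cost of a personalized one-to-all broadcast in one superstep.

    Assumption: the root sends a distinct m-word message to each of the
    other p - 1 processors, so the h-relation is dominated by the root
    with h = (p - 1) * m. Local work is ignored (w = 0).
    """
    h = (p - 1) * m
    return bsp_superstep_cost(0, h, g, L)


# Hypothetical machine parameters, for illustration only.
print(direct_scatter_cost(p=16, m=100, g=4, L=1000))  # prints 7000
```

Note that the flat cost charges the same g for every word, whether or not the destination shares an SMP block with the root. That uniform charge is plausibly the kind of missing information the abstract refers to.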