The κNUMA model is a new model of parallel computation intended for developing and analysing algorithms for clusters of SMP (symmetric multiprocessing) blocks. An SMP block is a shared-memory parallel computer whose few processors have uniform memory access (UMA). The model captures current architectural trends: hierarchical interconnection, intranode communication (threads and shared memory), and internode communication (message passing and remote data access). κNUMA is built on top of the widely accepted BSP (bulk-synchronous parallel) model. In this paper, we present an exemplary analysis of the personalized one-to-all broadcast. We show that directly transferring algorithms that are optimal in the BSP model leaves information unused and therefore loses performance.
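For orientation, here is a minimal sketch of how a personalized one-to-all broadcast is usually costed in the plain BSP model that κNUMA builds on. It uses only the standard BSP superstep cost w + h·g + l. It is not the κNUMA analysis from the paper, and the function names and the single-superstep "flat" strategy are assumptions made for illustration.

```python
def bsp_superstep_cost(w, h, g, l):
    """Standard BSP cost of one superstep.

    w: maximum local work on any processor
    h: maximum number of words sent or received by any processor
    g: communication gap (cost per word)
    l: barrier synchronisation cost
    """
    return w + h * g + l


def flat_scatter_cost(p, m, g, l):
    """Hypothetical helper: cost of a flat personalized one-to-all broadcast.

    The root sends a distinct m-word message to each of the other p - 1
    processors within a single superstep, so the root dominates the
    h-relation with h = (p - 1) * m. Local work is taken as zero.
    """
    h = (p - 1) * m
    return bsp_superstep_cost(0, h, g, l)


if __name__ == "__main__":
    # Illustrative parameter values only, not measurements.
    print(flat_scatter_cost(p=8, m=4, g=2, l=100))
```

Plain BSP charges the same g for every word, whether the receiver sits in the same SMP block or in a remote one. That uniform charge is the kind of missing information the abstract refers to.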