Evaluating scheduling policies for fine-grain communication protocols on a cluster of SMPs
Journal of Parallel and Distributed Computing
Distributed-memory parallel computers and networks of workstations (NOWs) both rely on efficient communication over increasingly high-speed networks. Software communication protocols are often the performance bottleneck. Several current and proposed parallel systems address this problem by dedicating one general-purpose processor in a symmetric multiprocessor (SMP) node specifically to protocol processing. This scheduling convention reduces communication latency and increases effective bandwidth, but it also reduces peak performance, since the dedicated processor no longer performs computation. In this paper, we study a parallel machine with SMP nodes and compare two protocol-processing policies: Fixed, which uses a dedicated protocol processor; and Floating, in which all processors perform both computation and protocol processing. The results from synthetic microbenchmarks and five macrobenchmarks show that: i) a dedicated protocol processor benefits lightweight protocols much more than heavyweight protocols; ii) Fixed outperforms Floating when communication becomes the bottleneck, which is more likely when the application is very communication-intensive, overheads are very high, or there are multiple (i.e., more than two) processors per node; iii) a system with optimal cost-effectiveness is likely to include a dedicated protocol processor, at least for lightweight protocols.
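The trade-off between the two policies can be illustrated with a back-of-the-envelope model. The sketch below is a hypothetical illustration, not the paper's actual simulation methodology: it assumes a node's completion time is bounded by its slowest resource, that under Fixed the dedicated processor overlaps protocol work with computation on the remaining processors, and that under Floating every unit of protocol work also pays a scheduling cost (`switch_cost`, an assumed parameter) for interrupting computation.

```python
def completion_time(policy, n_procs, compute_work, protocol_work, switch_cost=0.0):
    """Estimate node completion time (arbitrary time units).

    Fixed:    one processor is dedicated to protocol handling; the other
              n_procs - 1 processors share the compute work, and protocol
              work proceeds in parallel on the dedicated processor.
    Floating: all n_procs processors share both compute and protocol work,
              but each unit of protocol work incurs an extra fractional
              scheduling/switch cost because it interrupts computation.
    """
    if policy == "fixed":
        compute_time = compute_work / (n_procs - 1)
        protocol_time = protocol_work  # fully overlapped on the dedicated CPU
        return max(compute_time, protocol_time)
    if policy == "floating":
        total = compute_work + protocol_work * (1 + switch_cost)
        return total / n_procs
    raise ValueError(f"unknown policy: {policy}")


# Communication-intensive workload: Fixed wins by hiding protocol work.
print(completion_time("fixed", 4, compute_work=90, protocol_work=30, switch_cost=0.2))     # 30.0
print(completion_time("floating", 4, compute_work=90, protocol_work=30, switch_cost=0.2))  # 31.5

# Computation-dominated workload: Floating wins by using all four processors.
print(completion_time("fixed", 4, compute_work=90, protocol_work=6, switch_cost=0.2))      # 30.0
print(completion_time("floating", 4, compute_work=90, protocol_work=6, switch_cost=0.2))   # 24.3
```

Even this crude model reproduces the qualitative findings: dedicating a processor pays off when protocol work is plentiful relative to computation (or when switch costs are high), while Floating is preferable when the dedicated processor would mostly sit idle.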