Despite improvements in network interfaces and software messaging layers, software communication overhead still dominates the hardware routing cost in most systems. In this study, we identify the sources of this overhead by analyzing software costs of typical communication protocols built atop the active messages layer on the CM-5. We show that up to 50–70% of the software messaging costs are a direct consequence of the gap between specific network features such as arbitrary delivery order, finite buffering, and limited fault-handling, and the user communication requirements of in-order delivery, end-to-end flow control, and reliable transmission. However, virtually all of these costs can be eliminated if routing networks provide higher-level services such as in-order delivery, end-to-end flow control, and packet-level fault-tolerance. We conclude that significant cost reductions require changing the constraints on messaging layers: we propose designing networks and network interfaces which simplify or replace software for implementing user communication requirements.