A high-speed network interface for distributed-memory systems: architecture and applications

Authors:
Peter Steenkiste
Affiliations:
Carnegie Mellon Univ., Pittsburgh, PA
Venue:
ACM Transactions on Computer Systems (TOCS)
Year:
1997

Abstract

Distributed-memory systems have traditionally had great difficulty performing network I/O at rates proportional to their computational power. The problem is that the network interface has to support network I/O for a supercomputer while using computational and memory bandwidth resources similar to those of a workstation. As a result, the network interface becomes a bottleneck. In this article we present an I/O architecture that addresses these problems and supports high-speed network I/O on distributed-memory systems. The key to good performance is to partition the work appropriately between the system and the network interface. Some communication tasks are performed on the distributed-memory parallel system, since it is more powerful and less likely to become a bottleneck than the network interface. Tasks that do not parallelize well are performed on the network interface, and hardware support is provided for the most time-critical operations. This architecture has been implemented for the iWarp distributed-memory system and has been used by a number of applications. We describe this implementation, present performance results, and use application examples to validate the main features of the I/O architecture.