Modern interconnects such as Myrinet and Gigabit Ethernet offer Gb/s speeds, which puts the onus of reducing communication latency on the messaging software. This has led to the development of OS-bypass protocols, which remove the kernel from the critical path and thereby reduce end-to-end latency. With the advent of programmable NICs, many aspects of protocol processing can be offloaded from user space to the NIC, leaving the host processor free to dedicate more cycles to the application. Many host-offload messaging systems exist for Myrinet; however, nothing similar exists for Gigabit Ethernet. In this paper we propose Ethernet Message Passing (EMP), a completely new zero-copy, OS-bypass messaging layer for Gigabit Ethernet on Alteon NICs, in which the entire protocol processing is done at the NIC. This messaging system achieves a latency of 23 µs and a throughput of 880 Mb/s. To the best of our knowledge, this is the first NIC-level implementation of a zero-copy message-passing layer for Gigabit Ethernet.