User-level protocols and their implementations on programmable network interface cards (NICs) have alleviated the communication bottleneck for high-speed interconnects. Most user-level protocols developed so far have been based on single-CPU NICs. However, one of the more popular current-generation Gigabit Ethernet NICs includes two CPUs. This raises the open question of whether the performance of user-level protocols can be improved by taking advantage of a multi-CPU NIC. In this paper, we analyze the intrinsic issues associated with this question and explore different parallelization and pipelining schemes to enhance the performance of EMP, the protocol we previously developed for single-CPU Alteon NICs. Performance evaluation results indicate that parallelizing the receive path of the protocol can deliver 964 Mbps of bandwidth, close to the maximum achievable on Gigabit Ethernet. This scheme also delivers up to an 8% improvement in latency for a range of message sizes. Parallelizing the send path leads to a 17% improvement in bidirectional bandwidth.