In this paper, we develop LLM, a robust and efficient low-latency message-passing infrastructure for kernel-to-kernel communication. The main goal is to avoid the high latencies that come with managing the conventional TCP/IP protocol stack. LLM provides a transport protocol that is reliable at the fragment level while keeping acknowledgment overhead low, which is appropriate given the high reliability of the LAN. The system uses architectural facilities of the Linux kernel that are designed specifically for optimization in the relevant areas. Reliability against fragment loss is ensured by a low-overhead negative acknowledgment scheme. The implementation takes the form of loadable modules that extend the Linux OS. In a typical deployment on a two-node cluster, each node a uniprocessor 400 MHz Intel Pentium on a 10/100 Mbps LAN, LLM achieved an average round-trip latency of 0.169 ms, compared with 0.531 ms for the ICMP (ping) protocol. A comparison of LLM with other systems is also provided.
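The fragment-level negative acknowledgment idea can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the LLM implementation: the abstract does not describe the wire format, how the receiver detects the end of a message, or how retransmission is scheduled, so the names `Reassembler`, `fragment`, and `transfer` are hypothetical, and the simulation assumes the receiver already knows the fragment count and that retransmitted fragments always arrive.

```python
from dataclasses import dataclass, field


@dataclass
class Reassembler:
    """Hypothetical receiver-side state for one message of `total` fragments."""
    total: int
    fragments: dict = field(default_factory=dict)

    def receive(self, index: int, payload: bytes) -> None:
        # A duplicate fragment simply overwrites the stored copy.
        if 0 <= index < self.total:
            self.fragments[index] = payload

    def missing(self) -> list:
        # Body of a negative acknowledgment: indices not yet received.
        return [i for i in range(self.total) if i not in self.fragments]

    def message(self) -> bytes:
        if self.missing():
            raise ValueError("message incomplete")
        return b"".join(self.fragments[i] for i in range(self.total))


def fragment(data: bytes, size: int) -> list:
    """Split data into fragments of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def transfer(data: bytes, size: int, lost: set) -> tuple:
    """Simulate one message transfer.

    Fragments whose index is in `lost` are dropped on the first pass.
    Assumption: retransmissions are never lost.
    Returns (reassembled message, number of NACKs sent).
    """
    frags = fragment(data, size)
    rx = Reassembler(total=len(frags))
    for i, payload in enumerate(frags):
        if i not in lost:
            rx.receive(i, payload)

    nacks = 0
    while rx.missing():
        nacks += 1  # one NACK lists every missing fragment
        for i in rx.missing():
            rx.receive(i, frags[i])  # sender retransmits only those
    return rx.message(), nacks
```

In this sketch a loss-free transfer generates no acknowledgment traffic at all, and a lossy one costs a single control message listing the gaps. That is the general reason a negative acknowledgment scheme keeps overhead low on a LAN where losses are rare.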