Lazy release consistency for software distributed shared memory
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Active messages: a mechanism for integrated communication and computation
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The network architecture of the Connection Machine CM-5 (extended abstract)
SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Implementing network protocols at user level
SIGCOMM '93 Conference proceedings on Communications architectures, protocols and applications
Experiences with a high-speed network adaptor: a software perspective
SIGCOMM '94 Proceedings of the conference on Communications architectures, protocols and applications
Software versus hardware shared-memory implementation: a case study
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Virtual memory mapped network interface for the SHRIMP multicomputer
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The Stanford FLASH multiprocessor
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Tempest and typhoon: user-level shared memory
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Memory consistency and event ordering in scalable shared-memory multiprocessors
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
The directory-based cache coherence protocol for the DASH multiprocessor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
SPLASH: Stanford parallel applications for shared-memory
SPLASH: Stanford parallel applications for shared-memory
Hi-index | 0.00 |
Networks of workstations provide an economic solution for scalable computing because they do not require specialized components. Even though recent advances have shown that it is possible to obtain high bandwidth between applications, interconnect latency remains a serious concern. In this paper we present CNI, a cluster network interface that not only provides both low latency and high bandwidth but also efficiently supports multiple programming paradigms. This is done by functionally coupling the network adaptor board more closely to the CPU without changing the standard workstation architecture. CNI results in performance gains for applications, substantially reducing communication overhead and delay.