The InfiniBand (IB) System Area Network (SAN) enables applications to access hardware directly from user level, reducing the overhead of user-kernel crossings during data transfer. However, distributed applications that exhibit close coupling between network and OS services may benefit from accessing IB from the kernel through IB's native Verbs interface, which permits tight integration of these services. We assess this approach using a sequential-consistency Distributed Shared Memory (DSM) system as an example. We first develop primitives that abstract the low-level communication and kernel details and efficiently serve the application's communication, memory, and scheduling needs. Next, we combine the primitives to form a kernel DSM protocol. We evaluate the approach using our full-fledged Linux kernel DSM implementation over InfiniBand. We show that overheads are reduced substantially and that overall application performance improves, in both absolute execution time and scalability, relative to an entirely user-level implementation.