Algorithms for DMA communications

Authors:
Alexander P. Kemalov
Affiliations:
-
Venue:
CompSysTech '04 Proceedings of the 5th international conference on Computer systems and technologies
Year:
2004

Abstract

Direct memory access (DMA) is used to transfer data between the main memory of a host computer (PC) and the network, and from there to another PC. This method frees the processor from the burden of transfer operations. DMA procedures are usually initiated by the operating system kernel in order to isolate each application and its data from the others.

In a cluster of PCs, interconnects are getting faster and network overhead and latency are going down, while operating system operations are getting slower. These factors matter a great deal in clusters because of the intensive data transfer between hosts. The trends imply that a DMA operation initiated through the operating system kernel becomes slow compared to the interconnection network. The important aspect here is the ability to transfer data directly between the network interface and application buffers. Such a direct data path requires the network interface to "know" the virtual-to-physical address translation of a user buffer.

This paper proposes several algorithms that allow applications to start a DMA operation without involving the OS kernel. The algorithms give user-level applications direct access to the DMA engine and the TLB library, without requiring changes to the OS kernel. Using our algorithms, a DMA operation can be initiated faster than through the OS kernel.
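
The abstract does not give the algorithms themselves, so the following is only a minimal illustrative sketch of the address-translation requirement it mentions, not the paper's method. It assumes a 4096-byte page size and a hypothetical user-level translation table (a plain dictionary from virtual page number to physical frame number), and shows how a user buffer might be split into per-page (physical address, byte count) pairs that a DMA engine could consume. The names `PAGE_SIZE`, `build_descriptors`, and `table` are invented for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (not stated in the abstract)


def build_descriptors(page_table, vaddr, length):
    """Split a virtual buffer into per-page (physical_address, byte_count) pairs.

    page_table is a hypothetical user-level table mapping a virtual page
    number to a physical frame number; it stands in for whatever
    translation information the network interface would be given.
    """
    descriptors = []
    while length > 0:
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in page_table:
            # Without a translation the buffer cannot be handed to the device.
            raise KeyError(f"no translation for virtual page {vpn}")
        # A single descriptor never crosses a page boundary, because
        # neighbouring virtual pages may map to unrelated physical frames.
        chunk = min(length, PAGE_SIZE - offset)
        descriptors.append((page_table[vpn] * PAGE_SIZE + offset, chunk))
        vaddr += chunk
        length -= chunk
    return descriptors


# Hypothetical table: virtual pages 16 and 17 map to physical frames 7 and 3.
table = {16: 7, 17: 3}

# A 200-byte buffer that starts 96 bytes before the end of virtual page 16.
descs = build_descriptors(table, 16 * PAGE_SIZE + 4000, 200)
print(descs)  # [(32672, 96), (12288, 104)]
```

The buffer is contiguous in virtual memory but ends up as two physically separate segments, which is why a network interface that moves data directly to or from application buffers needs the translation for every page the buffer touches.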