It is well recognized that moving I/O data into and out of memory has become a critical cost for high-bandwidth devices. Embedded systems in particular, with their limited cache sizes and simple architectures, spend a large share of CPU cycles on off-chip memory accesses. The work presented in this paper addresses this problem through an Affinity-aware DMA Buffer management strategy, called ADB, that requires no change to the underlying hardware. We introduce the concept of buffer affinity, which describes where the data of a recently released DMA buffer resides in the memory hierarchy: the more of its data that remains in cache, the higher the buffer's affinity. Exploiting the characteristics of embedded systems, we identify buffer affinity at runtime. Using this online profiling, ADB allocates buffers of appropriate affinity. For output, ADB allocates a high-affinity buffer to reduce off-chip memory accesses when the OS copies data from the user buffer to the kernel buffer. For input, ADB allocates a low-affinity buffer to skip part of the cache invalidation operations needed to maintain I/O coherence. Measurements show that ADB, implemented in the Linux 2.6.32 kernel and running on a 1 GHz UniCore-2 processor, improves the performance of network-related programs by 5.2% to 8.8%.