GPUfs: Integrating a file system with GPUs

Authors:
Mark Silberstein;Bryan Ford;Idit Keidar;Emmett Witchel
Affiliations:
University of Texas at Austin;Yale University;Technion;University of Texas at Austin
Venue:
ACM Transactions on Computer Systems (TOCS)
Year:
2014

Abstract

As GPU hardware becomes increasingly general-purpose, it is quickly outgrowing the traditional, constrained GPU-as-coprocessor programming model. This article advocates extending standard operating system services and abstractions to GPUs in order to facilitate program development and enable harmonious integration of GPUs in computing systems. As an example, we describe the design and implementation of GPUfs, a software layer that provides operating system support for accessing host files directly from GPU programs. GPUfs provides a POSIX-like API, exploits GPU parallelism for efficiency, and optimizes GPU file access by extending the host CPU's buffer cache into GPU memory. Our experiments, based on a set of real benchmarks adapted to use our file system, demonstrate the feasibility and benefits of the GPUfs approach. For example, a self-contained GPU program that searches for a set of strings throughout the Linux kernel source tree runs over seven times faster than on an eight-core CPU.
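
The idea of extending a buffer cache so that repeated file accesses avoid a round trip to the host can be illustrated with a toy sketch. The code below is not the GPUfs API and does not run on a GPU. It is a CPU-only Python illustration of a page-granular read cache placed in front of ordinary file reads. The names `PageCache`, `PAGE_SIZE`, `host_reads`, and `demo` are hypothetical and chosen only for this example, and the page size is an assumption.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size for illustration only


class PageCache:
    """Toy read-only page cache in front of host file reads (hypothetical)."""

    def __init__(self):
        self.pages = {}      # (path, page_index) -> bytes of that page
        self.host_reads = 0  # how many pages had to be fetched from the file

    def _load_page(self, path, index):
        key = (path, index)
        if key not in self.pages:
            # Cache miss: fetch one page from the underlying file.
            with open(path, "rb") as f:
                f.seek(index * PAGE_SIZE)
                self.pages[key] = f.read(PAGE_SIZE)
            self.host_reads += 1
        return self.pages[key]

    def read(self, path, offset, size):
        """Return up to `size` bytes starting at `offset`, pread-style."""
        out = bytearray()
        while size > 0:
            index, start = divmod(offset, PAGE_SIZE)
            chunk = self._load_page(path, index)[start:start + size]
            if not chunk:  # reached end of file
                break
            out += chunk
            offset += len(chunk)
            size -= len(chunk)
        return bytes(out)


def demo():
    """Read the same range twice; the second read is served from the cache."""
    data = bytes(range(256)) * 40  # 10240 bytes of sample content
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(data)
        path = f.name
    try:
        cache = PageCache()
        first = cache.read(path, 4000, 200)   # spans two pages
        second = cache.read(path, 4000, 200)  # both pages already cached
        return first == data[4000:4200], second == first, cache.host_reads
    finally:
        os.remove(path)
```

In this sketch the first read touches two pages and fetches both from the file, while the second read of the same range fetches nothing. A real system of the kind the abstract describes would also have to handle writes, consistency with the host cache, and concurrent access from many GPU threads, none of which is modeled here.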