In this paper we present pvFPGA, the first system design solution for virtualizing an FPGA-based hardware accelerator on the x86 platform. Our design uses the Xen virtual machine monitor (VMM) to build a paravirtualized environment and a Xilinx Virtex-6 as the FPGA accelerator, which communicates with the x86 server via PCI Express (PCIe). Unlike recent accelerator virtualization solutions, which primarily intercept API calls and redirect them to the user space of the hosted or privileged domain, pvFPGA virtualizes the FPGA accelerator directly at the lower device driver level, which yields higher efficiency and lower overhead. In pvFPGA, each unprivileged domain allocates a shared data pool that serves both user-kernel and inter-domain data transfer. In addition, we propose a new component, the coprovisor, which enables multiple domains to access the FPGA accelerator simultaneously. Experimental results show that 1) pvFPGA achieves close-to-zero overhead compared to accessing the FPGA accelerator without the VMM layer, 2) the FPGA accelerator is successfully shared by multiple domains, and 3) different maximum data transfer bandwidths can be assigned to different domains by regulating the size of the shared data pool at split driver loading time.
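The link between pool size and maximum bandwidth can be illustrated with a toy model. The sketch below is not the pvFPGA implementation: the `Domain` class, the `run_coprovisor` function, the round-robin policy, and the rule that each turn moves at most one pool-full of data are all assumptions made for illustration. Under those assumptions, a domain whose pool is twice as large moves twice as much data in the same number of turns.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Domain:
    """Hypothetical model of one unprivileged domain."""
    name: str
    pool_size: int        # bytes in the shared data pool, fixed at driver load time
    pending: int          # bytes the domain still wants the accelerator to process
    transferred: int = 0  # bytes moved to the accelerator so far


def run_coprovisor(domains, turns):
    """Serve domains round-robin; each turn moves at most one pool-full of data.

    Round-robin is an assumption of this sketch, not the scheduler of the paper.
    """
    queue = deque(d for d in domains if d.pending > 0)
    for _ in range(turns):
        if not queue:
            break
        dom = queue.popleft()
        chunk = min(dom.pool_size, dom.pending)  # one request is capped by the pool
        dom.pending -= chunk
        dom.transferred += chunk
        if dom.pending > 0:
            queue.append(dom)  # the domain rejoins the queue for its next turn
    return {d.name: d.transferred for d in domains}


doms = [
    Domain("domU1", pool_size=4096, pending=1 << 20),
    Domain("domU2", pool_size=8192, pending=1 << 20),
]
result = run_coprovisor(doms, turns=100)
print(result)
```

With 100 turns split evenly between the two domains, `domU2` transfers twice the bytes of `domU1` only because its pool is twice as large. A real scheduler would also have to account for PCIe transfer time and accelerator occupancy, which this model ignores.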