In this paper, we evaluate the overheads of virtualization on commercial multicore architectures using shared memory and MPI-based applications. We find that non-uniform memory latencies significantly affect the performance of virtualized systems. Because the Xen hypervisor lacks support for non-uniform memory access (NUMA), shared memory applications suffer significant performance degradation when virtualized. MPI-based applications are more resilient to sub-optimal NUMA memory allocation and virtual machine (VM) scheduling. However, running a single instance of an MPI application across multiple VMs on one physical system may adversely affect overall performance, because it increases I/O operations through the domain 0 VM. As the number of cores on a chip increases, the cache hierarchy and external memory will become more asymmetric. As this non-uniformity in memory systems grows, NUMA and cache awareness in VM scheduling will become critical for shared memory applications.