Virtualization provides many benefits for data-intensive workloads, including security, performance isolation, and ease of management and configuration. Unfortunately, current VM technology prevents applications from taking advantage of sharing opportunities, resulting in substantial network traffic and application slowdown. Octopus is a new framework for running data-intensive applications on virtualized datacenters. It provides efficient file sharing across VMs running on the same physical host and optimizes the placement of VMs in the cluster to maximize sharing opportunities. Our experiments with a suite of bioinformatics and natural language processing applications show that Octopus reduces network transfer by up to 83% and total runtime by up to 55%.