Computers and Intractability; A Guide to the Theory of NP-Completeness
Computers and Intractability; A Guide to the Theory of NP-Completeness
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
A scalable, commodity data center network architecture
Proceedings of the ACM SIGCOMM 2008 conference on Data communication
Dcell: a scalable and fault-tolerant network structure for data centers
Proceedings of the ACM SIGCOMM 2008 conference on Data communication
Server-storage virtualization: integration and load balancing in data centers
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
pMapper: power and migration cost aware application placement in virtualized systems
Proceedings of the 9th ACM/IFIP/USENIX International Conference on Middleware
Tashi: location-aware cluster management
ACDC '09 Proceedings of the 1st workshop on Automated control for datacenters and clouds
MapReduce optimization using regulated dynamic prioritization
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
VL2: a scalable and flexible data center network
Proceedings of the ACM SIGCOMM 2009 conference on Data communication
Safe and effective fine-grained TCP retransmissions for datacenter communication
Proceedings of the ACM SIGCOMM 2009 conference on Data communication
Understanding TCP incast throughput collapse in datacenter networks
Proceedings of the 1st ACM workshop on Research on enterprise networking
Quincy: fair scheduling for distributed computing clusters
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Towards automatic optimization of MapReduce programs
Proceedings of the 1st ACM symposium on Cloud computing
Topology-aware resource allocation for data-intensive workloads
Proceedings of the first ACM asia-pacific workshop on Workshop on systems
Towards optimizing hadoop provisioning in the cloud
HotCloud'09 Proceedings of the 2009 conference on Hot topics in cloud computing
Hedera: dynamic flow scheduling for data center networks
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
Improving MapReduce performance in heterogeneous environments
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Reining in the outliers in map-reduce clusters using Mantri
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
MapReduce in the Clouds for Science
CLOUDCOM '10 Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science
Black-box and gray-box strategies for virtual machine migration
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Investigation of data locality and fairness in MapReduce
Proceedings of third international workshop on MapReduce and its Applications Date
Locality-aware dynamic VM reconfiguration on MapReduce clouds
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
Investigation of Data Locality in MapReduce
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Opening up black box networks with CloudTalk
HotCloud'12 Proceedings of the 4th USENIX conference on Hot Topics in Cloud Ccomputing
Hierarchical merge for scalable MapReduce
Proceedings of the 2012 workshop on Management of big data systems
Minimizing Cost of Virtual Machines for Deadline-Constrained MapReduce Applications in the Cloud
GRID '12 Proceedings of the 2012 ACM/IEEE 13th International Conference on Grid Computing
ClouDiA: a deployment advisor for public clouds
Proceedings of the VLDB Endowment
Interference and locality-aware task scheduling for MapReduce applications in virtual clusters
Proceedings of the 22nd international symposium on High-performance parallel and distributed computing
Job scheduling for optimizing data locality in Hadoop clusters
Proceedings of the 20th European MPI Users' Group Meeting
Choreo: network-aware task placement for cloud applications
Proceedings of the 2013 conference on Internet measurement conference
Boosting energy efficiency with mirrored data block replication policy and energy scheduler
ACM SIGOPS Operating Systems Review
Joint optimization of overlapping phases in MapReduce
Performance Evaluation
Hi-index | 0.00 |
We present Purlieus, a MapReduce resource allocation system aimed at enhancing the performance of MapReduce jobs in the cloud. Purlieus provisions virtual MapReduce clusters in a locality-aware manner enabling MapReduce virtual machines (VMs) access to input data and importantly, intermediate data from local or close-by physical machines. We demonstrate how this locality-awareness during both map and reduce phases of the job not only improves runtime performance of individual jobs but also has an additional advantage of reducing network traffic generated in the cloud data center. This is accomplished using a novel coupling of, otherwise independent, data and VM placement steps. We conduct a detailed evaluation of Purlieus and demonstrate significant savings in network traffic and almost 50% reduction in job execution times for a variety of workloads.