Distributed computing with fibre channel fabric
COMPCON '92 Proceedings of the thirty-seventh international conference on COMPCON
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
TARA: Topology-Aware Resource Adaptation to Alleviate Congestion in Sensor Networks
IEEE Transactions on Parallel and Distributed Systems
On Strategies for Dynamic Resource Management in Virtualized Server Environments
MASCOTS '07 Proceedings of the 2007 15th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems
Entropy: a consolidation manager for clusters
Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Tashi: location-aware cluster management
ACDC '09 Proceedings of the 1st workshop on Automated control for datacenters and clouds
Autonomic virtual resource management for service hosting platforms
CLOUD '09 Proceedings of the 2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing
Quincy: fair scheduling for distributed computing clusters
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling
Proceedings of the 5th European conference on Computer systems
Improving the scalability of data center networks with traffic-aware virtual machine placement
INFOCOM'10 Proceedings of the 29th conference on Information communications
Cloud analytics: do we really need to reinvent the storage stack?
HotCloud'09 Proceedings of the 2009 conference on Hot topics in cloud computing
Improving MapReduce performance in heterogeneous environments
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Reining in the outliers in map-reduce clusters using Mantri
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Purlieus: locality-aware resource allocation for MapReduce in a cloud
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Automatic software deployment in the azure cloud
DAIS'10 Proceedings of the 10th IFIP WG 6.1 international conference on Distributed Applications and Interoperable Systems
GPFS-SNC: an enterprise storage framework for virtual-machine clouds
IBM Journal of Research and Development
Interference and locality-aware task scheduling for MapReduce applications in virtual clusters
Proceedings of the 22nd international symposium on High-performance parallel and distributed computing
Hi-index | 0.00 |
MapReduce has emerged as a prevailing distributed computation paradigm for enterprise and large-scale data-intensive computing. The model is also increasingly used in the massively-parallel cloud environment, where MapReduce jobs are run on a set of virtual machines (VMs) on pay-as-needed basis. However, MapReduce jobs suffer from performance degradation when running in the cloud due to inefficient resource allocation. In particular, the MapReduce model is designed for and leverages information from the native clusters to operate efficiently, whereas the cloud presents a virtual cluster topology overlying or hiding actual network information. This results in two placement anomalies: loss of data locality and loss of job locality, where jobs are placed physically away from their data or other associated jobs, adversely affecting their performance. In this paper we propose, CAM, a cloud platform that provides an innovative resource scheduler particularly designed for hosting MapReduce applications in the cloud. CAM reconciles both data and VM resource allocation with a variety of competing constraints, such as storage utilization, changing CPU load and network link capacities. CAM uses a flow-network-based algorithm that is able to optimize MapReduce performance under the specified constraints -- not only by initial placement, but by readjusting through VM and data migration as well. Additionally, our platform exposes, otherwise hidden, lower-level topology information to the MapReduce job scheduler so that it makes optimal task assignments. Evaluation of CAM using both micro-benchmarks and simulations on a 23 VM cluster shows that compared to a state-of-the-art resource allocator, our system reduces network traffic and average MapReduce job execution time by a factor of 3 and 8.6, respectively.