Existing work on MapReduce task scheduling with deadline constraints accounts for neither the differences between Map and Reduce tasks nor the heterogeneity of the cluster. This paper proposes MTSD, an extended MapReduce Task Scheduling algorithm for Deadline constraints on the Hadoop platform. It allows a user to specify a job's deadline and tries to finish the job before that deadline. MTSD includes a node classification algorithm that measures each node's computing capacity and classifies the nodes of a heterogeneous cluster into several levels. Based on this classification, we first present a novel data distribution model that distributes data according to each node's capacity level. Experiments show that the node classification algorithm noticeably improves data locality compared with the default scheduler, and that it can also improve the locality of other schedulers. Second, we calculate the average task completion time per node level, which improves the precision of the estimate of a task's remaining time. Finally, MTSD provides a mechanism that decides which job's tasks should be scheduled, by calculating each job's Map and Reduce task slot requirements.
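As a rough illustration of the first two ideas (classifying nodes by capacity and distributing data by level), the following Python sketch may help. The abstract does not give the actual classification rule or distribution formula, so everything here is an assumption made for illustration: the function names `classify_nodes` and `distribute_blocks`, the equal-width binning of capacity scores, and the choice to give each node a share of blocks proportional to its level.

```python
from typing import Dict


def classify_nodes(capacity: Dict[str, float], num_levels: int) -> Dict[str, int]:
    """Assign each node a level in 1..num_levels.

    Assumption: equal-width binning over the measured capacity range.
    The paper's real classification rule is not stated in the abstract.
    """
    lo, hi = min(capacity.values()), max(capacity.values())
    # Fall back to a width of 1.0 when all nodes have the same capacity.
    width = (hi - lo) / num_levels or 1.0
    levels = {}
    for node, cap in capacity.items():
        level = int((cap - lo) / width) + 1
        levels[node] = min(level, num_levels)  # the fastest node lands in the top level
    return levels


def distribute_blocks(levels: Dict[str, int], num_blocks: int) -> Dict[str, int]:
    """Split num_blocks across nodes in proportion to their level.

    Assumption: share is proportional to the level number; leftover
    blocks go to the nodes with the largest fractional remainders.
    """
    total = sum(levels.values())
    exact = {n: num_blocks * lv / total for n, lv in levels.items()}
    shares = {n: int(x) for n, x in exact.items()}
    leftover = num_blocks - sum(shares.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - shares[n], reverse=True)
    for n in by_remainder[:leftover]:
        shares[n] += 1
    return shares


if __name__ == "__main__":
    # Hypothetical capacity scores, e.g. from a benchmark run on each node.
    measured = {"node-a": 1.0, "node-b": 2.0, "node-c": 4.0}
    node_levels = classify_nodes(measured, num_levels=3)
    print(node_levels)
    print(distribute_blocks(node_levels, num_blocks=12))
```

Placing more blocks on higher-level nodes means faster nodes are more likely to find their next input split locally, which is one plausible reading of why the data distribution model improves locality.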