Optimizing the stretch of independent tasks on a cluster: From sequential tasks to moldable tasks

  • Authors:
  • Erik Saule;Doruk Bozdağ;Ümit V. Çatalyürek

  • Affiliations:
  • Department of Biomedical Informatics, The Ohio State University, Columbus OH, USA;Department of Biomedical Informatics, The Ohio State University, Columbus OH, USA;Department of Biomedical Informatics, The Ohio State University, Columbus OH, USA and Department of Electrical and Computer Engineering, The Ohio State University, Columbus OH, USA

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2012

Abstract

This paper addresses the problem of scheduling non-preemptive moldable tasks to minimize the stretch of the tasks in an online, non-clairvoyant setting. To the best of the authors' knowledge, this problem has never been studied before. To tackle it, the sequential subproblem is first studied through the lens of approximation theory. An algorithm called DASEDF is proposed and, through simulations, is shown to outperform the first-come, first-served scheme. Furthermore, it is observed that machine availability is the key to obtaining good stretch values. The moldable task scheduling problem is then considered and, by leveraging the results from the sequential case, another algorithm, DBOS, is proposed to optimize the stretch while scheduling moldable tasks. This work is motivated by a task scheduling problem in the context of parallel short sequence mapping, which has important applications in biology and genetics. The proposed DBOS algorithm is evaluated both on synthetic data sets that represent short sequence mapping requests and on data sets generated from log files of real production clusters. The results show that the DBOS algorithm significantly outperforms the two state-of-the-art task scheduling algorithms on stretch optimization.
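
As a rough illustration of the metric only, the sketch below computes per-task stretch for sequential tasks under the first-come, first-served baseline on identical machines. It assumes the common definition of stretch, (completion time − release time) / processing time, which the abstract itself does not spell out. The function name `fcfs_stretches` and the sample tasks are hypothetical, and the code implements neither DASEDF nor DBOS.

```python
import heapq

def fcfs_stretches(tasks, machines):
    """Return the stretch of each task under non-preemptive FCFS.

    tasks: list of (release_time, processing_time) pairs (hypothetical input format).
    machines: number of identical machines.
    Stretch is assumed to be (completion - release) / processing_time.
    """
    # Min-heap holding the time at which each machine next becomes free.
    free_at = [0.0] * machines
    heapq.heapify(free_at)
    stretches = []
    # FCFS: serve tasks in order of release time on the earliest free machine.
    for release, proc in sorted(tasks):
        start = max(release, heapq.heappop(free_at))
        finish = start + proc
        heapq.heappush(free_at, finish)
        stretches.append((finish - release) / proc)
    return stretches

# One long task followed by two short ones (made-up numbers).
tasks = [(0, 10), (1, 1), (2, 1)]
print(fcfs_stretches(tasks, machines=1))  # short tasks wait behind the long one
print(fcfs_stretches(tasks, machines=2))  # a second machine removes the waiting
```

In this toy instance the short tasks see a stretch of 10 on one machine and 1 on two machines, which is consistent with the abstract's remark that machine availability drives stretch, though it is only a hand-picked example.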