Predicting execution time of machine learning tasks for scheduling

  • Authors:
  • Rattan Priya;Bruno Feres de Souza;André L. D. Rossi;André C. P. L. F. de Carvalho

  • Affiliations:
  • Computer Science Engineering, Indira Gandhi Institute of Technology, GGSIPU, New Delhi, India;Computer Science Department, Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos-SP, Brazil;Computer Science Department, Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos-SP, Brazil;Computer Science Department, Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos-SP, Brazil

  • Venue:
  • International Journal of Hybrid Intelligent Systems
  • Year:
  • 2013

Abstract

Lately, many academic and industrial fields have shifted their research focus from data acquisition to data analysis. This transition has been facilitated by the use of Machine Learning (ML) techniques to automatically identify patterns and extract non-trivial knowledge from data. The associated experimental procedures are usually complex and computationally demanding. To deal with such a scenario, Distributed Heterogeneous Computing (DHC) systems can be employed. To fully benefit from DHC facilities, a suitable scheduling policy should be applied to decide how to allocate tasks to the available resources. An important step in doing so is to estimate how long an application will take to execute. In this paper, we present an approach for predicting the execution time of ML tasks specifically. It employs a meta-learning framework to relate characteristics of datasets and the current machine state to actual execution time. An empirical study was conducted using 78 publicly available datasets, 6 ML algorithms and 4 meta-regressors. Experimental results show that our approach outperforms a commonly used baseline method. After establishing SVM as the most promising meta-regressor, we employed its predictions to build schedule plans. In a simulation considering a small-scale DHC environment, a simple Genetic Algorithm-based scheduler was employed for task allocation, minimizing the overall completion time. These achievements indicate the potential of meta-learning to tackle the problem and encourage further developments.
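
To make the scheduling step concrete, the following is a minimal sketch of how predicted execution times could drive a Genetic Algorithm-based scheduler. It is not the authors' implementation: the abstract does not specify the encoding, operators or parameters, so the function names (`makespan`, `ga_schedule`), the tournament selection, uniform crossover, elitism and all parameter values are assumptions chosen for illustration. The `times` matrix stands in for the output of a meta-regressor, with `times[t][m]` the predicted execution time of task `t` on machine `m`; the numbers are made up.

```python
import random


def makespan(assignment, times):
    """Overall completion time: the load of the busiest machine."""
    loads = [0.0] * len(times[0])
    for task, machine in enumerate(assignment):
        loads[machine] += times[task][machine]
    return max(loads)


def ga_schedule(times, pop_size=30, generations=200, mutation_rate=0.1, seed=0):
    """Search for a task-to-machine assignment with a small makespan.

    Hypothetical GA design (an assumption, not taken from the paper):
    binary tournament selection, uniform crossover, per-gene random
    mutation, and elitism (the best plan always survives).
    """
    rng = random.Random(seed)
    n_tasks, n_machines = len(times), len(times[0])

    def fitness(plan):
        return makespan(plan, times)

    def pick(population):
        # Binary tournament: the better of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    # Each individual is a list: position = task, value = machine index.
    population = [[rng.randrange(n_machines) for _ in range(n_tasks)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)

    for _ in range(generations):
        children = [best[:]]  # elitism
        while len(children) < pop_size:
            p1, p2 = pick(population), pick(population)
            # Uniform crossover: each gene comes from either parent.
            child = [g1 if rng.random() < 0.5 else g2
                     for g1, g2 in zip(p1, p2)]
            # Mutation: occasionally move a task to a random machine.
            child = [rng.randrange(n_machines)
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = min(population, key=fitness)

    return best, fitness(best)


# Illustrative predicted times (made-up values): 3 tasks, 2 machines.
times = [[2.0, 4.0],
         [3.0, 1.0],
         [5.0, 5.0]]
plan, span = ga_schedule(times)
print("assignment:", plan, "makespan:", span)
```

Because the fitness is computed entirely from predicted times, the quality of the resulting plan depends directly on the accuracy of the execution-time predictor, which is why the prediction step matters for scheduling.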