Predicting execution time of machine learning tasks for scheduling

Authors:
Rattan Priya;Bruno Feres de Souza;André L. D. Rossi;André C. P. L. F. de Carvalho
Affiliations:
Computer Science Engineering, Indira Gandhi Institute of Technology, GGSIPU, New Delhi, India;Computer Science Department, Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos-SP, Brazil;Computer Science Department, Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos-SP, Brazil;Computer Science Department, Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos-SP, Brazil
Venue:
International Journal of Hybrid Intelligent Systems
Year:
2013

Abstract

Lately, many academic and industrial fields have shifted their research focus from data acquisition to data analysis. This transition has been facilitated by the use of Machine Learning (ML) techniques to automatically identify patterns and extract non-trivial knowledge from data. The associated experimental procedures are usually complex and computationally demanding. To deal with such scenarios, Distributed Heterogeneous Computing (DHC) systems can be employed. To fully benefit from DHC facilities, a suitable scheduling policy should be applied to decide how to allocate tasks to the available resources. An important step toward this is to estimate how long an application will take to execute. In this paper, we present an approach for predicting the execution time specifically of ML tasks. It employs a meta-learning framework to relate characteristics of datasets and the current machine state to actual execution time. An empirical study was conducted using 78 publicly available datasets, 6 ML algorithms and 4 meta-regressors. Experimental results show that our approach outperforms a commonly used baseline method. After establishing SVM as the most promising meta-regressor, we employed its predictions to build schedule plans. In a simulation considering a small-scale DHC environment, a simple Genetic Algorithm-based scheduler was employed for task allocation, leading to a minimized overall completion time. These achievements indicate the potential of meta-learning to tackle the problem and encourage further developments.
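
As an illustration of the final step described in the abstract, the following minimal sketch shows how predicted execution times could feed a simple genetic-algorithm scheduler that assigns tasks to heterogeneous machines and tries to reduce the overall completion time (makespan). It is not the authors' scheduler. The predicted times, the machine speeds, the chromosome encoding, and all GA parameters (`pop_size`, `generations`, `mutation_rate`) are assumptions chosen only for the example, and the names `makespan` and `ga_schedule` are hypothetical.

```python
import random


def makespan(assignment, predicted, speeds):
    """Completion time of the busiest machine for a task-to-machine assignment."""
    load = [0.0] * len(speeds)
    for task, machine in enumerate(assignment):
        # Assumption: run time scales inversely with a machine's relative speed.
        load[machine] += predicted[task] / speeds[machine]
    return max(load)


def ga_schedule(predicted, speeds, pop_size=30, generations=200,
                mutation_rate=0.1, seed=0):
    """Search for a low-makespan assignment with a simple elitist GA."""
    rng = random.Random(seed)
    n_tasks, n_machines = len(predicted), len(speeds)

    def fitness(assignment):
        return makespan(assignment, predicted, speeds)

    # Chromosome: one machine index per task.
    population = [[rng.randrange(n_machines) for _ in range(n_tasks)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[:pop_size // 2]  # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_tasks)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            for i in range(n_tasks):  # per-gene random-reset mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_machines)
            children.append(child)
        population = survivors + children
    best = min(population, key=fitness)
    return best, fitness(best)


# Hypothetical predicted execution times (seconds) for eight ML tasks,
# standing in for the output of a meta-regressor.
predicted = [120.0, 45.0, 300.0, 80.0, 15.0, 200.0, 60.0, 95.0]
# Hypothetical relative speeds of three heterogeneous machines.
speeds = [1.0, 1.5, 2.0]

best, best_makespan = ga_schedule(predicted, speeds)
print(best, round(best_makespan, 1))
```

The quality of such a schedule depends directly on the quality of the time estimates: the scheduler only ever sees the predicted values, so systematic prediction errors translate into unbalanced machine loads.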