A sampling-based method for dynamic scheduling in distributed data mining environment

  • Authors: Jifang Li

  • Affiliations: Computer Science and Information Technology College, Zhejiang Wanli University, P.R. China

  • Venue: WSEAS Transactions on Computers
  • Year: 2009

Abstract

In this paper, we propose a new solution for dynamic task scheduling in a distributed environment. The key difficulty in scheduling tasks is that the execution time of irregular computations cannot be obtained in advance. For this reason, we propose a method based on sampling, applied to several typical data mining algorithms. We argue that a functional relationship exists among execution time, data size, and the algorithm, so the execution time of a data mining task can be deduced from the corresponding data size and algorithm. The experimental results show that almost all of the algorithms exhibit quasi-linear scalability, although the slope differs from one algorithm to another. We adopt this sampling method to schedule tasks in a distributed data mining environment. The experimental results also show that the sampling method is applicable to task scheduling in a dynamic environment and can be adopted to obtain better results.
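
The idea in the abstract can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes that each algorithm has been timed on a few small samples, that execution time is roughly linear in data size (as the abstract reports), and that tasks are then assigned by a simple greedy rule (longest predicted task first, to the least-loaded worker). The greedy rule, the function names `fit_line`, `predict_time`, and `schedule`, and the task tuple layout are all assumptions made for illustration; the abstract does not specify the scheduling policy.

```python
from typing import Dict, List, Sequence, Tuple

Model = Tuple[float, float]      # (slope, intercept) of time ~ slope * size + intercept
Task = Tuple[str, str, float]    # (task name, algorithm name, data size); hypothetical layout


def fit_line(sizes: Sequence[float], times: Sequence[float]) -> Model:
    """Ordinary least-squares fit of execution time against sample size."""
    n = len(sizes)
    if n < 2 or n != len(times):
        raise ValueError("need at least two (size, time) samples of equal length")
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("sample sizes must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_time(model: Model, size: float) -> float:
    """Extrapolate the fitted line to the full data size."""
    slope, intercept = model
    return slope * size + intercept


def schedule(tasks: Sequence[Task], models: Dict[str, Model],
             n_workers: int) -> Tuple[List[List[str]], List[float]]:
    """Assumed policy: longest predicted task first, onto the least-loaded worker."""
    loads = [0.0] * n_workers
    plan: List[List[str]] = [[] for _ in range(n_workers)]
    estimates = [(predict_time(models[algo], size), name)
                 for name, algo, size in tasks]
    for estimate, name in sorted(estimates, reverse=True):
        worker = loads.index(min(loads))   # least-loaded worker so far
        plan[worker].append(name)
        loads[worker] += estimate
    return plan, loads


# Made-up sample timings (seconds) for two placeholder algorithms.
samples = {
    "algo_a": ([1000, 2000, 4000], [2.0, 4.0, 8.0]),
    "algo_b": ([1000, 2000, 4000], [0.5, 1.0, 2.0]),
}
models = {algo: fit_line(sizes, times) for algo, (sizes, times) in samples.items()}
tasks = [("t1", "algo_a", 10000), ("t2", "algo_b", 20000),
         ("t3", "algo_b", 8000), ("t4", "algo_a", 3000)]
plan, loads = schedule(tasks, models, n_workers=2)
```

Because each algorithm gets its own fitted slope, the same data size can yield very different predicted times, which is why the model is keyed by algorithm. In a dynamic setting the fit could be refreshed as new timings arrive, though that extension is not shown here.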