Evaluation of multi-core scheduling mechanisms for heterogeneous processing architectures

  • Authors:
  • Håkon Kvale Stensland;Carsten Griwodz;Pål Halvorsen

  • Affiliations:
  • Simula Research Laboratory, Norway;Simula Research Laboratory, Norway and University of Oslo, Norway;Simula Research Laboratory, Norway and University of Oslo, Norway

  • Venue:
  • Proceedings of the 18th International Workshop on Network and Operating Systems Support for Digital Audio and Video
  • Year:
  • 2008

Abstract

General-purpose CPUs with multiple cores are established products, and new heterogeneous technology like the Cell Broadband Engine and general-purpose GPUs brings an even higher degree of true multi-processing into the market. However, the means for utilizing this processing power are immature. Current tools typically assume that exclusive use of these resources is sufficient, but this assumption will soon be invalid because of the interest in using their processing power for general-purpose tasks. Among the applications that can benefit from such technology is transcoding support for distributed media applications, where remote participants join and leave dynamically. Transcoding consists of several clearly separated processing operations that consume a lot of resources, such that individual processing units are unable to handle all operations of a session of arbitrary size. The individual operations can then be distributed over several processing units, and data must be moved between them according to the dependencies between operations. Many multi-processor scheduling approaches exist, but to the best of our knowledge, it is still a challenge to find mechanisms that can schedule dynamic workloads of communicating operations while taking both the processing and communication requirements into account. For such applications, we believe that feasible scheduling can be performed at two levels, i.e., divided into the task of placing a job onto a processing unit and the task of multitasking time-slices within a single processing unit. We have implemented some simple high-level scheduling mechanisms and simulated a video conferencing scenario running on topologies inspired by existing systems from Intel, AMD, IBM and nVidia. Our results show the importance of using an efficient high-level scheduler.
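To make the high-level placement task concrete, the following is a minimal sketch of one possible placement rule. It is not the mechanism evaluated in the paper. All names (`Unit`, `place_job`, `link_cost`), the capacity model and the greedy cost rule are assumptions made for illustration: a job is placed on the unit that still has spare capacity and adds the least communication cost toward the units hosting the operations it exchanges data with.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """A processing unit (hypothetical model, arbitrary capacity units)."""
    name: str
    capacity: float
    load: float = 0.0  # capacity already reserved by placed jobs


def place_job(job_cost, peers, units, link_cost):
    """Greedy high-level placement (illustrative assumption, not the paper's).

    job_cost:  processing demand of the job
    peers:     dict unit name -> data volume exchanged with jobs placed there
    units:     list of Unit objects, scanned in order
    link_cost: dict (unit_a, unit_b) -> cost per unit of data moved
    Returns the chosen unit's name, or None if no unit has room.
    """
    best, best_cost = None, float("inf")
    for unit in units:
        # Processing requirement: skip units that cannot host the job.
        if unit.load + job_cost > unit.capacity:
            continue
        # Communication requirement: data moved to peers on other units.
        comm = sum(
            volume * link_cost.get((unit.name, other), 0.0)
            for other, volume in peers.items()
            if other != unit.name
        )
        if comm < best_cost:
            best, best_cost = unit, comm
    if best is None:
        return None
    best.load += job_cost  # reserve capacity on the chosen unit
    return best.name
```

Under these assumptions, a job that communicates heavily with an operation on a nearly full unit is pushed to a neighbouring unit, and later operations of the same chain tend to follow it there. The low-level task of time-slicing within each unit is deliberately left out of this sketch.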