A scheduling framework for large-scale, parallel, and topology-aware applications

Valentin Kravtsov, Pavel Bar, David Carmeli, Assaf Schuster, Martin Swain

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

Scheduling of large-scale, distributed topology-aware applications requires that not only the properties of the requested machines be considered, but also the properties of the machines’ interconnections. This requirement severely complicates the scheduling process, as even a matching between a single multi-processor task and available machines in a single time slot becomes an NP-complete problem with no polynomial approximation. In this paper we propose a complete scheduling framework for multi-cluster, heterogeneous environments that provides, in practice, an efficient solution for the scheduling of topology-aware applications. The proposed framework is very flexible as it is composed of pluggable components and can be easily configured to support a variety of scheduling policies. We also describe three novel scheduling and coallocation algorithms that were developed and plugged into the framework. The proposed scheduling framework was integrated into the QosCosGrid system, where it is used as the main decision-making module.
Original language: English
Pages (from-to): 983-992
Number of pages: 10
Journal: Journal of Parallel and Distributed Computing
Volume: 70
Issue number: 9
Early online date: 26 May 2010
Publication status: Published - Sept 2010

Keywords

  • QosCosGrid
  • Grid
  • Supercomputer
  • Scheduler
