Scheduling Heuristics For Executing Scientific Workflows On Homogeneous Clusters With Globally- and Locally-Accessible Persistent Storage

Date
2018-08
Authors
Pandey, Suraj
Department
Computer Science
Abstract
Many applications in science and engineering today are structured as scientific workflows, i.e., task graphs with data dependencies between graphs, where tasks are implemented as standalone executables and data dependencies are via files read/written from/to stable storage. For many relevant application domains, these workflows are both large and data-intensive. Therefore, optimizing data accesses is crucial for efficient scientific workflow executions. Typical HPC (High Performance Computing) platforms used to run scientific workflows are commodity clusters, in which each compute node has access to private, small, high-bandwidth "local" storage, and to shared, large, low-bandwidth "global" storage. To date, production Workflow Management Systems (WMs), software infrastructures for executing workflows in practice, only use global storage. There is thus an opportunity to improve workflow performance by exploiting local storage. The difficulty, however, is twofold. First, the capacity of local storage is limited and often allows holding only a few workflow files. Second, storing data in local storage reduces parallelism because storage is private to a single node. In this thesis, we design scheduling heuristics to orchestrate workflow execution in this context, with the objective of minimizing workflow execution time. These heuristics decide which files should be stored in which level of storage (local or global) and replicate tasks so as to increase the availability of data across compute nodes and thus maintain parallelism. We implement a simulation framework to evaluate and drive the design of these heuristics using both real-world and synthetic workflow configurations. We also implement a software prototype for using these heuristics on HPC platforms. From experimental results obtained in simulation and on an actual HPC cluster we are able to evaluate the relative merit of our heuristics and draw conclusions about the most promising approaches and remaining challenges.
Keywords
Scientific Workflows, DAG, Task Replication, Local/Global Storages, Heuristics
Rights
All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner.