Data locality scheduling of MTC applications on an in-memory file system

From Master Projects
Revision as of 13:30, 30 November 2015 by Kielmann (talk | contribs)



About Data locality scheduling of MTC applications on an in-memory file system


Description

Many-task computing (MTC) applications, which consist of large numbers of tasks that process large amounts of data, are becoming more common. An attractive recent approach to speeding up their execution is to use an in-memory file system that distributes their data over the memory of the nodes on which they run. Such a file system stripes the data into chunks stored on multiple nodes and is locality-agnostic: a task does not have all of its data on the node where it runs and therefore has to fetch the rest over the network. However, data locality, i.e. caching data and running tasks on the nodes that hold their files, might still be beneficial, as it can reduce network traffic between nodes and further speed up application execution.
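
To make the locality-agnostic behaviour concrete, the following is a minimal sketch, not the implementation of any particular in-memory file system. It assumes round-robin striping with a fixed chunk size; the function names, node names and sizes are hypothetical and chosen only for illustration. It computes which fraction of a file's chunks a task has to fetch from other nodes.

```python
def stripe(file_size, chunk_size, nodes):
    """Return the node holding each chunk, assuming round-robin placement."""
    n_chunks = -(-file_size // chunk_size)  # ceiling division
    return [nodes[i % len(nodes)] for i in range(n_chunks)]


def remote_fraction(placement, task_node):
    """Fraction of chunks that a task running on task_node must fetch remotely."""
    if not placement:
        return 0.0
    remote = sum(1 for node in placement if node != task_node)
    return remote / len(placement)


# Hypothetical setup: an 8 KiB file, 1 KiB chunks, four nodes.
nodes = ["n0", "n1", "n2", "n3"]
placement = stripe(file_size=8 * 1024, chunk_size=1024, nodes=nodes)
print(remote_fraction(placement, "n0"))
```

Under these assumptions, a task reading a whole file striped over N nodes fetches roughly (N - 1)/N of it over the network regardless of where it runs, which is why striping alone gives a scheduler no locality to exploit.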

The goal of this project is to investigate the extent to which data locality is beneficial for MTC applications that store their data in an in-memory file system. The student will design scheduling algorithms that take data locality into account and measure their performance against the existing locality-agnostic approach.
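
As an illustration of what such a comparison could look like, the sketch below contrasts a simple greedy locality-aware scheduler with a round-robin baseline. It is only one possible starting point, not the algorithm the project prescribes. It assumes that each file is cached whole on a single node and that every node has a fixed number of task slots; all names (`locality_aware_schedule`, `round_robin_schedule`, `remote_bytes`) and the example data are hypothetical.

```python
def locality_aware_schedule(tasks, file_location, nodes, slots_per_node):
    """Greedily place each task on the free node holding most of its input bytes.

    tasks: dict mapping task name -> list of (file name, size in bytes)
    file_location: dict mapping file name -> node caching that file (assumption)
    """
    load = {node: 0 for node in nodes}
    assignment = {}
    for task, inputs in tasks.items():
        local_bytes = {node: 0 for node in nodes}
        for name, size in inputs:
            local_bytes[file_location[name]] += size
        candidates = [n for n in nodes if load[n] < slots_per_node]
        if not candidates:
            raise RuntimeError("no free task slots left")
        # Most local bytes first; ties go to the less loaded node.
        best = max(candidates, key=lambda n: (local_bytes[n], -load[n]))
        assignment[task] = best
        load[best] += 1
    return assignment


def round_robin_schedule(tasks, nodes):
    """Locality-agnostic baseline: ignore where the data lives."""
    return {task: nodes[i % len(nodes)] for i, task in enumerate(tasks)}


def remote_bytes(assignment, tasks, file_location):
    """Total bytes that tasks must fetch from nodes other than their own."""
    return sum(
        size
        for task, inputs in tasks.items()
        for name, size in inputs
        if file_location[name] != assignment[task]
    )


# Hypothetical workload: three tasks, three files, three nodes.
tasks = {
    "t0": [("a", 100), ("b", 10)],
    "t1": [("b", 10), ("c", 50)],
    "t2": [("a", 100)],
}
file_location = {"a": "n1", "b": "n0", "c": "n2"}
nodes = ["n0", "n1", "n2"]

aware = locality_aware_schedule(tasks, file_location, nodes, slots_per_node=2)
agnostic = round_robin_schedule(tasks, nodes)
print(remote_bytes(aware, tasks, file_location))     # 20
print(remote_bytes(agnostic, tasks, file_location))  # 260
```

Remote bytes is only a proxy metric. A real evaluation would measure application makespan, since packing tasks onto the nodes that hold their data can also create load imbalance.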