Performance prediction of cloud nodes for HPC workloads

About Performance prediction of cloud nodes for HPC workloads


While commodity operating systems expose information such as the processor model and the amount of memory to user applications, performance-related information such as I/O and memory latency or bandwidth is typically unknown. When scheduling HPC tasks on the cloud, we are often faced with the question of which nodes are best suited to the workload under a given budget constraint. One way to answer this question is to run a sample of the given tasks on various nodes and compare the results [1]. Unfortunately, given the variety of VM types (e.g. 15 on Amazon EC2) and of IaaS providers, this solution does not scale.

Another possibility is to predict the performance of a given workload on various nodes. The idea is to run a sample of the user's workload on a single node to find out which performance characteristics it is most sensitive to, and then to match those with suitable VM types. For example, if it is clear that the user's workload is CPU-bound, then it only makes sense to consider VM types with faster CPUs. During the course of this thesis the student will answer the following questions:

1) What are the most important performance characteristics of a virtual machine for various HPC workloads?
2) How can performance information about various VM types be gathered?
3) Given performance information about different VM types and the characteristics of the user's workload, what is the best strategy for allocating VMs?
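
To make question 3 concrete, the following is a minimal sketch of one possible matching strategy, not a proposed solution. It assumes that a workload can be summarised by the fraction of its reference runtime spent on CPU, memory and I/O, and that each VM type has a relative speed per resource (reference node = 1.0). The names `VMType`, `WorkloadProfile`, `predict_hours` and `best_vm`, the linear scaling model, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VMType:
    """Hypothetical VM description; speeds are relative to a reference node (1.0)."""
    name: str
    cpu: float
    memory: float
    io: float
    price_per_hour: float


@dataclass(frozen=True)
class WorkloadProfile:
    """Hypothetical profile obtained from one sample run on the reference node."""
    ref_hours: float    # runtime measured on the reference node
    cpu_frac: float     # fraction of that runtime bound by CPU
    memory_frac: float  # fraction bound by memory
    io_frac: float      # fraction bound by I/O


def predict_hours(w: WorkloadProfile, vm: VMType) -> float:
    # Assumed model: each resource-bound share of the runtime shrinks
    # linearly with the VM's relative speed for that resource.
    return w.ref_hours * (
        w.cpu_frac / vm.cpu + w.memory_frac / vm.memory + w.io_frac / vm.io
    )


def best_vm(
    w: WorkloadProfile, vms: Sequence[VMType], budget: float
) -> Optional[VMType]:
    # Keep only VM types whose predicted cost fits the budget,
    # then return the one with the shortest predicted runtime.
    feasible = [
        vm for vm in vms if predict_hours(w, vm) * vm.price_per_hour <= budget
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda vm: predict_hours(w, vm))


if __name__ == "__main__":
    # Made-up VM types and a made-up CPU-bound workload.
    vms = [
        VMType("small", cpu=1.0, memory=1.0, io=1.0, price_per_hour=0.10),
        VMType("compute", cpu=2.0, memory=1.0, io=1.0, price_per_hour=0.25),
        VMType("storage", cpu=1.0, memory=1.0, io=3.0, price_per_hour=0.30),
    ]
    workload = WorkloadProfile(ref_hours=10.0, cpu_frac=0.8, memory_frac=0.1, io_frac=0.1)
    choice = best_vm(workload, vms, budget=2.0)
    print(choice.name if choice else "no VM type fits the budget")
```

With these made-up numbers the CPU-bound workload is matched to the hypothetical `compute` type when the budget allows it, and falls back to `small` under a tighter budget. Whether such a linear model is accurate enough for real HPC workloads is exactly what questions 1 and 2 would have to establish.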

Requirements: performance profiling, interest in cloud technologies, resource scheduling

[1] Ana-Maria Oprescu, Thilo Kielmann, Haralambie Leahu. Budget Estimation and Control for Bag-of-Tasks Scheduling in Clouds. Parallel Processing Letters, Vol. 21, No. 2, pp. 219-243, June 2011.