Selecting docking conformations based on predicted interface and interaction strength

Revision as of 15:42, 29 June 2015 by Feenstra (talk | contribs) (New page: {{Masterproject |Master name=Bioinformatics |Student name=Sije van der Veen |Project start date=2015/08/31 |Project end date=2016/01/28 |Supervisor=Qingzhen Hou |Second supervisor=Anton Fe...)


status: ongoing
Master: Bioinformatics
Student name: Sije van der Veen
Start date: 2015/08/31
End date: 2016/01/28
Supervisor: Qingzhen Hou
Second supervisor: Anton Feenstra
Thesis: Media:Thesis.pdf
Poster: Media:Posternaam.pdf

Signature supervisor



Over the past decades, the human genome has been unravelled. The genome contains many genes that encode protein sequences. These proteins can have several functions, and one of their properties can be that they interact with other proteins. Compared with the human genome, far less experimental data is available on protein-protein interactions (PPIs); developing predictive methods for PPIs is therefore an interesting topic for research and development.

Computational protein-protein docking is a valuable tool for determining the conformation of complexes formed by interacting proteins. The problem here is the ranking used to select the 'best' predicted bound orientation. At the IBIVU, research is performed on methods that predict protein interactions, in order to identify stable complexes of interacting proteins based on molecular dynamics simulations. A full atomistic simulation requires about a year on a single CPU per PPI, which makes it unfeasible to apply to, for example, 1000 docking orientations for a single interacting protein pair. A coarse-grained force field can be used instead, which brings the run time down to about half a day per PPI. This is still expensive, and certainly far too expensive to investigate all possible PPIs in a genome: for example, the 20000 genes in the human genome may give rise to potentially 200 million interacting protein pairs.
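The figures in the paragraph above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes that one simulation is run per docking orientation, that every gene yields exactly one protein, and that self-interactions are not counted. The constant and function names are illustrative and not part of any project code.

```python
from math import comb

# Numbers taken from the text above.
GENES = 20_000                 # genes in the human genome
POSES_PER_PAIR = 1_000         # example number of docking orientations per pair
CG_DAYS_PER_RUN = 0.5          # coarse-grained simulation: about half a day
ATOMISTIC_DAYS_PER_RUN = 365   # full atomistic simulation: about a year


def unordered_pairs(n: int) -> int:
    """Number of distinct pairs among n proteins (self-pairs excluded, an assumption)."""
    return comb(n, 2)


def cpu_years(runs: int, days_per_run: float) -> float:
    """Single-CPU time in years, assuming one simulation per run (an assumption)."""
    return runs * days_per_run / 365


pairs = unordered_pairs(GENES)
print(f"possible protein pairs: {pairs:,}")  # 199,990,000, i.e. about 200 million
print(f"coarse-grained, {POSES_PER_PAIR} orientations of one pair: "
      f"{cpu_years(POSES_PER_PAIR, CG_DAYS_PER_RUN):.2f} CPU-years")
print(f"coarse-grained, one run for every pair: "
      f"{cpu_years(pairs, CG_DAYS_PER_RUN):,.0f} CPU-years")
```

Even under these simplifying assumptions, a single coarse-grained run for every pair adds up to hundreds of thousands of CPU-years, which is why a cheap filter applied before simulation is attractive.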

In order to reduce the amount of computation needed, we will test whether filtering the docking ranking by interface information, generated by an interface site prediction, can yield accurate predictions. Interaction site predictions are quick, but not very accurate. Binding orientations that do not include the predicted interface site will be filtered out, in order to generate a more accurate ranking. Statistical tests will be performed to assess the significance of the rankings, and of the changes in the rankings generated in this process.
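The filtering step described above could be sketched as follows. This is a minimal illustration and not the project's actual method: the `Pose` class, the overlap measure, the `min_overlap` threshold of 0.5, the convention that a lower score is better, and the toy residue numbers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """One docking orientation (hypothetical representation)."""
    name: str
    score: float          # docking score; lower is assumed to be better
    interface: frozenset  # residue numbers in contact with the partner in this pose


def interface_overlap(pose: Pose, predicted: frozenset) -> float:
    """Fraction of the predicted interface residues that this pose actually uses."""
    if not predicted:
        return 0.0
    return len(pose.interface & predicted) / len(predicted)


def filter_and_rank(poses, predicted, min_overlap=0.5):
    """Drop poses that miss the predicted site, then rank the rest by score."""
    kept = [p for p in poses if interface_overlap(p, predicted) >= min_overlap]
    return sorted(kept, key=lambda p: p.score)


# Toy data: pose_b has the best score but binds far from the predicted site.
poses = [
    Pose("pose_a", -12.0, frozenset({10, 11, 12, 40})),
    Pose("pose_b", -15.0, frozenset({70, 71, 72})),
    Pose("pose_c", -9.5, frozenset({11, 12, 13, 14})),
]
predicted_site = frozenset({10, 11, 12, 13})

ranked = filter_and_rank(poses, predicted_site)
print([p.name for p in ranked])  # ['pose_a', 'pose_c']
```

Because interface predictions are not very accurate, a soft threshold on the overlap is used here rather than requiring the full predicted site; the choice of threshold is one of the things the proposed statistical tests would need to evaluate.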

The programming work will be done in Python, with libraries such as Biopython. The R language will be useful for the statistical part of the work. Several available interface prediction methods will be used, and docking software may also be used. The work will start on Monday 31 August and run for a period of 20 weeks.