Biological pathway data integration through enhancing BioPAX models

status: finished
Student name: Michiel van Ooijen
Start date: 2014/02/16
End date: 2014/08/16
Supervisor: Anton Feenstra
Second supervisor: Reza Haydarlou
Company: VU

More than 300 databases store biological pathway data, in a variety of data formats. To make pathway data from different sources more uniform and exchangeable, and to make computational pathway research more efficient, a community of researchers has defined BioPAX (Biological Pathway Exchange): a Semantic Web-based standard language for representing biological pathways at the molecular and cellular level.

We are designing and developing an agent-based operational semantics for BioPAX models. With such a semantics, the biological interactions represented in BioPAX come alive and the flow of transitions can be followed. To achieve this goal, we need to enhance the existing BioPAX models, which are generated automatically by pathway databases.

What needs to be done?

The existing BioPAX models lack two types of information: (1) the binding of external ligands to receptors, and (2) the binding of transcription factors to DNA and the expression of genes. In this project, we need to parse BioPAX models, identify receptors and transcription factors (based on annotations found in a protein database), and automatically add the missing information (retrieved from ligand and transcription factor databases) to the BioPAX models.
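
The enhancement step described above can be sketched roughly as follows. This is a minimal illustration using only the JDK, not the project's implementation and not the Paxtools API: the class `EnhancementSketch`, the `Role` enum, the exact-match keywords, and the in-memory maps standing in for the protein, ligand, and transcription factor databases are all assumptions made for the example, and the protein names are illustrative sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical, simplified stand-in for the enhancement step (not Paxtools).
class EnhancementSketch {

    enum Role { RECEPTOR, TRANSCRIPTION_FACTOR, OTHER }

    // Classify a protein from its annotation keywords.
    // Assumption: the protein database yields plain keyword strings.
    static Role classify(List<String> keywords) {
        for (String keyword : keywords) {
            String k = keyword.toLowerCase(Locale.ROOT);
            if (k.equals("receptor")) return Role.RECEPTOR;
            if (k.equals("transcription factor")) return Role.TRANSCRIPTION_FACTOR;
        }
        return Role.OTHER;
    }

    // For every receptor or transcription factor found in the model,
    // produce the statements that would have to be added to it.
    static List<String> missingStatements(Map<String, List<String>> annotations,
                                          Map<String, List<String>> ligandsByReceptor,
                                          Map<String, List<String>> targetsByFactor) {
        List<String> statements = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : annotations.entrySet()) {
            String protein = entry.getKey();
            switch (classify(entry.getValue())) {
                case RECEPTOR:
                    for (String ligand : ligandsByReceptor.getOrDefault(protein, Collections.emptyList())) {
                        statements.add(ligand + " binds receptor " + protein);
                    }
                    break;
                case TRANSCRIPTION_FACTOR:
                    for (String gene : targetsByFactor.getOrDefault(protein, Collections.emptyList())) {
                        statements.add(protein + " binds DNA and regulates expression of " + gene);
                    }
                    break;
                default:
                    break; // nothing to add for other proteins
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        // Illustrative sample data standing in for database lookups.
        Map<String, List<String>> annotations = new LinkedHashMap<>();
        annotations.put("EGFR", Arrays.asList("Kinase", "Receptor"));
        annotations.put("TP53", Arrays.asList("Transcription factor"));
        annotations.put("GAPDH", Arrays.asList("Glycolysis"));

        Map<String, List<String>> ligands = new LinkedHashMap<>();
        ligands.put("EGFR", Arrays.asList("EGF"));

        Map<String, List<String>> targets = new LinkedHashMap<>();
        targets.put("TP53", Arrays.asList("CDKN1A"));

        for (String s : missingStatements(annotations, ligands, targets)) {
            System.out.println(s);
        }
    }
}
```

In a real setting, the returned statements would be turned into BioPAX elements and written back to the model rather than kept as strings.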


Requirements

  • Interest in biological pathways
  • Java programming skills

Work Packages

  • Acquiring knowledge about biological pathways and pathway databases such as Reactome, KEGG, and NetPath.
  • Practicing with RDF, OWL, and BioPAX models in Protégé (a Semantic Web ontology editor).
  • Exploring APIs of ligand/transcription factor databases to retrieve missing information.
  • Using the Paxtools library for reading and writing BioPAX models.
  • Enhancing BioPAX models with the information from ligand/transcription factor databases.
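
To get a feel for what a BioPAX model looks like on disk before working with Paxtools, the sketch below lists the proteins in a small BioPAX Level 3 fragment using only the JDK's XML parser. It deliberately does not use Paxtools, which is the tool the project actually calls for; the class `BioPaxPeek`, the method `proteinNames`, and the embedded `SAMPLE` document are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: peeks into BioPAX Level 3 RDF/XML with the JDK DOM parser.
class BioPaxPeek {

    static final String BP = "http://www.biopax.org/release/biopax-level3.owl#";
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Tiny hand-written fragment, for illustration only.
    static final String SAMPLE =
        "<rdf:RDF xmlns:rdf=\"" + RDF + "\" xmlns:bp=\"" + BP + "\">"
      + "<bp:Protein rdf:ID=\"Protein1\">"
      + "<bp:displayName rdf:datatype=\"http://www.w3.org/2001/XMLSchema#string\">EGFR</bp:displayName>"
      + "</bp:Protein>"
      + "<bp:Protein rdf:ID=\"Protein2\"/>"
      + "</rdf:RDF>";

    // Return the display name of each bp:Protein, falling back to its rdf:ID.
    static List<String> proteinNames(String owl) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for the *NS lookups below
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(owl.getBytes(StandardCharsets.UTF_8)));

            List<String> names = new ArrayList<>();
            NodeList proteins = doc.getElementsByTagNameNS(BP, "Protein");
            for (int i = 0; i < proteins.getLength(); i++) {
                Element protein = (Element) proteins.item(i);
                NodeList display = protein.getElementsByTagNameNS(BP, "displayName");
                names.add(display.getLength() > 0
                    ? display.item(0).getTextContent().trim()
                    : protein.getAttributeNS(RDF, "ID"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse BioPAX document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(proteinNames(SAMPLE));
    }
}
```

A plain XML parser only sees the serialization, not the object model, which is one reason a dedicated library such as Paxtools is preferable for actually reading, modifying, and writing BioPAX models.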
