Biological pathway data integration through enhancing BioPAX models
|Student name:||Michiel van Ooijen|
|Second supervisor:||Reza Haydarlou|
More than 300 databases store biological pathway data in various formats. To make pathway data from different sources more uniform and exchangeable, and to make computational pathway research more efficient, a community of researchers has defined BioPAX (Biological Pathway Exchange): a Semantic Web-based standard language for representing biological pathways at the molecular and cellular level.
We are designing and developing an agent-based operational semantics for BioPAX models. With such a semantics, the biological interactions represented in BioPAX come alive, and we can follow the flow of transitions. To achieve this goal, we need to enhance the existing BioPAX models, which are generated automatically by pathway databases.
What needs to be done?
The existing BioPAX models lack two types of information: (1) the binding of external ligands to receptors, and (2) the binding of transcription factors to DNA and the expression of genes. In this project, we need to parse BioPAX models, identify the receptors and transcription factors in them (based on annotations found in a protein database), and automatically add the missing information (retrieved from ligand and transcription factor databases) to the models.
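The identification step could be sketched as follows. This is a minimal illustration in plain Java, not the project's implementation: it assumes that each protein's annotations have already been fetched as a set of Gene Ontology term IDs, and that the two marker terms shown (signaling receptor activity and DNA-binding transcription factor activity) are adequate indicators. The class name, the protein identifiers, and the annotation sets are all hypothetical.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Sketch: decide from annotation terms which proteins need enrichment. */
final class ProteinRoleClassifier {

    enum Role { RECEPTOR, TRANSCRIPTION_FACTOR }

    // Assumed marker terms; verify them against the current Gene Ontology release.
    static final String GO_SIGNALING_RECEPTOR_ACTIVITY = "GO:0038023";
    static final String GO_DNA_BINDING_TF_ACTIVITY = "GO:0003700";

    /** Returns the roles implied by a protein's GO terms (possibly none). */
    static Set<Role> classify(Set<String> goTerms) {
        Set<Role> roles = EnumSet.noneOf(Role.class);
        if (goTerms.contains(GO_SIGNALING_RECEPTOR_ACTIVITY)) {
            roles.add(Role.RECEPTOR);
        }
        if (goTerms.contains(GO_DNA_BINDING_TF_ACTIVITY)) {
            roles.add(Role.TRANSCRIPTION_FACTOR);
        }
        return roles;
    }

    public static void main(String[] args) {
        // Hypothetical proteins with hand-written annotation sets.
        Map<String, Set<String>> annotations = new LinkedHashMap<>();
        annotations.put("proteinA", Set.of("GO:0038023", "GO:0005886"));
        annotations.put("proteinB", Set.of("GO:0003700", "GO:0005634"));
        annotations.put("proteinC", Set.of("GO:0005737"));

        annotations.forEach((id, terms) ->
                System.out.println(id + " -> " + classify(terms)));
    }
}
```

The sketch matches marker terms directly. Because Gene Ontology terms form a hierarchy, a real classifier would probably also need to accept descendants of the marker terms.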
- Interest in biological pathways
- Java programming skills
- Acquiring knowledge about biological pathways and pathway databases such as Reactome, KEGG, and NetPath.
- Practicing with RDF, OWL, and BioPAX models in Protege (semantic web editor).
- Exploring APIs of ligand/transcription factor databases to retrieve missing information.
- Using PAXTools library for reading and writing BioPAX models.
- Enhancing BioPAX models with the information from ligand/transcription factor databases.
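The enhancement step in the last task can be sketched independently of any particular library. The example below is a plain-Java stand-in under stated assumptions, not PAXTools code: the model is reduced to a list of protein identifiers, and the two lookup tables stand for results already retrieved from ligand and transcription factor databases. All identifiers and the class name are hypothetical. Each planned addition is labelled with the BioPAX Level 3 class that would plausibly represent it (`ComplexAssembly` for binding, `TemplateReactionRegulation` for transcriptional regulation); that mapping is a suggestion rather than a settled design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch: plan which interactions to add to a model, given lookup results. */
final class EnrichmentPlanner {

    /** Kind of added interaction and the BioPAX Level 3 class assumed for it. */
    enum Kind {
        LIGAND_BINDING("ComplexAssembly"),
        TRANSCRIPTION_REGULATION("TemplateReactionRegulation");

        final String biopaxClass;

        Kind(String biopaxClass) {
            this.biopaxClass = biopaxClass;
        }
    }

    /** One planned addition: an actor (ligand or factor) and its target. */
    static final class Addition {
        final Kind kind;
        final String actor;
        final String target;

        Addition(Kind kind, String actor, String target) {
            this.kind = kind;
            this.actor = actor;
            this.target = target;
        }

        @Override
        public String toString() {
            return kind.biopaxClass + ": " + actor + " -> " + target;
        }
    }

    /** Walks the proteins in model order and collects the missing interactions. */
    static List<Addition> plan(List<String> proteinsInModel,
                               Map<String, List<String>> ligandsByReceptor,
                               Map<String, List<String>> targetGenesByFactor) {
        List<Addition> additions = new ArrayList<>();
        for (String protein : proteinsInModel) {
            for (String ligand : ligandsByReceptor.getOrDefault(protein, List.of())) {
                additions.add(new Addition(Kind.LIGAND_BINDING, ligand, protein));
            }
            for (String gene : targetGenesByFactor.getOrDefault(protein, List.of())) {
                additions.add(new Addition(Kind.TRANSCRIPTION_REGULATION, protein, gene));
            }
        }
        return additions;
    }

    public static void main(String[] args) {
        // Hypothetical model content and lookup results.
        List<Addition> additions = plan(
                List.of("receptorR", "factorF", "kinaseK"),
                Map.of("receptorR", List.of("ligandL")),
                Map.of("factorF", List.of("geneG1", "geneG2")));
        additions.forEach(System.out::println);
    }
}
```

Separating the planning from the actual model update keeps the database lookups testable on their own; writing the planned additions into a BioPAX model would then be a distinct step.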
- Pathway browser of the Reactome database: http://www.reactome.org/PathwayBrowser/
- Pathway browser of the KEGG database: http://www.kegg.jp/kegg-bin/get_htext?query=04010&htext=br08901.keg&option=-a
- BioPAX translator for KEGG pathways: http://www.cogsys.cs.uni-tuebingen.de/software/KEGGtranslator
- Pathway browser of the NetPath database: http://www.netpath.org
- Pathway map browser of the NetPath database: http://www.netpath.org/netslim
- Collection of pathways from public databases: http://www.pathwaycommons.org/
- Protein database: http://www.uniprot.org/uniprot
- OWL Tutorial: http://220.127.116.11/tutorials/protegeowltutorial/resources/ProtegeOWLTutorialP4_v1_3.pdf
- Ontology editor: http://protege.stanford.edu/
- BioPAX community explains BioPAX in Nature Biotechnology: http://www.nature.com/nbt/journal/v28/n9/full/nbt.1666.html
- BioPAX syntax and semantics: http://www.biopax.org/release/biopax-level3-documentation.pdf
- BioPAX community explains PAXTools in PLoS Computational Biology: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003194
- How to use PAXTools: http://sourceforge.net/projects/biopax/files/paxtools/paxtools.pdf
- The BioPAX Validator: http://bioinformatics.oxfordjournals.org/content/29/20/2659.full
- Pathway Commons Web Services: http://www.pathwaycommons.org/pc2/
- Browser for 'Gene Ontology' terms and annotations: http://www.ebi.ac.uk/QuickGO
- A database for transcription factors and their target genes: http://itfp.biosino.org/itfp