LinkEngine: a platform for aggregated scientific contributions on the Web



About LinkEngine: a platform for aggregated scientific contributions on the Web

Bioinformatics, Cognitive Science, Computational Intelligence and Selforganisation, Computer Science and Communication, Computer Systems and Security, Formal Methods and Software Verification, High Performance Distributed Computing, Human Ambience, Information Sciences, Information and Communication Technology, Internet and Web Technology, Knowledge Technology and Intelligent Internet Applications, Multimedia, Parallel and Distributed Computer Systems, Software Engineering, Systems Biology, Technical Artificial Intelligence


Description


Thanks to the Internet and the Web, a large share of human knowledge can be reached within seconds. Given the sheer volume of information, however, scientists, researchers and students alike need a selective way of finding relevant scientific contributions. Many websites offer services for scientists to publish their work, some focused on social academic networking and others on data publishing, but a platform that aggregates this information would provide a more complete context for a given scientific contribution. Moreover, DOIs [1] for scientific contributions and ORCID identifiers [2] for scientists make it easier to refer to contributions and their authors in a unique and persistent way. In this context, an engine that integrates and aggregates the information offered by some of the most widely used scientific publishing platforms would be a much needed addition.

Research question: Can we create a platform that is able to find and retrieve scientific contributions and automatically link them together by using existing public repositories and data provided by publishers or third parties?

Tools and techniques

LinkEngine is a platform that integrates and aggregates the information available about a given scientific contribution. It acts as a link engine that queries various public repositories, as well as data provided by publishers or third parties, to gather as much information as possible about that contribution. A scientific contribution can be the text of an article, the dataset used or referred to in a publication, the source code used to obtain results, or the multimedia objects (images, figures, slides, videos, etc.) related to it. The link engine will gather the data, transform it into a single representation based on a Linked Data model, and output it in formats such as N-Quads or JSON(-LD). A ranking algorithm will ensure that the contributions it considers most relevant are shown first.

The link engine will be accessible through a web interface and an API. A search request to the link engine will trigger further requests to each platform considered for linking, in order to gather information about the respective scientific contribution. These underlying requests will account for the different ways in which each platform needs to be queried. The final result will be an aggregation of the individual responses and will include links to the original resources on the queried platforms.
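The request flow described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not part of the project specification: the `Record` shape, the stand-in adapters `fake_crossref` and `fake_figshare` (which return fixed data instead of calling any real API), the example DOIs and URLs, and the placeholder ranking that simply prefers contributions found on more platforms. Only JSON-LD output is shown, using the schema.org vocabulary as one possible Linked Data model.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Record:
    """One hit returned by a single platform (hypothetical shape)."""
    doi: str
    title: str
    source: str  # name of the platform that returned the hit
    url: str     # link to the original resource on that platform


# Stand-in adapters: they return fixed data and make no network requests.
# A real adapter would translate the query into the platform's own API call.
def fake_crossref(query: str) -> List[Record]:
    return [Record("10.1000/example.1", "Linked data for science",
                   "CrossRef", "https://example.org/crossref/1")]


def fake_figshare(query: str) -> List[Record]:
    return [
        Record("10.1000/EXAMPLE.1", "Linked data for science",
               "FigShare", "https://example.org/figshare/7"),
        Record("10.1000/example.2", "Another dataset",
               "FigShare", "https://example.org/figshare/8"),
    ]


def to_jsonld(doi: str, entry: Dict) -> Dict:
    """Express one aggregated contribution as a JSON-LD object."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "@id": f"https://doi.org/{doi}",
        "name": entry["title"],
        "sameAs": [link["url"] for link in entry["links"]],
    }


def aggregate(query: str,
              adapters: List[Callable[[str], List[Record]]]) -> List[Dict]:
    """Query every adapter, merge hits by DOI, rank, and emit JSON-LD."""
    merged: Dict[str, Dict] = {}
    for adapter in adapters:
        for rec in adapter(query):
            key = rec.doi.lower()  # DOI names are case-insensitive
            entry = merged.setdefault(key, {"title": rec.title, "links": []})
            entry["links"].append({"source": rec.source, "url": rec.url})
    # Placeholder ranking (assumption): more linking platforms first,
    # ties broken by DOI so the order is deterministic.
    ranked = sorted(merged.items(), key=lambda kv: (-len(kv[1]["links"]), kv[0]))
    return [to_jsonld(doi, entry) for doi, entry in ranked]


results = aggregate("linked data", [fake_crossref, fake_figshare])
print(json.dumps(results, indent=2))
```

Keeping each platform behind a function with the same signature means the aggregation step does not need to know how an individual platform is queried, which matches the idea that the underlying requests differ per platform while the final result is uniform.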

The platforms considered for data gathering, linking and enrichment for a scientific contribution:

  • Data repositories: arXiv, CrossRef, PLOS, PubMed, Springer, Nature (see the simplified model of an article with linked data here [1])
  • Code repositories: github.com
  • Scientific community websites: academia.edu, ResearchGate, Mendeley
  • Others: FigShare, DataDryad, datahub.io, SlideShare

Not all of these platforms offer an API, and some require special API keys or development tokens for access. Which of these platforms will be included, and how, still needs to be established, together with a test dataset of articles.
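One way to keep track of these access constraints is a small platform registry, sketched below in Python. The platform names (`platform-a`, `platform-b`, `platform-c`), the `has_api` flags and the environment variable name `PLATFORM_B_API_KEY` are all hypothetical placeholders; they do not describe the actual access rules of any platform listed above, which still have to be investigated.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class Platform:
    """Access requirements of one platform (placeholder values only)."""
    name: str
    has_api: bool
    key_env: Optional[str] = None  # environment variable holding the API key, if one is needed


# Hypothetical registry; real entries would be filled in once each
# platform's API availability and key requirements are known.
REGISTRY = [
    Platform("platform-a", has_api=True),
    Platform("platform-b", has_api=True, key_env="PLATFORM_B_API_KEY"),
    Platform("platform-c", has_api=False),
]


def usable_platforms(registry: List[Platform], env: Mapping[str, str]) -> List[str]:
    """Return the names of platforms that can be queried with the given environment."""
    usable = []
    for platform in registry:
        if not platform.has_api:
            continue  # no API: cannot be queried programmatically
        if platform.key_env is not None and not env.get(platform.key_env):
            continue  # a key is required but not configured
        usable.append(platform.name)
    return usable


print(usable_platforms(REGISTRY, os.environ))
```

Reading keys from the environment keeps credentials out of the source code, and a platform without a configured key is skipped instead of causing the whole search to fail.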

References