Enriching Sound and Vision television archives

About Enriching Sound and Vision television archives

  • This project has not yet been fulfilled.
  • This project fits in the following master areas: Information and Communication Technology, Internet and Web Technology, AI and Communication, Technical Artificial Intelligence, Knowledge Technology and Intelligent Internet Applications, Information Sciences
  • Project website: www.beeldengeluid.nl


Sound and Vision (‘Beeld en Geluid’) has one of the largest audiovisual archives in Europe. The institute manages over 70 percent of the Dutch audiovisual heritage. The collection contains more than 750,000 hours of television, radio, music and film, dating from 1898 to the present.

We are constantly researching ways to improve access to this collection. In the 'Term Extraction' project, we used Natural Language Processing technologies to automatically extract topics from the subtitles of programs. These topics are then added to the metadata in the Sound and Vision database. However, there is still considerable room for improving this algorithm.

In this project, a student would investigate opportunities for improving the quality of the term extraction process. More specifically, the student would investigate using background information in the form of (linked) thesauri and vocabularies, and combining this with natural language processing techniques.
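
To give a rough idea of what combining a thesaurus with subtitle text could look like, here is a minimal Python sketch. It is not the algorithm used in the 'Term Extraction' project. The toy thesaurus, the function names `tokenize` and `extract_terms`, and the greedy longest-match strategy are all assumptions made for illustration. A real setup would presumably load labels from an actual (linked) thesaurus and use proper linguistic preprocessing.

```python
import re
from collections import Counter

# Hypothetical toy thesaurus: surface label (lowercase) -> preferred concept label.
# In practice these labels would come from a real thesaurus or vocabulary.
THESAURUS = {
    "elections": "elections",
    "election": "elections",
    "parliament": "parliament",
    "house of representatives": "parliament",
    "prime minister": "prime minister",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_terms(subtitle_text, thesaurus=THESAURUS):
    """Count thesaurus concepts mentioned in the text.

    Scans left to right and, at each position, tries the longest
    candidate phrase first so that multi-word labels win over their parts.
    """
    tokens = tokenize(subtitle_text)
    # Longest label length in words; 0 if the thesaurus is empty.
    max_words = max((len(label.split()) for label in thesaurus), default=0)
    counts = Counter()
    i = 0
    while i < len(tokens):
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in thesaurus:
                counts[thesaurus[candidate]] += 1
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no label starts at this token
    return counts


if __name__ == "__main__":
    sample = (
        "The prime minister addressed the House of Representatives "
        "before the election. Parliament will debate the elections."
    )
    for concept, count in extract_terms(sample).most_common():
        print(concept, count)
```

Mapping variant labels onto one preferred concept is what lets the background knowledge improve on plain keyword counting: in the sample text, "House of Representatives" and "Parliament" are counted as the same topic.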

This project can be done either as an internship at Sound and Vision or as a project at the VU in collaboration with Sound and Vision.