Interacting with Events in New York Times newspaper corpus



About Interacting with Events in New York Times newspaper corpus

  • This project has not yet been fulfilled.
  • This project fits in the following master areas: Multimedia, Internet and Web Technology, AI and Communication, Cognitive Science, Knowledge Technology and Intelligent Internet Applications, Information and Communication Technology, Computer Science and Communication, Information Sciences


Description

This assignment will be done in the context of an informal collaboration with the research team at the New York Times archive. You will explore the use of event models and event extraction techniques to identify the events in the corpus and link them in relevant and interesting ways. Different focus points are possible within the context of this project:

  • for students with User Interface interests: focus on event visualization as part of the search results
  • for students with AI interests: focus on dealing with clustering and similarity between events
  • for students with NLP interests: focus on the extraction of events

You can read more about the rNews and schema.org standards and the Linked Open Data that the NYT is currently exploring.

Some more related material on event extraction:

Tasks

  • explore the domain of event annotation, extraction, and modeling
  • propose a workflow for using events to browse the NYT collection
  • define an evaluation scenario
  • evaluate and compare with existing solutions

Tools and Data

In this project you will be working with the following data:

  • 20 years of NYT articles, available both as full text and as RDF

Recommended prior knowledge

  • Knowledge & Media course
  • Social Web course
  • Research methods course

Extra Information

Contact Lora Aroyo for more information about this project.