Domain-Independent Quality Measures for Crowd Truth Disagreement


Title: Domain-Independent Quality Measures for Crowd Truth Disagreement
Status: finished
Master: Multimedia
Student name: Oana Inel
Start date: 2013/03/01
End date: 2013/08/30
Supervisor: Lora Aroyo
Second supervisor: Robert-Jan Sips
Second reader: Chris Welty
Company: IBM
Thesis: Media:thesis_OanaInel.pdf
Poster: Media:Posternaam.pdf


Using crowdsourcing platforms such as CrowdFlower and Amazon Mechanical Turk to gather human annotation data has become a mainstream process. Such crowd involvement can reduce the time needed to solve an annotation task and, given the large number of annotators, can be a valuable source of annotation diversity. To harness this diversity across domains, it is critical to establish a common ground for assessing the quality of the results. In this research we report on our experiences in optimizing and adapting crowdsourcing micro-tasks across domains, considering three aspects: (1) the micro-task template, (2) the quality measurements for the workers' judgments and (3) the overall annotation workflow. We performed experiments in two domains: event extraction (MRP project) and medical relation extraction (Crowd-Watson project). The results confirmed our main hypothesis that some aspects of the evaluation metrics can be defined in a domain-independent way for micro-tasks that assess the parameters needed to harness the diversity of annotations and the useful disagreement between workers. The current research focuses specifically on the parameters relevant for collecting 'event extraction' ground-truth data and demonstrates that they can be reused from the medical domain.