An evaluation of SWISH for the transformation phase in big data analytics

status: ongoing
Master: Information Sciences
Student name: Aron Vries
Start date: 2015/01/21
End date: 2015/07/01
Supervisor: Jan Wielemaker
Second supervisor: Willem van Hage
Company: VU
Thesis: has thesis::Media:Thesis.pdf
Poster: has poster::Media:Posternaam.pdf

Big data analytics can be roughly divided into three phases. The first phase is pre-processing, in which the data is converted to a (relational) model. The second phase is transformation, in which data is combined with other data (often from external data sources), filters may be applied, and values in the data are transformed. The last phase is processing, e.g. the visualization of the data. The first and second phases are mostly covered by modern technology, yet the transformation phase is currently the most time-consuming.
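
To make the transformation phase concrete, the following is a minimal sketch of its three operations (combine, filter, transform) in plain Python. It is only an illustration and does not use SWISH or CQL. The data, the field names, and the function `transform_orders` are all hypothetical and are not taken from the project.

```python
# Hypothetical in-memory data; field names are illustrative assumptions.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount_cents": 1250},
    {"order_id": 2, "customer_id": "c2", "amount_cents": 300},
    {"order_id": 3, "customer_id": "c9", "amount_cents": 9900},
]

# Stand-in for an external data source, keyed by customer id.
customers = {
    "c1": {"country": "NL"},
    "c2": {"country": "DE"},
}


def transform_orders(orders, customers, min_cents=500):
    """Combine orders with customer data, filter them, and transform values."""
    result = []
    for order in orders:
        # Combine: look up the matching customer (an inner join).
        customer = customers.get(order["customer_id"])
        if customer is None:
            continue
        # Filter: drop orders below the threshold.
        if order["amount_cents"] < min_cents:
            continue
        # Transform: convert cents to euros and keep only the needed fields.
        result.append({
            "order_id": order["order_id"],
            "country": customer["country"],
            "amount_eur": order["amount_cents"] / 100,
        })
    return result


rows = transform_orders(orders, customers)
print(rows)
```

Even in this small sketch the join, the filter, and the value conversion are interleaved in one loop, which hints at why the maintainability of this phase is a concern for larger pipelines.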

This research project will evaluate the tool SWISH (with CQL) to see whether it improves the performance and maintainability of the transformation process.