WikiBench: A distributed, Wikipedia based web application benchmark


Latest revision as of 07:13, 15 October 2010

Title: WikiBench: A distributed, Wikipedia based web application benchmark
Status: finished
Master: Internet and Web Technology
Student name: Erik-Jan van Baaren
Student number: 0
Start date: 2008/09/01
End date: 2009/05/01
Supervisor: Guillaume Pierre
Second reader: Guido Urdaneta
Poster: Media:Posternaam.pdf




Many novel approaches have been proposed to improve the throughput and scalability of distributed web application hosting systems and relational databases. Yet only a limited number of web application benchmarks are available. We present the design and implementation of WikiBench, a distributed web application benchmarking tool based on Wikipedia. WikiBench is a trace-based benchmark that can create realistic workloads of thousands of requests per second for any system hosting the freely available Wikipedia data and software. We obtained completely anonymized, sampled access traces from the Wikimedia Foundation and created software that processes these traces to reduce the intensity of their traffic while maintaining the most important properties, such as inter-arrival times and the distribution of page popularity. This makes WikiBench usable for both small and large scale benchmarks. Initial benchmarks show a regular day of traffic with its ups and downs. By using median response times, we are able to show the effects of increasing traffic intensities on our system under test.
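
The idea of reducing trace intensity can be illustrated with a minimal sketch. It is not the WikiBench implementation: the record layout (timestamp, URL), the function names, and the choice of uniform random sampling are all assumptions made for illustration. Keeping each request independently with the same probability lowers the request rate while leaving the relative popularity of pages and the shape of traffic over the day unchanged in expectation. The second helper shows why a median is a convenient summary of response times: a single slow outlier does not move it.

```python
import random
from statistics import median


def thin_trace(records, keep_fraction, seed=0):
    """Return a reduced trace (hypothetical helper, not part of WikiBench).

    records: iterable of (timestamp, url) tuples in arrival order (assumed layout).
    keep_fraction: probability in [0, 1] of keeping each individual request.
    seed: seed for the random generator, so that a run can be repeated.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0 and 1")
    rng = random.Random(seed)
    # Each request is kept or dropped independently; the original order is preserved.
    return [rec for rec in records if rng.random() < keep_fraction]


def median_response_time(samples_ms):
    """Median of a non-empty sequence of response times in milliseconds."""
    return median(samples_ms)
```

For example, `thin_trace(trace, 0.1)` would keep roughly one request in ten, and `median_response_time([80, 120, 1000])` ignores the magnitude of the 1000 ms outlier.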

Final thesis