# Statsclubproblems

## Problem descriptions

#### Wojtek

Here is a compilation of the slides I used during the first two meetings. Pay special attention to slides 19-21: they contain tasks for some of you (Robert-Jan, Willem, Rob, Evert, Vincent) and give an idea of what to expect on Monday 21st (at 14:00).

#### Selmar/Gusz

The dataset consists of 9 subsets of experimental data. Each subset contains the results of 100 runs of a specific EA on a specific problem function. There are three different EAs and three different functions: Rastrigin, Sphere and a handcrafted stepped function.

For each run the following metrics are saved:

• Best fitness in the final population
• The number of problem evaluations needed to find the solution
• Whether the run terminated successfully

The question is: Which EA is the best?
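As a starting point for a comparison, the three per-run metrics could be condensed into one summary per subset. The sketch below is only an illustration: the tuple layout `(best_fitness, evaluations, success)`, the function name `summarise_runs` and the toy values are assumptions, not part of the dataset. It does not decide which EA is best; it only puts the subsets side by side.

```python
from statistics import mean


def summarise_runs(runs):
    """Summarise runs given as (best_fitness, evaluations, success) tuples."""
    successes = [r for r in runs if r[2]]
    return {
        "success_rate": len(successes) / len(runs),
        "mean_best_fitness": mean(r[0] for r in runs),
        # Evaluations-to-solution is only meaningful for successful runs.
        "mean_evals_to_solution": (
            mean(r[1] for r in successes) if successes else None
        ),
    }


# Made-up toy values for illustration, NOT taken from the dataset.
toy_runs = [
    (0.0, 1200, True),
    (0.5, 5000, False),
    (0.0, 800, True),
    (0.1, 5000, False),
]
summary = summarise_runs(toy_runs)
```

Note that the evaluation count is averaged over successful runs only, so an EA with few successes gets a summary based on very few runs; the success rate should be read alongside it.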

#### Martijn

example results

We investigate the effects of reciprocity and transitivity in network formation. The dataset consists of 3 × 4 × 5 subsets over three dimensions:

• frequency_of_informal_opportunities \in [LOW, MEDIUM, HIGH]
• [operational_transitivity, operational_reciprocity] \in [[no,no], [yes,no],[no,yes],[yes,yes]]
• specialisation \in [admin, electro, nautical, technical, marines]

We measured the number of reciprocated ties.

The question is: are the observed differences significant? Or: which dimension has the largest impact on the number of reciprocated ties?
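To make the design concrete, the 60 cells can be enumerated and a rough, purely descriptive measure of each dimension's impact computed from per-cell means. This is a sketch under assumptions: the names `marginal_means` and `effect_range`, the dictionary layout and the toy values are hypothetical, and the range of marginal means is a descriptive quantity, not a significance test.

```python
from itertools import product
from statistics import mean

FREQUENCY = ["LOW", "MEDIUM", "HIGH"]
# Each pair is (operational_transitivity, operational_reciprocity).
MECHANISM = [("no", "no"), ("yes", "no"), ("no", "yes"), ("yes", "yes")]
SPECIALISATION = ["admin", "electro", "nautical", "technical", "marines"]

# All 3 x 4 x 5 = 60 cells of the design.
cells = list(product(FREQUENCY, MECHANISM, SPECIALISATION))


def marginal_means(cell_means, axis):
    """Average the per-cell means over the other two dimensions."""
    levels = {}
    for cell, value in cell_means.items():
        levels.setdefault(cell[axis], []).append(value)
    return {level: mean(values) for level, values in levels.items()}


def effect_range(cell_means, axis):
    """Spread between the highest and lowest marginal mean on one axis."""
    marginals = marginal_means(cell_means, axis)
    return max(marginals.values()) - min(marginals.values())


# Toy values in which only the frequency dimension matters (NOT real data).
toy_cell_means = {cell: FREQUENCY.index(cell[0]) for cell in cells}
ranges = [effect_range(toy_cell_means, axis) for axis in range(3)]
```

With the real per-cell means substituted for `toy_cell_means`, `ranges` gives one number per dimension; whether the differences are significant still needs a proper test on the underlying runs.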

I have the dataset, but it is not readily available to include here.

#### Willem

This is a problem I had with a previous paper.

Stripped down, my problem comes down to the following: I have 3 algorithms: 2 benchmarks and 1 new algorithm. I want to show that my new algorithm outperforms the other algorithms. The output of the algorithms is a single number, the 'value'.

The goal of the algorithm is to locate (static) targets, which are distributed over some terrain. I took 10 different target distributions, and tested each algorithm on these distributions, with 10 different initial random seeds.

The problem here is that the random seed makes a lot of difference in how well the algorithms perform. This means that the mean performance of an algorithm over different random seeds doesn't give me much information. However, I can compare the outcomes of different algorithms using the same initial random seed.

So, at each run, I computed the difference in value between the new algorithm and each of the two benchmarks. To show that my algorithm outperforms the other algorithms, I now only have to show that these differences are significantly greater than 0. I did this using the Wilcoxon signed-rank test.
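For reference, a minimal pure-Python sketch of a one-sided Wilcoxon signed-rank test on paired differences is given here. It is not the code used for the paper: the function name `wilcoxon_signed_rank` is hypothetical, zero differences are simply dropped, tied absolute values get average ranks, and the p-value comes from the normal approximation, which is rough for small samples.

```python
import math


def wilcoxon_signed_rank(differences):
    """One-sided test of H1: differences tend to be greater than 0.

    Returns (w_plus, z, p_value) using the normal approximation.
    """
    d = [x for x in differences if x != 0]  # zero differences are dropped
    n = len(d)
    if n == 0:
        raise ValueError("all differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    # W+ is the sum of the ranks belonging to positive differences.
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48  # tie-corrected
    z = (w_plus - mean_w) / math.sqrt(var_w)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z)
    return w_plus, z, p_value


# Made-up differences for illustration, NOT the values from the .dat files.
w_plus, z, p_value = wilcoxon_signed_rank([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

The test treats the pairs as independent. Runs that share a target distribution may not be, which is worth keeping in mind when pooling all differences into one sample.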

Attached you will find 2 data sets and 2 plots. The first data file contains the difference between the new algorithm and the first benchmark; the second data file contains the difference between the new algorithm and the second benchmark. (The original outcomes of the algorithms are on a different computer than the one I'm on right now, so I cannot send you these.) Each row in the .dat files contains the results for one target distribution.

The two .pdf files are the plots for these differences. If you look at them, you intuitively see that the differences are generally higher than 0. But what is the best test to show this?
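One candidate to discuss, offered as a sketch and not as the answer, is a sign-flip permutation test on one value per target distribution (the row mean), which avoids treating the 10 seeds within a distribution as independent samples. The function names `row_means` and `sign_flip_test`, the number of resamples and the toy rows below are assumptions for illustration.

```python
import random


def row_means(rows):
    """Collapse each target distribution (one row) to its mean difference."""
    return [sum(row) / len(row) for row in rows]


def sign_flip_test(values, n_resamples=10000, seed=0):
    """One-sided Monte Carlo sign-flip test.

    H0: the values are symmetric around 0. Returns an estimated p-value
    for the alternative that they tend to be greater than 0.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = sum(values)
    hits = 0
    for _ in range(n_resamples):
        # Under H0 each value is equally likely to have either sign.
        flipped = sum(v if rng.random() < 0.5 else -v for v in values)
        if flipped >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above 0.
    return (hits + 1) / (n_resamples + 1)


# Made-up rows (3 seeds each) for illustration, NOT the attached data.
toy_rows = [
    [0.4, 0.1, 0.3], [0.2, 0.5, 0.1], [0.3, 0.3, 0.2], [0.1, 0.2, 0.6],
    [0.5, 0.4, 0.1], [0.2, 0.1, 0.1], [0.3, 0.6, 0.2], [0.4, 0.2, 0.2],
    [0.1, 0.3, 0.4], [0.2, 0.2, 0.5],
]
p_value = sign_flip_test(row_means(toy_rows))
```

With only 10 rows there are 2^10 = 1024 possible sign patterns, so the smallest p-value an exact version of this test can reach is about 0.001.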

dumb_vs_smart.dat (plain text) dumb_vs_smart.pdf (pdf) det_vs_smart.dat (plain text) det_vs_smart.pdf (pdf)