# Exercise 38

## Parallel bootstrap

As we saw in Assignment 4, we can use random resampling with replacement to simulate drawing samples from a population, and then we can quantitatively assess the probability of the null hypothesis.

Assume we have the following sample of data, `x`

:

3 5 4 3 6 7 -1 2 -3 -4 2 -5 3 2 -1

and a second sample, `y`

:

2 7 4 7 6 9 -1 3 -2 -1 3 -1 2 4 2

The mean of the sample \(x\) is 1.533 and the mean of sample \(y\) is 2.933. Let's compute a statistic we'll call \(d_{samp}\) which is the difference between means, \(\bar{y}-\bar{x}\) which in this case is equal to \(d_{samp}=1.400\).

The null hypothesis is that the samples were taken from a single population with the same mean, and so \(d_{pop}=0.00\), and any departure from this in our sample statistic \(d_{samp}\) is due to random sampling error.

**Question 1**

Use bootstrapping to compute the probability of the null hypothesis, with one million bootstrap (re)samples.

**Question 2**

Rewrite your solution to question 1, to parallelize the bootstrap. In other words execute the one million bootstrap iterations in parallel, with the work spread over your available compute cores.