Exercise 13
Variance of a vector
If we have a vector \(X\) containing \(n\) values, then the unbiased sample variance \(s^{2}\) is:
\begin{equation} s^{2} = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_{i} - \bar{X} \right)^{2} \end{equation}
where \(\bar{X}\) is the mean of the vector.
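As a worked illustration of the formula, take the eight values 2, 1, 5, 4, 8, 3, 4, 3 (the same vector used in the language examples in this exercise), so \(n = 8\). The mean is:
\begin{equation} \bar{X} = \frac{2+1+5+4+8+3+4+3}{8} = \frac{30}{8} = 3.75 \end{equation}
The squared deviations from the mean, listed in the same order as the values, sum to:
\begin{equation} \sum_{i=1}^{8} \left( X_{i} - \bar{X} \right)^{2} = 3.0625 + 7.5625 + 1.5625 + 0.0625 + 18.0625 + 0.5625 + 0.0625 + 0.5625 = 31.5 \end{equation}
Dividing by \(n-1 = 7\) gives the unbiased sample variance:
\begin{equation} s^{2} = \frac{31.5}{7} = 4.5 \end{equation}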
In MATLAB, Python (NumPy) and R there are built-in functions for
computing the variance of a vector. Conveniently, in all three
languages the function is called var().
In MATLAB:
    x = [2,1,5,4,8,3,4,3];
    var(x)
    ans = 4.5000
In Python / NumPy (started using ipython --pylab):
    x = array([2,1,5,4,8,3,4,3], 'float')
    var(x, ddof=1)
    Out[2]: 4.5
Note that by default the Python / NumPy function var() computes the
biased variance, that is, it uses a denominator equal to \(n\), not
\(n-1\). To get the unbiased variance we have to pass the optional
argument ddof and set it to 1, as in the call above.
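The effect of ddof can be seen by computing both versions side by side. The following is a minimal sketch, assuming NumPy is installed and imported explicitly as np (rather than through ipython --pylab); the variable names biased and unbiased are chosen here for illustration only.

```python
import numpy as np

x = np.array([2, 1, 5, 4, 8, 3, 4, 3], dtype=float)

# Default (ddof=0): the sum of squared deviations is divided by n.
biased = np.var(x)

# ddof=1: the sum of squared deviations is divided by n - 1.
unbiased = np.var(x, ddof=1)

print(biased)    # 3.9375
print(unbiased)  # 4.5
```

The two results differ by the factor \((n-1)/n\), which is 7/8 for this vector.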
In R:
    x <- c(2,1,5,4,8,3,4,3)
    var(x)
    [1] 4.5
From scratch
Write a function called myvar() that computes the unbiased variance
of a list of numbers. Do it from scratch; in other words, don't use the
built-in functions var() or mean().