Correlation

The data for this exercise are in linregdata.csv, a comma-separated values (CSV) file with three columns: Income (dollars), Age (years) and Education (years). There are 10 observations (rows).
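As a starting point, here is a minimal sketch of one way to read such a file using only the Python standard library. It assumes the first row of the file is a header naming the columns and that every other cell is numeric; neither is stated above, so check the file first. The function name load_columns is hypothetical.

```python
import csv

def load_columns(path):
    """Read a CSV file into a dict mapping column name -> list of floats.

    Assumes the first row is a header and every other cell is numeric.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        columns = {name: [] for name in reader.fieldnames}
        for row in reader:
            for name, value in row.items():
                columns[name].append(float(value))
    return columns
```

If the header assumption holds, load_columns("linregdata.csv")["Income"] would return the Income column as a list of 10 floats.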

Estimate the correlation between Income and Age by calculating Pearson’s r.

Also report the p-value for the significance test of the correlation coefficient. The p-value is the probability of obtaining a sample correlation at least as extreme (in absolute value, for a two-sided test) as the one observed in the sample, under the null hypothesis that the true correlation in the population is zero (ρ = 0).
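One common way to obtain this p-value, assuming the two variables are approximately bivariate normal, is to convert r to a t statistic with n - 2 degrees of freedom. The symbols n (the number of observations, 10 here, giving 8 degrees of freedom) and T_{n-2} (a t-distributed random variable) are introduced for illustration and do not appear in the exercise itself.

```latex
t = r \sqrt{\frac{n - 2}{1 - r^{2}}},
\qquad
p = 2 \, P\left( T_{n-2} \ge |t| \right)
```

Most statistics packages report this two-sided p-value alongside r, so computing t by hand is mainly useful as a check.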

Bonus challenges

  1. Without using any built-in statistics functions, write code to compute Pearson’s r from scratch.
  2. Compute the p-value using a permutation test.
  3. Estimate confidence intervals for Pearson’s r using a bootstrap.
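The three bonus challenges can be sketched in plain Python as follows. This is one possible approach, not a reference solution: the function names, the default of 10000 resamples, the fixed seeds, the two-sided permutation test and the simple percentile bootstrap are all choices made here for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r: sum of cross-products of deviations over the
    square root of the product of the sums of squared deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=10000, seed=0):
    """Two-sided permutation p-value: shuffle y to break any pairing with x
    and count how often |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so rounding error does not hide exact ties.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            count += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (count + 1) / (n_perm + 1)

def bootstrap_ci(x, y, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample (x, y) pairs with replacement
    and take the alpha/2 and 1 - alpha/2 quantiles of the resampled r values."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        try:
            estimates.append(pearson_r(xs, ys))
        except ZeroDivisionError:
            # Resample with zero variance in x or y: r is undefined, skip it.
            continue
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper
```

With only 10 observations, the bootstrap interval is likely to be wide and somewhat unstable, so it is worth comparing it against the p-value from the permutation test and from a built-in routine.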