ANCOVA

The analysis of covariance (ANCOVA) model includes a continuous variable \(X\) (called a covariate) that is typically correlated with the dependent variable.

\[ Y_{ijk} = \mu + \beta X_{ij} + \alpha_{j} + \epsilon_{ijk} \]

Usually values of the covariate are collected, or available, prior to any experimental manipulation. This ensures that values of the covariate are independent of the experimental factors. If they are not independent, the logic of ANCOVA falls apart.

Including a covariate in the model results in two desirable effects:

  1. The error term (\(\epsilon_{ijk}\) in the Equation above) is reduced, resulting in a more powerful statistical test. Variance in the dependent variable that can be accounted for by scores on the covariate is removed from the error term.

  2. The estimate of the slope of the relationship between the covariate and the dependent variable (\(\beta\) in the equation above) allows you to numerically adjust scores on the dependent variable, and ask the (often useful) question: what would scores on the dependent variable have been if the groups had not differed on the covariate?

To evaluate the effect of the treatment factor we would compare the following restricted and full models:

\[ \begin{eqnarray} full: &Y_{ijk} &= \mu + \beta X_{ij} + \alpha_{j} + \epsilon_{ijk}\\ restricted: &Y_{ijk} &= \mu + \beta X_{ij} + \epsilon_{ijk} \end{eqnarray} \]

Notice how the covariate term \(\beta X_{ij}\) is present in both models. We are asking: is there an effect of the treatment factor (is the \(\alpha_{j}\) parameter zero) after we have removed the variance accounted for by the covariate?

Notice also how in the equation for the full model the \(\mu + \alpha_{j}\) and \(\beta\) parameters can be regarded as the intercept and slope, respectively, of the straight line relating the covariate \(X\) to the dependent variable \(Y\) for group \(j\). Let’s look at this graphically.

Imagine we have data from 10 subjects, 5 in each of two groups (e.g. control and experimental). We have scores on the dependent variable \(Y\), and we also have, for each subject, scores on a covariate \(X\). Let’s say for the sake of an example, we’re testing whether a new drug is effective at boosting memory. The two groups are placebo and drug, and the dependent variable is number of items recalled on a memory test. Let’s say we know from previous research that IQ scores are predictive (are correlated with) memory, and so we also have IQ scores for each of our 10 subjects.

We can plot the data in a particular way to illustrate the principles of ANCOVA. We will plot a point for each subject, with their score on the dependent variable on the vertical axis, their score on the covariate on the horizontal axis, and we will use different plot symbols to denote membership in the placebo versus drug group.

# memory scores for 10 subjects: first 5 placebo, last 5 drug
memoryscore <- c(7.0, 7.7, 5.4, 10.0, 11.7, 11.0, 9.4, 14.8, 17.5, 16.3)
group <- factor(c(rep("placebo", 5), rep("drug", 5)),
    levels=c("placebo", "drug"))
iq <- c(112,111,105,115,119,116,112,121,118,123)
# open circles for placebo, filled red triangles for drug
plot(iq,memoryscore, pch=c(rep(1,5),rep(24,5)), bg=c(rep(1,5),rep("red",5)))
legend("topleft", c("placebo","drug"), pch=c(1,24), pt.bg=c(1,"red"))

We can see three important things:

  1. There seems to be a difference between groups on the dependent variable. Mean memory score for the placebo group is 8.36 and for the drug group is 13.8 (these group means can be verified with the short snippet right after this list).

  2. There is clearly a linear relationship between memory score and IQ.

  3. The two groups also seem to differ on their IQ scores. Mean IQ for the placebo group is 112.4 and for the drug group is 118.
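
As a quick check, the group means quoted above can be computed directly from the vectors defined in the first code snippet; tapply() applies a function (here mean) within each level of group:

# group means of the dependent variable and of the covariate
tapply(memoryscore, group, mean)   # placebo 8.36, drug 13.8
tapply(iq, group, mean)            # placebo 112.4, drug 118.0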

If we were to run a simple one-way between subjects ANOVA, ignoring the covariate, we would get the following:

options(contrasts = c("contr.sum", "contr.poly"))
mod.1 <- lm(memoryscore ~ group)
a <- anova(mod.1)
print(a)
## Analysis of Variance Table
## 
## Response: memoryscore
##           Df Sum Sq Mean Sq F value  Pr(>F)  
## group      1 73.984  73.984  8.1043 0.02158 *
## Residuals  8 73.032   9.129                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We would see that the omnibus test of the effect of group is significant at the 0.05 level (p = 0.022). Make a note of the within-group variance: the error term in the ANOVA output (the Mean Sq value for the Residuals) is 9.129.
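
If you want to carry that error term forward for comparison with the ANCOVA later, one way to pull it out of the ANOVA table a is shown below (the name ms.error.anova is just for illustration):

# residual (error) mean square: row 2 ("Residuals") of the ANOVA table
ms.error.anova <- a[["Mean Sq"]][2]
print(ms.error.anova)   # 9.129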

We can visualize it this way (see below): the horizontal lines are the mean memory scores of each group. The vertical dashed lines represent the error under this model (the one-way ANOVA, which ignores the covariate).

plot(iq,memoryscore, pch=c(rep(1,5),rep(24,5)), bg=c(rep(1,5),rep("red",5)))
legend("topleft", c("placebo","drug"), pch=c(1,24), pt.bg=c(1,"red"))
# for each group: horizontal line at the group mean, dashed vertical
# lines from each score to that mean (the residuals)
for (j in 1:2) {
    if (j==1) {
        w <- c(1,2,3,4,5)
        cl <- 1
    }
    if (j==2) {
        w <- c(6,7,8,9,10)
        cl <- "red"
    }
    lines(c(min(iq[w]),max(iq[w])), c(mean(memoryscore[w]),
        mean(memoryscore[w])), col=cl)
    for (i in 1:length(w)) {
        lines(c(iq[w[i]],iq[w[i]]), c(memoryscore[w[i]],
            mean(memoryscore[w])), lty=2, col=cl)
    }
}

The idea of ANCOVA then is to account for, or represent, the variation in memory score that can be captured by variation in IQ. This is done by modeling the relationship between memory score and IQ as a straight line. We can run the ANCOVA in the following way (see the code snippet below), which is rather neat because we can explicitly state the full and restricted models as linear models using the lm() function, and then perform an F-test comparing the two models using the anova() function. I also show below that you can simply apply the anova() function to the full model alone.

mod.full <- lm(memoryscore ~ iq + group)
mod.rest <- lm(memoryscore ~ iq)
a2 <- anova(mod.full, mod.rest)
print(a2)
## Analysis of Variance Table
## 
## Model 1: memoryscore ~ iq + group
## Model 2: memoryscore ~ iq
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1      7 20.913                           
## 2      8 31.273 -1    -10.36 3.4678 0.1049
a3 <- anova(mod.full)
print(a3)
## Analysis of Variance Table
## 
## Response: memoryscore
##           Df  Sum Sq Mean Sq F value    Pr(>F)    
## iq         1 115.743 115.743 38.7412 0.0004349 ***
## group      1  10.360  10.360  3.4678 0.1048703    
## Residuals  7  20.913   2.988                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Notice how much lower our error term is (Mean Sq Residuals = 2.988) than before when we ran the ANOVA without the covariate. Also notice that the effect of group is no longer significant! For the test of the effect of group, p = 0.105.
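
To see the reduction side by side, we can compare the residual mean square saved earlier (ms.error.anova) with the corresponding value from the ANCOVA table a3 (again, the object name is just for illustration):

# error term without vs with the covariate in the model
ms.error.ancova <- a3[["Mean Sq"]][3]
print(c(anova=ms.error.anova, ancova=ms.error.ancova))   # 9.129 vs 2.988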

Graphically we can represent the ANCOVA full model by plotting, for each group, the straight line with intercept \(\mu + \alpha_{j}\) and slope \(\beta\). Remember that the slope \(\beta\) is common to both groups, while the intercept \(\mu + \alpha_{j}\) differs between them.

plot(iq,memoryscore, pch=c(rep(1,5),rep(24,5)), bg=c(rep(1,5),rep("red",5)))
legend("topleft", c("placebo","drug"), pch=c(1,24), pt.bg=c(1,"red"))
mod.full.coef <- coef(mod.full)
print(mod.full.coef)
## (Intercept)          iq      group1 
## -50.7033113   0.5363135  -1.2183223
placebo.pred <- predict(mod.full)   # fitted values for all 10 subjects (both groups)
for (j in 1:2) {
    if (j==1) {
        w <- c(1,2,3,4,5)
        cl <- 1
    }
    if (j==2) {
        w <- c(6,7,8,9,10)
        cl <- "red"
    }
    lines(c(min(iq[w]),max(iq[w])),
        c(min(placebo.pred[w]), max(placebo.pred[w])), col=cl)
}

The coef() function gives us the model coefficients for mod.full, which is our full model (the output of the lm() function applied to our full model formula above). The iq value (0.536) is the estimate of the common slope of the lines, \(\beta\) in our equations above. The other two values are tricky, and are an R gotcha: because we set sum-to-zero contrasts earlier (contr.sum), the Intercept value (-50.703) is not the intercept for the placebo group; it is \(\mu\), the average of the two group intercepts. The group1 value (-1.218) is \(\alpha_{1}\), the deviation of the first level of the group factor (placebo) from that average, and since the \(\alpha_{j}\) sum to zero, \(\alpha_{2}\) = +1.218. The intercept for the placebo group is therefore \(\mu + \alpha_{1}\) = -50.703 - 1.218 = -51.921, and the intercept for the drug group is \(\mu + \alpha_{2}\) = -50.703 + 1.218 = -49.485.
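
Here is a small sketch (the object names int.placebo and int.drug are just for illustration) that recovers the two group intercepts from the coefficients; dummy.coef() reports the same quantities with the contrast coding expanded back out:

b <- coef(mod.full)
int.placebo <- b["(Intercept)"] + b["group1"]   # mu + alpha_1, about -51.92
int.drug    <- b["(Intercept)"] - b["group1"]   # mu + alpha_2, about -49.49
print(unname(c(int.placebo, int.drug)))
dummy.coef(mod.full)   # lists mu, beta, and both alpha_j explicitly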

Now, the test for the effect of group is essentially a test of whether a full model that contains separate intercepts, \(\mu + \alpha_{j}\), one for each level of the group factor (placebo, drug), fits significantly better than a restricted model that contains only a single intercept. In the restricted model the \(\alpha_{j}\) term is dropped, so the intercept is simply \(\mu\), and since it carries no subscript \(j\), it is the same for every level of the group factor.

In the example above, the intercepts of the two lines corresponding to the full model (the values of \(\alpha_{1}\) and \(\alpha_{2}\)) are similar enough to each other that the restricted model probably fits the data just as well (indeed it does, according to the F-test we performed above).

plot(iq,memoryscore, pch=c(rep(1,5),rep(24,5)), bg=c(rep(1,5),rep("red",5)))
legend("topleft", c("placebo","drug"), pch=c(1,24), pt.bg=c(1,"red"))
for (j in 1:2) {
    if (j==1) {
        w <- c(1,2,3,4,5)
        cl <- 1
    }
    if (j==2) {
        w <- c(6,7,8,9,10)
        cl <- "red"
    }
    lines(c(min(iq[w]),max(iq[w])),
        c(min(placebo.pred[w]), max(placebo.pred[w])), col=cl)
    for (i in 1:length(w)) {
        lines(c(iq[w[i]],iq[w[i]]),
            c(memoryscore[w[i]], placebo.pred[w[i]]), lty=2, col=cl)
    }
}

Adjusted Means

Having run an ANCOVA, and having estimated the relationship between IQ and memory score, we are now in a position to answer the interesting question: what would memory scores have been for the placebo and drug groups IF their IQ scores had been similar? Of course their IQ scores were not similar, but ANCOVA allows us to ask: if they had been similar, then, taking into account what we know about how memory score varies with IQ (independent of any experimental manipulation, like placebo vs drug), what would their memory scores have been?

We are going to ask R to generate predicted values of memory score given the full model that includes both the \(\alpha_{j}\) and \(\beta\) parameters. We will then ask R to predict what memory score would have been, for both the placebo and drug groups, if their scores on IQ had been the same. We have to choose some arbitrary value of IQ; the usual choice is simply the grand mean of IQ across both groups. We will use the predict() command to generate predictions from our full model.

group.predict <- factor(c("placebo","drug"), levels=c("placebo","drug"))
iq.predict <- rep(mean(iq), 2)
data.predict <- data.frame(group=group.predict, iq=iq.predict)
print(data.predict)
##     group    iq
## 1 placebo 115.2
## 2    drug 115.2
adjmeans <- predict(mod.full, data.predict)
data.predict <- data.frame(data.predict, memoryscore=adjmeans)
print(data.predict)
##     group    iq memoryscore
## 1 placebo 115.2    9.861678
## 2    drug 115.2   12.298322

So we see that our full model predicts that if the placebo and drug groups had been equal on IQ (for both groups, IQ = 115.2), then the mean memory score for the placebo group would have been 9.86 and for the drug group 12.30. Compare these to the original unadjusted means (placebo = 8.36, drug = 13.8): adjusting for the group difference in IQ reduced the difference in memory score, from 5.44 to about 2.44.
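
The same adjusted means can be computed by hand with the classic formula: take each group's observed mean memory score and slide it along the common slope from that group's mean IQ to the grand mean IQ. A short sketch, reusing quantities defined above (adj.by.hand is just an illustrative name):

beta <- coef(mod.full)["iq"]                      # common slope, about 0.536
group.means <- tapply(memoryscore, group, mean)   # 8.36, 13.8
group.iq <- tapply(iq, group, mean)               # 112.4, 118.0
adj.by.hand <- group.means - beta * (group.iq - mean(iq))
print(adj.by.hand)   # placebo 9.86, drug 12.30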

Graphically we can see how the prediction works by drawing a vertical line at IQ = 115.2 and finding where each group's regression line crosses it; the corresponding values of memoryscore are the adjusted means.

plot(iq,memoryscore, pch=c(rep(1,5),rep(24,5)), bg=c(rep(1,5),rep("red",5)))
legend("topleft", c("placebo","drug"), pch=c(1,24), pt.bg=c(1,"red"))
for (j in 1:2) {
    if (j==1) {
        w <- c(1,2,3,4,5)
        cl <- 1
    }
    if (j==2) {
        w <- c(6,7,8,9,10)
        cl <- "red"
    }
    lines(c(min(iq[w]),max(iq[w])),
        c(min(placebo.pred[w]), max(placebo.pred[w])), col=cl)
}
# vertical line at the grand mean IQ; horizontal lines at the adjusted means
abline(v=mean(iq), lty=3, col="blue")
abline(h=adjmeans[1], lty=2, col="black")
abline(h=adjmeans[2], lty=2, col="red")