given two samples: were they drawn from two populations with different means?
or were they drawn from one or more populations with the same mean
(so that the sample differences are due to random chance)?
# A tibble: 12 × 2
       x g
   <dbl> <fct>
 1    67 control
 2   103 control
 3   109 control
 4    74 control
 5    93 control
 6   106 control
 7   115 treatment
 8   124 treatment
 9   130 treatment
10   120 treatment
11   138 treatment
12   126 treatment

          g     x
1   control  92.0
2 treatment 125.5
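The group means above can be checked by hand or with a few lines of code. A minimal sketch in Python (standard library only; the variable names are illustrative and not from the slides, whose output comes from R):

```python
from statistics import mean

# Observations copied from the table above
control = [67, 103, 109, 74, 93, 106]
treatment = [115, 124, 130, 120, 138, 126]

# Mean of each group, and the difference between the two sample means
group_means = {"control": mean(control), "treatment": mean(treatment)}
difference = group_means["treatment"] - group_means["control"]

print(group_means)
print("difference in means:", difference)
```

The difference in sample means is the quantity the questions above are about: is it larger than random sampling alone would typically produce?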
T-test
what is the probability of taking two samples of size N=6 from population(s) with the same mean and observing a difference in means at least as large as the one observed?
this is the p-value: the probability of a difference at least this large under H_{0}, the null hypothesis
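To make the question concrete, the sketch below computes Student's t statistic (pooled variance, which assumes equal population variances) for the control and treatment data, and then approximates the "probability under H_{0}" with an exact permutation test. The permutation test is a stand-in chosen here because it needs no distribution tables; it is not the t-test itself, and its p-value will not match the t-test's exactly. All variable names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

control = [67, 103, 109, 74, 93, 106]
treatment = [115, 124, 130, 120, 138, 126]
n1, n2 = len(control), len(treatment)

# Student's t statistic with a pooled variance estimate (df = n1 + n2 - 2)
pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treatment)) / (n1 + n2 - 2)
observed_diff = mean(treatment) - mean(control)
t_stat = observed_diff / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Exact permutation p-value: split the 12 observations into two groups of 6
# in every possible way, and count the splits whose absolute difference in
# means is at least as large as the observed one.
pooled = control + treatment
total_sum = sum(pooled)
n_splits = 0
n_extreme = 0
for group_a in combinations(pooled, n1):
    mean_a = sum(group_a) / n1
    mean_b = (total_sum - sum(group_a)) / n2
    n_splits += 1
    # small tolerance guards against floating-point rounding
    if abs(mean_b - mean_a) >= abs(observed_diff) - 1e-9:
        n_extreme += 1
p_perm = n_extreme / n_splits

print("t =", round(t_stat, 3), "df =", n1 + n2 - 2)
print("permutation p =", n_extreme, "/", n_splits)
```

In these data every control value is smaller than every treatment value, so the observed split is the most extreme of the 924 possible splits; only it and its mirror image are counted, giving 2/924.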
ANOVA: N groups
H_{0}: groups sampled from population(s) with the same mean
H_{1}: groups not sampled from population(s) with the same mean
(i.e. at least one group was sampled from a population with a different mean)
p-value: what is the probability of observing differences between groups at least as large as the ones observed, if H_{0} is true?
that is, if we sampled three groups of size N=4 from population(s) with the same mean
           g      x
1 treatment1  98.50
2 treatment2 107.75
3 treatment3 123.00
ANOVA
ANOVA stands for ANalysis Of VAriance
ANOVA is a statistical test that compares the means of two or more groups
many forms of ANOVA exist, but we will start with the simplest:
one-way between-subjects ANOVA
(read Navarro, chapter 14)
one-way between-subjects ANOVA
one-way: one independent variable
later we will see two-factor ANOVA and n-factor ANOVA
between-subjects: each participant contributes an observation in only one group
later we will see within-subjects ANOVA and mixed ANOVA
Omnibus F-test
ANOVA computes an “omnibus” F-statistic, which is a ratio of two variances
(omnibus means “overall”)
the numerator is the between-groups variance
the denominator is the within-groups variance
the omnibus F-statistic addresses the “overall” question:
“are the means of the groups all the same, or not?”
Omnibus F-test
F = \frac{\mathrm{BetweenVariance}}{\mathrm{WithinVariance}}
what is F going to be?
F-ratio far above 1.0: between-groups variance is much larger than within-groups variance
F-ratio below 1.0: between-groups variance is smaller than within-groups variance
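As an illustration of the ratio, here is a minimal sketch that computes F from raw data. It uses the control and treatment observations from the T-test slides, since the raw data for the three-group example are not shown; the function name `f_ratio` and its return values are hypothetical choices for this example.

```python
from statistics import mean

def f_ratio(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_obs = [x for g in groups for x in g]
    grand_mean = mean(all_obs)
    group_means = [mean(g) for g in groups]

    # Between-groups: how far each group mean is from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-groups: how far each observation is from its own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_obs) - len(groups)
    between_variance = ss_between / df_between
    within_variance = ss_within / df_within
    return between_variance / within_variance, df_between, df_within

control = [67, 103, 109, 74, 93, 106]
treatment = [115, 124, 130, 120, 138, 126]
F, df_b, df_w = f_ratio([control, treatment])
print("F =", round(F, 2), "with df =", df_b, "and", df_w)
```

With only two groups, F is the square of the pooled-variance t statistic, so the two tests agree. Here F is far above 1.0 because the group means differ by much more than the spread within each group would suggest.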
Omnibus F-test
null hypothesis is that the population means of all groups are equal
H_{0}: each group was sampled from population(s) with the same mean
H_{1}: at least one group was sampled from a population with a different mean
the p-value for the omnibus F-test is the probability of observing an F-statistic at least as large as the one computed, assuming that the null hypothesis is true
Distribution of F under H_{0}
under the null hypothesis, groups are sampled from population(s) with the same mean
but random sampling results in differences between sample groups
the F-statistic is an overall (omnibus) measure of the differences between all groups
under the null hypothesis we expect the F-statistic to be close to 1.0 most of the time
but due to random sampling, under the null hypothesis, sometimes it will be larger
the p-value tells us how likely it is to observe an F-statistic at least as large as the one computed, under H_{0}
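The distribution of F under H_{0} can be approximated by simulation. The sketch below repeatedly draws three groups of size N=4 from a single normal population and computes F each time. The population mean (100), standard deviation (15), number of simulations, seed, and all names are assumptions made for illustration; 4.26 is the tabled 5% critical value for F with 2 and 9 degrees of freedom.

```python
import random
from statistics import mean

def f_statistic(groups):
    """One-way between-subjects F: between-groups variance / within-groups variance."""
    all_obs = [x for g in groups for x in g]
    grand_mean = mean(all_obs)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(all_obs) - len(groups)))

random.seed(1)  # fixed seed so the simulation is repeatable
n_sims = 5000
null_f = []
for _ in range(n_sims):
    # H0 is true by construction: all three groups come from the same population
    groups = [[random.gauss(100, 15) for _ in range(4)] for _ in range(3)]
    null_f.append(f_statistic(groups))

critical_value = 4.26  # tabled 5% critical value of F(2, 9)
prop_above_critical = sum(f > critical_value for f in null_f) / n_sims
prop_below_one = sum(f < 1.0 for f in null_f) / n_sims

print("proportion of F above 4.26:", prop_above_critical)
print("proportion of F below 1.0:", prop_below_one)
```

Most simulated F values are modest, but a small fraction (about 5%) exceed the critical value even though H_{0} is true. A simulated p-value for an observed F is the proportion of null F values at least that large.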
Omnibus F-test
following a significant omnibus F-test, we can perform follow-up tests to determine which groups differ from each other
not this week: we will cover follow-up tests next week
if the omnibus F-test is not significant, we should stop
the omnibus F-test protects us from making more Type I errors than we want
more about this next week
ANOVA Table/Formulas
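\mathrm{SS_{b}} = sum of squares between
squared difference between each group mean and the grand mean, weighted by group size and summed over groups
\mathrm{SS_{w}} = sum of squares within
squared difference between each observation and the mean of its own group, summed over all observations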
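Written out, the standard one-way ANOVA quantities are as follows. This is a sketch in assumed notation, not the slide's original table: k is the number of groups, n_{j} the size of group j, N the total number of observations, x_{ij} observation i in group j, \bar{x}_{j} the mean of group j, and \bar{x} the grand mean.

```latex
\mathrm{SS_{b}} = \sum_{j=1}^{k} n_{j} \left( \bar{x}_{j} - \bar{x} \right)^{2},
\qquad \mathrm{df_{b}} = k - 1

\mathrm{SS_{w}} = \sum_{j=1}^{k} \sum_{i=1}^{n_{j}} \left( x_{ij} - \bar{x}_{j} \right)^{2},
\qquad \mathrm{df_{w}} = N - k

\mathrm{BetweenVariance} = \frac{\mathrm{SS_{b}}}{\mathrm{df_{b}}},
\qquad
\mathrm{WithinVariance} = \frac{\mathrm{SS_{w}}}{\mathrm{df_{w}}}

F = \frac{\mathrm{BetweenVariance}}{\mathrm{WithinVariance}}
```

Dividing each sum of squares by its degrees of freedom turns it into a variance (a mean square), which is why F is a ratio of two variances.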
read Navarro, chapter 14, for a worked example, going from the raw data to the ANOVA table