Statistic of the Week

Analysis of Variance - ANOVA and the F-Test

7400.685.080 Research Methods in HE/FE - - - - Inst: David D. Witt, Ph.D.

Think back to the t-test discussion. Remember that a t-test tests whether the difference between two means is statistically significant. For example, it can tell us whether the divorce rate is higher for poor people than for rich people.

The average number of divorces for poor people and the average number of divorces for rich people are the TWO MEANS - t-testing tells us if the difference is significant or not.

The next logical question is: what do we do if we have three or more means?

What if we added more economic groups to our study of divorce rates:
Group 1 = Rich / Group 2 = Middle Class / Group 3 = Working Class / Group 4 = Poor

We could do a series of t-tests between all the possible Groups
(i.e., 1 and 2, 1 and 3, 1 and 4, 2 and 3, 2 and 4, and 3 and 4).

There are two reasons why this is not a good idea.

1. If we run a large number of tests of significance, we can expect as a matter of chance that a certain proportion of the tests will be significant. At the .05 level, about 5 out of every 100 tests will come out significant by chance alone, even when no true difference exists.

2. In research designs with more than one independent variable, we have to acknowledge that the independent variables DO influence each other, as well as the Dependent variable.
The t-test series wouldn't account for, or "partial out" these interaction effects between independent variables.
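Reason #1 can be made concrete with a small sketch. It counts the pairwise t-tests needed for the four groups and estimates the chance that at least one of them is significant purely by accident. The variable names are illustrative, and the formula assumes the tests are independent of one another, which is only approximately true here.

```python
from itertools import combinations

groups = ["Rich", "Middle Class", "Working Class", "Poor"]
alpha = 0.05  # significance level used for each individual t-test

# Every possible pair of groups that a series of t-tests would compare.
pairs = list(combinations(groups, 2))
num_tests = len(pairs)

# Chance of at least one false positive across all the tests.
# ASSUMPTION: the tests are treated as independent, so this is approximate.
familywise_error = 1 - (1 - alpha) ** num_tests

print(num_tests)                   # 6
print(round(familywise_error, 3))  # 0.265
```

Under that assumption, six tests at the .05 level carry roughly a one-in-four chance of at least one erroneous "significant" result.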

The statistical procedure known as Analysis of Variance, used in conjunction with an F-Test, addresses problems #1 and #2 above when we wish to compare three or more means (that is, three or more groups of an independent variable, such as our four income groups).

Here's an example of analysis of variance using schools and kids. We draw a sample of n = 6 kids from each of k = 4 schools. Each school (the independent variable) is in a different neighborhood and uses different teaching techniques. Each student is measured on the S.A.T. (the dependent variable), and the results are:

                School 1   School 2   School 3   School 4

                   15         17         17         15
                   14         17         15         17
                   15         19         13         19
                   16         19         18         18
                   17         20         18         16
                   13         16         18         17

Sum of X           90        108         99        102
Mean of X        15.00      18.00      16.50      17.00
Variance of X     2.00       2.40       4.30       2.00

Sum of Means = 15.00 + 18.00 + 16.50 + 17.00 = 66.50
Grand Mean = 66.50 / 4 = 16.625

The only thing new here is the Grand Mean idea.
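The sums, means, variances, and grand mean for the four schools can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the dictionary layout and variable names are assumptions made for illustration.

```python
from statistics import mean, variance

# SAT scores from the example, one list per school (n = 6 kids each).
scores = {
    "School 1": [15, 14, 15, 16, 17, 13],
    "School 2": [17, 17, 19, 19, 20, 16],
    "School 3": [17, 15, 13, 18, 18, 18],
    "School 4": [15, 17, 19, 18, 16, 17],
}

sums = {school: sum(xs) for school, xs in scores.items()}
means = {school: mean(xs) for school, xs in scores.items()}
# statistics.variance is the sample variance (it divides by n - 1).
variances = {school: variance(xs) for school, xs in scores.items()}

# Grand mean: the mean of the four school means. Because every school
# has the same n, this equals the mean of all 24 scores.
grand_mean = mean(list(means.values()))

for school in scores:
    print(school, sums[school], means[school], variances[school])
print("Grand mean:", grand_mean)
```

The grand mean is simply the overall average that every school mean will be compared against.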

Now - here comes the Analysis of Variance Part!!!

First, we calculate the WITHIN GROUPS VARIANCE:

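w = (s1² + s2² + s3² + s4²) / k, where each s² is the variance of the scores within one school and k is the number of groups. (A simple average works here because every group has the same n.)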

By substitution we have: w = (2.0 + 2.4 + 4.3 + 2.0) / 4 = 10.7 / 4 = 2.675
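As a cross-check on the within groups variance, the sketch below computes it two ways: by averaging the four school variances, and by pooling the squared deviations from each school's own mean and dividing by N - k. The two routes agree only because the groups are the same size. The variable names are illustrative.

```python
from statistics import mean, variance

scores = [
    [15, 14, 15, 16, 17, 13],  # School 1
    [17, 17, 19, 19, 20, 16],  # School 2
    [17, 15, 13, 18, 18, 18],  # School 3
    [15, 17, 19, 18, 16, 17],  # School 4
]
k = len(scores)                    # number of groups
N = sum(len(xs) for xs in scores)  # total number of subjects

# Route 1: average the k sample variances (valid for equal group sizes).
w_average = mean([variance(xs) for xs in scores])

# Route 2: pool the squared deviations from each school's own mean,
# then divide by the within groups degrees of freedom (N - k).
ss_within = sum((x - mean(xs)) ** 2 for xs in scores for x in xs)
w_pooled = ss_within / (N - k)

print(round(w_average, 3), round(w_pooled, 3))  # 2.675 2.675
```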

Now we calculate the BETWEEN GROUPS VARIANCE:

School   x = Sample mean - Grand mean     x²     n    x² × n
#1       15.00 - 16.625 = -1.625         2.64    6    15.84
#2       18.00 - 16.625 =  1.375         1.89    6    11.34
#3       16.50 - 16.625 = -0.125         0.02    6     0.12
#4       17.00 - 16.625 =  0.375         0.14    6     0.84
                                        N = 24   Sum of (x² × n) = 28.14

From here we can calculate the between groups variance: b = Sum of (x² × n) / (k-1)

b = 28.14/(4-1) = 9.38
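The between groups calculation can be sketched the same way. Each school mean's squared deviation from the grand mean is weighted by the group size n, summed, and divided by k - 1. The code carries full precision, so it gives 28.125 and 9.375 rather than the 28.14 and 9.38 above, which come from rounding each squared deviation to two decimals first. The variable names are illustrative.

```python
from statistics import mean

scores = [
    [15, 14, 15, 16, 17, 13],  # School 1
    [17, 17, 19, 19, 20, 16],  # School 2
    [17, 15, 13, 18, 18, 18],  # School 3
    [15, 17, 19, 18, 16, 17],  # School 4
]
k = len(scores)     # number of groups
n = len(scores[0])  # kids per school (ASSUMES equal group sizes)

group_means = [mean(xs) for xs in scores]
grand_mean = mean(group_means)

# Squared deviation of each school mean from the grand mean, times n.
ss_between = sum(n * (m - grand_mean) ** 2 for m in group_means)
b = ss_between / (k - 1)

print(ss_between, b)  # 28.125 9.375
```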

Now we'll calculate the F-ratio: F = b/w = 9.38/2.675 = 3.51

Just like the t-test, we have to decide whether this obtained F-ratio is significantly different from an F-ratio that could have occurred by chance.

To do this, we need one more piece of information: the degrees of freedom (df) associated with each variance in the problem.

Between Groups degrees of freedom: dfb = (k-1) = (4-1) = 3
where "k" is the number of groups.

Within Groups degrees of freedom: dfw = (N-k) = (24-4) = 20
where "N" is the total number of subjects and "k" is again the number of groups.
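The two degrees of freedom follow directly from the group sizes. The helper below, degrees_of_freedom, is a hypothetical name used only for this sketch; it also shows that the two values add up to N - 1, the total degrees of freedom.

```python
def degrees_of_freedom(group_sizes):
    """Return (between groups df, within groups df) for a one-way design."""
    k = len(group_sizes)        # number of groups
    total_n = sum(group_sizes)  # total number of subjects, N
    return k - 1, total_n - k

df_between, df_within = degrees_of_freedom([6, 6, 6, 6])
print(df_between, df_within)   # 3 20
print(df_between + df_within)  # 23, which is N - 1
```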

Find the Table of Critical Values of F (below, like the t-values table)
and look up df = 3 (between groups) and df = 20 (within groups):

the Critical Value of F at alpha = .05 is 3.10

the Obtained Value of F (calculated) was 3.51

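Interpretation: Since our Obtained Value (3.51) is bigger than the Critical Value (3.10), we conclude that the school means are not all equal: at least one mean differs from the others by more than we would expect from chance alone at the .05 level. Note that the F-test does not tell us which means differ.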
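The whole procedure, from raw scores to the decision, fits in one short sketch. The function name f_ratio is hypothetical, it assumes equal group sizes as in this example, and the critical value 3.10 is copied from the F-table lookup above rather than computed. Without intermediate rounding the obtained F comes out at about 3.50 instead of 3.51, and the conclusion is the same.

```python
from statistics import mean, variance

def f_ratio(groups):
    """One-way ANOVA F-ratio; ASSUMES every group has the same size."""
    k = len(groups)
    n = len(groups[0])
    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)
    # Between groups variance: n-weighted squared deviations over (k - 1).
    between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    # Within groups variance: the average of the group sample variances.
    within = mean([variance(g) for g in groups])
    return between / within

scores = [
    [15, 14, 15, 16, 17, 13],  # School 1
    [17, 17, 19, 19, 20, 16],  # School 2
    [17, 15, 13, 18, 18, 18],  # School 3
    [15, 17, 19, 18, 16, 17],  # School 4
]

F_CRITICAL = 3.10  # from the F-table: df = 3 and 20, alpha = .05
f_obtained = f_ratio(scores)
reject_null = f_obtained > F_CRITICAL

print(round(f_obtained, 2), reject_null)  # 3.5 True
```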


Click here for F-Table