Statistic of the Week
BiVariate Statistics
Contingency Tables & the Chi Square Statistic
7400.685.080 Research Methods in HE/FE                                   Inst: D. Witt
Up to now we have been looking at univariate statistics (one variable at a time). This is fine for t-tests and for the assessment of skewness and kurtosis in all our variables of interest.
Now we can begin to take a look at real analysis with two variables. This is bivariate statistical analysis. (By the end of the semester, we will be doing multivariate statistics!)
There are several statistics that operate under the bivariate label. The first one is the Chi Square statistic. But first ... we have to have an understanding of Contingency Tables (also known as crosstabulations, or crosstabs for short).
Crosstabulations explain the distribution of values that two variables have in common.
For example: Suppose we have a theory that hypothesizes a relationship between:
Gender (1=male/2=female) and Attitudes Toward Gun Control (1=favor/2=don't favor).
That is, we hypothesize that men are more in favor of gun control than women.
So Hyp. 1: Men are more in favor of gun control than women.
We construct a questionnaire that includes these two concepts, distribute it, and code our data into the computer.
When we ask the computer for a Frequency Distribution of the variables, we can fill in a table that reflects the following:
for Gender: 30% of the sample are men  70% of the sample are women
for Gun Control: 60% of the sample favor control  40% of the sample don't favor control
If we were to ask the computer to spit out a Contingency table using the variables Gender and Gun Control we'd get this:
Contingency Table (Observed percentages)

                     Gender
Favor Gun Control    Male               Female              Marginals
Yes                  Cell1 Obs. 8%      Cell2 Obs. 52%      60%
No                   Cell3 Obs. 22%     Cell4 Obs. 18%      40%
Marginals            30%                70%                 100%
NOTICE: The computer gives us the Observed cell percentages, and:
as you add down the Male column of cells, 8% + 22% = 30%
as you add across the Yes row of cells, 8% + 52% = 60%
as you add down the marginals column, 60% + 40% = 100%, and
as you add across the marginals row, 30% + 70% = 100%
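The additions in the NOTICE list can be checked with a few lines of Python. This is only a sketch: the dictionary layout and the variable names are made up for illustration, and the cell values are the Observed percentages from the table above.

```python
# Observed cell percentages from the contingency table above.
# The (row, column) keys and all names here are illustrative only.
observed = {
    ("yes", "male"): 8, ("yes", "female"): 52,
    ("no", "male"): 22, ("no", "female"): 18,
}

rows = ["yes", "no"]
cols = ["male", "female"]

# Row marginals: add across each row of cells.
row_marginals = {r: sum(observed[(r, c)] for c in cols) for r in rows}
# Column marginals: add down each column of cells.
col_marginals = {c: sum(observed[(r, c)] for r in rows) for c in cols}
# Grand total: add up all four cells.
grand_total = sum(observed.values())

print(row_marginals)
print(col_marginals)
print(grand_total)
```

The row marginals, the column marginals, and the four cells should each add up to the same grand total; if they don't, a cell was coded wrong.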
This, children, is a two-by-two (2x2) contingency table, the simplest form of one. It tells us what we really observed, but doesn't say much about whether our observations are different from what we would normally expect to see. To do that, we have to put Expected percentages next to the Observed ones in our Contingency Table:
Knowing the Observed Scores (Fo) from the table above, we calculate the Expected Scores (Fe) for each cell (k) in the contingency table using this formula:
Fe_{k} = (row total x column total)/grand total
We have four cells, so we calculate four Fe's. The Fo's we already know:

Cell    Expected (Fe)                       Observed (Fo)
1       Fe_{1} = (60 x 30)/100 = 18%        Fo_{1} = 8%
2       Fe_{2} = (60 x 70)/100 = 42%        Fo_{2} = 52%
3       Fe_{3} = (40 x 30)/100 = 12%        Fo_{3} = 22%
4       Fe_{4} = (40 x 70)/100 = 28%        Fo_{4} = 18%
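The same four Expected values can be reproduced in Python. This is a minimal sketch of the (row total x column total)/grand total formula; the function name `expected` and the dictionary layout are assumptions made for illustration.

```python
# Marginal percentages from the contingency table (names are illustrative).
row_totals = {"yes": 60, "no": 40}
col_totals = {"male": 30, "female": 70}
grand_total = 100


def expected(row_total, col_total, grand_total):
    """Fe for one cell = (row total x column total) / grand total."""
    return row_total * col_total / grand_total


# One Expected value per cell, keyed by (row, column).
fe = {
    (r, c): expected(rt, ct, grand_total)
    for r, rt in row_totals.items()
    for c, ct in col_totals.items()
}

for cell, value in fe.items():
    print(cell, value)
```

Because every cell uses the same formula, the sketch works unchanged for larger tables: just add more entries to `row_totals` or `col_totals`.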
Now we can insert Expected frequencies in the contingency table:
Contingency Table (Expected and Observed percentages)

                     Gender
Favor Gun Control    Male                                     Female                                   Marginals
Yes                  Cell1  Exp. Fe_{1} 18%  Obs. Fo_{1} 8%     Cell2  Exp. Fe_{2} 42%  Obs. Fo_{2} 52%    60%
No                   Cell3  Exp. Fe_{3} 12%  Obs. Fo_{3} 22%    Cell4  Exp. Fe_{4} 28%  Obs. Fo_{4} 18%    40%
Marginals            30%                                      70%                                      100%
To calculate the Chi^{2} statistic, use the formula:
Chi^{2} = \sum_{k} (Fo_{k} - Fe_{k})^{2} / Fe_{k}
and by substitution:
Chi^{2} = ((8 - 18)^{2} / 18) + ((52 - 42)^{2} / 42) + ((22 - 12)^{2} / 12) + ((18 - 28)^{2} / 28) = 19.84
So the Obtained (calculated) Chi^{2} Value is 19.84
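The arithmetic can be double-checked with a short Python sketch. Like the worked example, it plugs the percentages straight into the formula, which amounts to treating the sample as if it had 100 respondents; that sample size is an assumption here, not something stated in the example. The function name `chi_square` is illustrative.

```python
# Observed (Fo) and Expected (Fe) values for cells 1 through 4.
observed = [8, 52, 22, 18]
expected = [18, 42, 12, 28]


def chi_square(fo, fe):
    """Sum of (Fo - Fe)^2 / Fe over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(fo, fe))


chi2 = chi_square(observed, expected)
print(round(chi2, 2))
```

With real data you would normally feed in the raw cell counts rather than percentages, since the size of Chi^{2} depends on how many respondents there are.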
But is 19.84 a statistically significant chi square value?
We need to look in the chi square distribution table (below) ...
and ... we need to know the Degrees of Freedom (df's).
For a 2x2 table the df is 1 because, with the marginals fixed, as soon as one cell is filled, all the others are determined. In general, df = (number of rows - 1) x (number of columns - 1).
Look in the Distribution of Chi Square Table (below) and find the place where df = 1.
Follow from left to right until you find a listed "critical" value of Chi^{2} that is bigger than your calculated one, or until you run out of values.
For this example, df = 1 has a Critical Chi^{2} value of 10.827 at the 99.9th percentile, or a probability level of p<.001 (that's 100% - 99.9% = .1%, or p<.001).
Our calculated value of 19.84 is bigger than 10.827, so the result is statistically significant at p<.001.
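The table lookup can be cross-checked in Python. The sketch below relies on a shortcut that holds only for df = 1: the probability of a Chi^{2} value at least as large as x equals erfc(sqrt(x/2)), where `math.erfc` is the complementary error function in Python's standard library. The function name `chi2_p_value_df1` is made up for illustration, and larger tables (df > 1) need a different routine.

```python
import math


def chi2_p_value_df1(chi2):
    """Upper-tail probability of a chi-square value when df = 1.

    Valid for df = 1 only: P(X >= x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))


# Probability for the calculated value from the worked example.
p_obtained = chi2_p_value_df1(19.84)
# Probability at the critical value listed for the 99.9th percentile.
p_critical = chi2_p_value_df1(10.827)

print(p_obtained < 0.001)
print(round(p_critical, 3))
```

The critical value should come back at about .001, and the calculated value should come back well below it, which is the same conclusion the table gives.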
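Interpreting these data, we can say either:
men differ significantly from women when it comes to favoring gun control,
or
people who favor gun control are more likely to be women than men.
Note that the difference runs opposite to Hyp. 1: in this sample it is the women, not the men, who are more in favor of gun control.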
Name ______________________________________________________
Homework Assignment: Contingency Tables and the Chi Square Statistic
1. You are given the following data concerning the relationship between education and the type of community where subjects were raised.
Community in which respondent lived most of the time from age 13 to 19

Education            Rural                        Urban                        Row Marginals
12 years or less     Cell1 Exp._____ Obs. 35%     Cell2 Exp._____ Obs. 20%     55%
Over 12 years        Cell3 Exp._____ Obs. 15%     Cell4 Exp._____ Obs. 30%     45%
Column Marginals     50%                          50%                          100%
a. Fill in the table above with Fe_{k} values (expected values). Show your work!
b. What is the Calculated Chi^{2} value?
c. Look up the expected, or "critical", Chi^{2} value in the table of chi square distribution.
d. Summarize the nature of the relationship in a few sentences.
e. Generalize from the data.
2. Let us suppose you are interested in studying the relationship between Intelligence and Memory.
You design a study that measures IQ for intelligence, and Cognitive Retention of Beatles Lyrics (i.e., Fill in the blank "I'll buy you a ________ ________ my friend if it makes you feel alright.").
Your hypothesis:
The more intelligent the respondent, the more Beatles lyrics he or she can remember.
To test this hypothesis, you select a random sample of dormitory residents at the UofA and provide them with a cassette tape with 30 songs on it. You ask respondents to listen to the tape once each day for seven days. At the end of the week you administer the test and measure their intelligence.
Here are the results in crosstab form:
                     Number of Beatles Lyrics Remembered
IQ Scores            0-15 lyrics                 Over 15 lyrics              Row Marginals
Under 100            Cell1 Exp.____ Obs. 25%     Cell2 Exp.____ Obs. 10%     35%
100-129              Cell3 Exp.____ Obs. 15%     Cell4 Exp.____ Obs. 20%     35%
Over 129             Cell5 Exp.____ Obs. 5%      Cell6 Exp.____ Obs. 25%     30%
Column Marginals     45%                         55%                         100%
a. Fill in the table above with Expected Values. Show your work!
b. What is the calculated/obtained Chi^{2} value?
c. Look up the expected/critical Chi^{2} value.
d. Summarize the nature of the relationship in a couple of sentences.
e. Generalize the findings.