analysing categorical data Flashcards
(11 cards)
how can you analyse categorical data?
create a contingency table
perform a chi-square test (do people fall into a category more often than we expect them to?)
what are contingency tables?
a table of frequencies showing how often observations occur in each category
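A minimal sketch of building such a table from raw observations. The data and the names `observations`, `rows`, `cols` and `table` are hypothetical and only illustrate the idea; each person appears exactly once.

```python
from collections import Counter

# Hypothetical raw data: one (group, outcome) pair per person.
observations = [
    ("A", "yes"), ("A", "no"), ("B", "yes"),
    ("A", "yes"), ("B", "no"), ("B", "no"),
]

# Count how many people fall into each combination of categories.
counts = Counter(observations)

# Sorted category labels give a stable row and column order.
rows = sorted({group for group, _ in observations})
cols = sorted({outcome for _, outcome in observations})

# Contingency table as a list of rows; each cell is a frequency.
table = [[counts[(r, c)] for c in cols] for r in rows]

print(cols)
for label, row in zip(rows, table):
    print(label, row)
```

Because every person contributes to exactly one cell, the cell counts sum to the number of observations.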
what must categories be in a contingency table?
mutually exclusive
exhaustive
what is a chi-square test?
devised by Karl Pearson in 1900, also known as Pearson’s chi-square
compares how often observations actually fall into each category with how often they would be expected to by chance
what is the null hypothesis in a chi-square test?
the observed frequencies are no different from those expected by chance
what is the alternative hypothesis in a chi-square test?
the frequencies observed reflect real differences in categories
what are the assumptions in a chi-square test?
independence - each person can only contribute to one cell of a contingency table
expected frequencies - all expected counts should be greater than 1 and no more than 20% of expected counts should be less than 5 - if violated, power reduced
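The expected frequencies rule above could be checked along these lines. This is a sketch only; the function name `expected_counts_ok` is hypothetical, and it assumes the expected counts are supplied as a list of rows.

```python
def expected_counts_ok(expected):
    """Check the rule: all expected counts > 1, and no more than 20% below 5."""
    cells = [e for row in expected for e in row]
    all_above_one = all(e > 1 for e in cells)
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return all_above_one and share_below_five <= 0.20


# Hypothetical expected counts for a 2 x 2 table.
print(expected_counts_ok([[25.0, 25.0], [25.0, 25.0]]))  # every cell is large
print(expected_counts_ok([[4.0, 10.0], [10.0, 10.0]]))   # 1 of 4 cells (25%) is below 5
```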
what are the options if expected frequencies assumption violated?
use an “exact” test instead (e.g. Fisher’s or MLR)
collapse/remove data across one variable
collapse levels of one variable
collect more data
accept the loss of power
what are the steps of chi-square by hand with one IV?
calculate expected frequencies
calculate chi-square value based on observed and expected frequencies
compare chi-square value against critical values table
to interpret the table, need to know degrees of freedom (number of categories - 1) and our desired alpha value
reject H0 when χ² observed > χ² critical
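The steps above can be sketched as follows. The counts are made up, the function name `chi_square_one_way` is hypothetical, and the sketch assumes H0 predicts equal frequencies in every category. The statistic is the sum over categories of (observed - expected)² / expected, and the critical value is copied from a standard chi-square table.

```python
def chi_square_one_way(observed):
    """Chi-square statistic for one variable, assuming equal expected counts."""
    n = sum(observed)
    k = len(observed)
    # Step 1: expected frequencies (equal split under H0, by assumption).
    expected = [n / k] * k
    # Step 2: chi-square value from observed and expected frequencies.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Degrees of freedom = number of categories - 1.
    df = k - 1
    return chi2, df, expected


# Hypothetical observed counts for three categories.
observed = [30, 50, 40]
chi2, df, expected = chi_square_one_way(observed)

# Step 3: compare with the critical value for df = 2, alpha = .05
# (5.991 in a standard chi-square table).
CRITICAL_DF2_ALPHA_05 = 5.991
reject_h0 = chi2 > CRITICAL_DF2_ALPHA_05

print(chi2, df, reject_h0)
```

With these made-up counts the chi-square value is below the critical value, so H0 is not rejected.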
how is chi-square with two IVs calculated?
with two IVs, the difference is in how the expected values are calculated
expected frequencies are calculated separately for each cell: (row total x column total) / grand total
degrees of freedom = (number of rows - 1) x (number of columns - 1)
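A sketch of the two-IV calculation, using a made-up 2 x 2 table. The function name `chi_square_two_way` is hypothetical; each cell's expected frequency is (row total x column total) / grand total.

```python
def chi_square_two_way(table):
    """Chi-square statistic, df and expected counts for a table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected frequency per cell: row total x column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom = (rows - 1) x (columns - 1).
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected


# Hypothetical observed counts: 2 rows x 2 columns.
table = [[20, 30], [30, 20]]
chi2, df, expected = chi_square_two_way(table)

# Critical value for df = 1, alpha = .05 (3.841 in a standard table).
reject_h0 = chi2 > 3.841
print(chi2, df, reject_h0)
```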
what is a binomial test?
compares observed and expected frequencies for variable with only two levels
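A rough sketch of an exact binomial test using only the standard library. The function name `binomial_test_two_sided` and the data are hypothetical. The two-sided p-value here follows one common convention, summing the probabilities of all outcomes that are no more likely than the observed one; other conventions exist.

```python
from math import comb


def binomial_test_two_sided(k, n, p=0.5):
    """Exact two-sided p-value for k successes in n trials under H0: P(success) = p."""
    # Probability of each possible number of successes under H0.
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small tolerance so ties with the observed outcome are included.
    threshold = pmf[k] * (1 + 1e-7)
    return sum(prob for prob in pmf if prob <= threshold)


# Hypothetical data: 8 of 10 people fall into the first of two categories,
# where chance alone would predict a 50/50 split.
p_value = binomial_test_two_sided(8, 10, p=0.5)
print(p_value)
```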