Analysing categorical data (chi-squared) Flashcards
How do we analyse categorical data?
-Predict the category that someone falls into
-Create a contingency table and perform a chi-squared test on the data
What is a contingency table?
-Table of frequencies showing how often observations fall into each category (or combination of categories)
-Categories have to be mutually exclusive and exhaustive
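A minimal sketch of building a contingency table from raw observations. The data and the `contingency_table` helper are hypothetical, made up for illustration. Each person appears exactly once, so the categories stay mutually exclusive.

```python
from collections import Counter

# Hypothetical raw data: one (group, outcome) pair per person.
observations = [
    ("training", "pass"), ("training", "pass"), ("training", "fail"),
    ("no training", "pass"), ("no training", "fail"), ("no training", "fail"),
]

def contingency_table(pairs):
    """Count how many observations fall into each (row, column) cell."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Every cell is listed, including cells with zero observations.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = contingency_table(observations)
```

The cell counts add up to the number of people, because each person contributes to one cell only.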
Describe the use of a chi-squared test
-Compares how often observations fall into each category (observed frequencies) with how often they would be expected to by chance (expected frequencies)
What does the null hypothesis state in this situation?
-The observed frequencies are what would be expected by chance
What does the experimental hypothesis state in this situation?
-The observed frequencies reflect real differences between categories
What 2 assumptions does the chi-squared test make?
1: Independence
-Each person can only contribute to one cell of a contingency table
2: Expected frequencies
-All expected counts should be greater than 1 and no more than 20% of expected counts should be less than 5
-If violated, power can be lost
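The expected-frequencies rule can be checked mechanically. A rough sketch, where `expected_counts_ok` is a hypothetical helper name and the expected counts are assumed to be given as a list of rows:

```python
def expected_counts_ok(expected):
    """True if all expected counts exceed 1 and at most 20% are below 5."""
    cells = [e for row in expected for e in row]
    all_above_one = all(e > 1 for e in cells)
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return all_above_one and share_below_five <= 0.20

# Hypothetical expected counts: 1 of 6 cells (about 17%) is below 5, so the rule holds.
ok = expected_counts_ok([[3, 12, 8], [9, 10, 11]])
```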
What can we do to prevent loss of power?
-Use an exact test instead, e.g. Fisher's exact test or MLR
-Collapse/remove data across one variable
-Collapse levels of one variable
-Collect more data
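For a 2x2 table, Fisher's exact test can be sketched with the standard library alone. This is an illustrative implementation under the usual two-sided convention (sum the probabilities of all tables no more likely than the observed one), not a replacement for a statistics package. The function name and the data are hypothetical.

```python
from math import comb

def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Hypothetical small sample where the expected counts are too low for chi-squared.
p_value = fisher_exact_2x2([[3, 1], [1, 3]])
```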
What are the 3 steps for calculating a chi-squared test?
1: Calculate expected frequencies
2: Calculate chi-squared value based on observed and expected frequencies
3: Compare chi-squared value against critical value
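The three steps above can be sketched for a single variable with three categories. The observed counts are hypothetical, and the expected counts assume that every category is equally likely by chance. The critical value is the standard table value for 2 degrees of freedom at the 0.05 level.

```python
# Hypothetical observed counts for one variable with three categories.
observed = [30, 14, 16]

# Step 1: expected frequencies, assuming each category is equally likely by chance.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Step 2: chi-squared = sum of (observed - expected)^2 / expected.
chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Step 3: compare against the critical value for df = categories - 1 = 2, alpha = 0.05.
CRITICAL_VALUE_DF2 = 5.991
significant = chi_squared > CRITICAL_VALUE_DF2
```

Here the statistic is 7.6, which exceeds 5.991, so the observed frequencies differ from chance at the 0.05 level.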
What differs if you have 2 IVs?
-The difference lies in how the expected values are calculated
-Expected frequencies have to be calculated for each specific cell
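With two variables, the expected frequency of a cell is usually taken as row total × column total / grand total. A minimal sketch with hypothetical counts and helper names:

```python
def expected_frequencies(observed):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Hypothetical 2x2 table: rows are levels of one IV, columns are levels of the other.
observed = [[20, 10], [10, 20]]
expected = expected_frequencies(observed)
statistic = chi_squared_statistic(observed)

# Degrees of freedom = (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)
```

Every expected count here is 15 and the statistic is about 6.67 on 1 degree of freedom, above the standard 0.05 critical value of 3.841.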
What is a binomial test?
-Compares observed and expected frequencies for a variable with only 2 levels
-E.g. are there more people from the USA in the sample than we would expect by chance?
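A one-sided exact binomial test can be sketched as follows. The sample (15 of 20 people from the USA) and the chance proportion of 0.5 are assumptions made up for illustration, and `binomial_test_greater` is a hypothetical helper name.

```python
from math import comb

def binomial_test_greater(successes, n, p_chance):
    """P(X >= successes) when X follows Binomial(n, p_chance)."""
    return sum(
        comb(n, k) * p_chance ** k * (1 - p_chance) ** (n - k)
        for k in range(successes, n + 1)
    )

# Hypothetical sample: 15 of 20 people are from the USA; chance proportion is 0.5.
p_value = binomial_test_greater(15, 20, 0.5)
```

The p-value is about 0.021, so under these assumptions there are more people from the USA than chance would predict at the 0.05 level.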