categorical data tests Flashcards
chi-square goodness of fit test
- “tests whether data come from a specific categorical (multinomial) distribution”
- The idea behind the chi-square goodness-of-fit test is to see if the sample comes from the population with the claimed distribution. Another way of looking at that is to ask if the frequency distribution fits a specific pattern.
- if the calculated χ² statistic is greater than the critical value from the chi-square table, reject H0
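A minimal sketch of how this test could be run, assuming hypothetical counts for three categories and a claimed 25/25/50 split, using scipy.stats.chisquare (not part of the original card):

```python
from scipy.stats import chisquare

observed = [30, 20, 50]              # hypothetical observed counts (n = 100)
null_props = [0.25, 0.25, 0.50]      # proportions claimed under H0
expected = [sum(observed) * p for p in null_props]

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
# reject H0 at alpha = 0.05 if p < 0.05 (equivalently, if the statistic
# exceeds the chi-square critical value with df = 3 - 1 = 2)
```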
contingency table
- “a two-dimensional cross-tabulation of frequencies of occurrence for two categorical variables”
- most common example: 2x2 table
- chi-square tests of association/independence and homogeneity, McNemar’s test for paired data, kappa statistic all use contingency tables
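A quick illustration of the idea, assuming made-up gender and housing responses and using pandas.crosstab to build the 2x2 table (names and data are illustrative only):

```python
import pandas as pd

# made-up responses from one random sample of students
df = pd.DataFrame({
    "gender":  ["M", "M", "F", "F", "M", "F", "F", "M"],
    "housing": ["on-campus", "off-campus", "on-campus", "on-campus",
                "on-campus", "off-campus", "on-campus", "off-campus"],
})

table = pd.crosstab(df["gender"], df["housing"])
print(table)   # rows = gender, columns = housing status, cells = frequencies
```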
chi-square test of association or independence (between categorical variables)
- “used when a single random sample from a single population is obtained and two categorical variables are measured on each subject”; tests for an association, or lack thereof, between those two categorical variables
- ex: random sample of college students: is there an association between gender and housing status? is one gender more likely to live on campus than the other?
- H0: there is no association between variables X and Y
H1: there is an association between X and Y
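A minimal sketch of the housing example above, with made-up counts, using scipy.stats.chi2_contingency:

```python
import numpy as np
from scipy.stats import chi2_contingency

# rows = gender, columns = housing status (made-up counts)
table = np.array([[60, 40],    # men:   60 on campus, 40 off campus
                  [75, 25]])   # women: 75 on campus, 25 off campus

stat, p_value, df, expected = chi2_contingency(table)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.3f}")
# a small p-value (< 0.05) suggests an association between gender and housing
```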
chi-square test of homogeneity (of populations)
- “used when samples are obtained from two populations and a single categorical variable is measured”
- are the populations the same across levels of the categorical variable?
- ex: random sample of men, random sample of women: is the proportion of depression the same in the two populations?
- H0: p1=p2 (population proportions in populations 1 & 2 are the same/no difference)
H1: p1≠p2 (population proportions in populations 1 & 2 are not the same/are different)
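A sketch of the depression example with made-up counts; the arithmetic is the same chi-square computation as the independence test (so scipy.stats.chi2_contingency is used again), only the sampling design differs:

```python
import numpy as np
from scipy.stats import chi2_contingency

# one row per population sample: [depressed, not depressed] (made-up counts)
men = [20, 80]
women = [35, 65]

stat, p_value, df, expected = chi2_contingency(np.array([men, women]))
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
# a small p-value suggests the proportion of depression differs between populations
```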
McNemar’s Test (for paired data)
- looks for symmetry within a contingency table of categorical data from paired observations (is the probability of an observation being classified in one cell the same as being classified into another)
- ex: cases matched with controls in case-control studies, before and after data in the same individual
- if the calculated χ² is greater than the critical value from the chi-square table, reject H0
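A sketch of a before/after example with made-up paired counts, using statsmodels' mcnemar function (the discordant cells drive the test):

```python
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# paired before/after data (made-up counts); rows = before, columns = after
table = np.array([[30,  5],    # before yes: 30 stayed yes, 5 switched to no
                  [15, 50]])   # before no:  15 switched to yes, 50 stayed no

result = mcnemar(table, exact=False, correction=True)   # chi-square version
print(f"chi-square = {result.statistic:.2f}, p = {result.pvalue:.3f}")
# only the discordant cells (5 and 15) drive the test;
# reject H0 if the statistic exceeds the critical value (1 df) or p < alpha
```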
kappa statistic
- if an association between two variables is assumed, the kappa statistic can quantify the degree of agreement between them
- “used in reliability studies to quantify the reproducibility of the same variable measured twice”
- “a function of the observed and expected concordance rates”
K > 0.75: excellent reproducibility
0.4 ≤ K ≤ 0.75: good reproducibility
K < 0.4: marginal reproducibility
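A minimal sketch, assuming two hypothetical ratings of the same subjects and using sklearn's cohen_kappa_score:

```python
from sklearn.metrics import cohen_kappa_score

# the same variable measured twice on the same subjects (made-up labels)
rating_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
rating_2 = ["yes", "no",  "no", "no", "yes", "no", "yes", "yes"]

kappa = cohen_kappa_score(rating_1, rating_2)
print(f"kappa = {kappa:.2f}")   # 0.50 here: "good" by the cutoffs on this card
```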
Understand the basic structure of a chi-square test statistic
- squared difference between observed and expected counts, divided by the expected count, summed over all categories: χ² = Σ (O − E)² / E
- the expected counts are derived from the NULL hypothesis: expected count = total sample size × proportion specified in the null
- you calculate the expected counts, then compare them with the observed counts to see how well they agree
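A sketch of the calculation described above, with hypothetical counts and null proportions: expected counts come from the null, then the squared differences are summed.

```python
# expected counts from the null, then sum of (observed - expected)^2 / expected
observed   = [30, 20, 50]          # made-up counts, n = 100
null_props = [0.25, 0.25, 0.50]    # percentages specified in the null

n = sum(observed)
expected = [n * p for p in null_props]      # expected = total sample size * null %
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(expected, round(chi_sq, 2))           # [25.0, 25.0, 50.0] 2.0
```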
What do you do once the χ² test statistic is calculated?
- once χ² is calculated, ask how large it is
- compare the calculated χ² with the value from the chi-square table with df = (# of categories - 1) and α = 0.05 (e.g., 2 df for three categories)
- the table value depends on the df and α (the type I error rate)
- then make your decision: reject or fail to reject H0
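A sketch of the decision step, using scipy's chi2.ppf to look up the critical value instead of a printed table (the statistic and category count are carried over from the earlier sketch and are assumptions):

```python
from scipy.stats import chi2

chi_sq = 2.0               # calculated statistic (from the sketch above)
k = 3                      # number of categories (assumed)
df = k - 1                 # degrees of freedom
alpha = 0.05               # type I error rate

critical = chi2.ppf(1 - alpha, df)    # table value, about 5.99 for 2 df
print(f"critical value = {critical:.2f}")
print("reject H0" if chi_sq > critical else "fail to reject H0")
```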