categorical data tests Flashcards

(9 cards)

1
Q

chi-square goodness of fit test

A
  • “tests whether data come from a specific categorical (multinomial) distribution”
  • The idea behind the chi-square goodness-of-fit test is to see if the sample comes from the population with the claimed distribution. Another way of looking at that is to ask if the frequency distribution fits a specific pattern.
  • if the observed χ² statistic is greater than the critical value (from chi-square tables, with df = number of categories − 1), reject H0
2
Q

contingency table

A
  • “a two-dimensional cross-tabulation of frequencies of occurrence for two categorical variables”
  • most common example: 2x2 table
  • chi-square tests of association/independence and homogeneity, McNemar’s test for paired data, kappa statistic all use contingency tables
3
Q

chi-square test of association or independence (bt categorical variables)

A
  • “used when a single random sample from a single population is obtained and two categorical variables are measured on each subject”; tests for an association, or the lack of one, between those two categorical variables
  • ex: random sample of college students: is there an association between gender and housing status? is one gender more likely to live on campus than the other?
  • H0: there is no association bt variables x & y
    H1: there is an association bt x & y
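A minimal sketch of the arithmetic behind this test, assuming a 2x2 table of made-up counts (gender by housing status). Under H0 the expected count in each cell is row total × column total / grand total. The function name and the data are illustrative only and do not come from the cards.

```python
def chi_square_independence(table):
    """Return (statistic, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under H0 (no association): row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical counts: rows = gender, columns = on campus / off campus
observed = [[30, 20],
            [20, 30]]
statistic, df, expected = chi_square_independence(observed)

CRITICAL_1DF_05 = 3.841  # chi-square table value for 1 df, alpha = 0.05
reject_h0 = statistic > CRITICAL_1DF_05
print(statistic, df, reject_h0)
```

The same arithmetic applies to the test of homogeneity. Only the sampling design and the wording of the hypotheses differ.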
4
Q

chi-square test of homogeneity (of populations)

A
  • “used when samples are obtained from two populations and a single categorical variable is measured”
  • are the populations the same across levels of the categorical variable?
  • ex: random sample of men, random sample of women: is the proportion of depression the same in the two populations?
  • H0: p1=p2 (population proportions in populations 1 & 2 are the same/no difference)
    H1: p1≠p2 (population proportions in populations 1 & 2 are not the same/are different)
5
Q

McNemar’s Test (for paired data)

A
  • looks for symmetry within a contingency table of categorical data from paired observations (is the probability of a pair being classified into one discordant cell the same as being classified into the other?)
  • ex: cases matched with controls in case-control studies, before and after data in the same individual
  • if the calculated χ² is greater than the critical value (from tables; 1 df for a 2x2 table), reject H0
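A rough sketch of the statistic, assuming a 2x2 table of paired outcomes in which b and c are the two discordant-pair counts (the pairs whose two members were classified differently). The uncorrected form is (b − c)² / (b + c). Some textbooks use a continuity-corrected form, shown here as an option. The counts are hypothetical.

```python
def mcnemar_statistic(b, c, corrected=False):
    """Chi-square statistic (1 df) from the two discordant-pair counts b and c."""
    diff = abs(b - c)
    if corrected:
        # Continuity-corrected variant: (|b - c| - 1)^2 / (b + c).
        # Clamping at 0 is a choice made for this sketch so that b == c gives 0.
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)


# Hypothetical discordant counts, e.g. changed yes->no vs. changed no->yes
b, c = 15, 5
stat = mcnemar_statistic(b, c)
stat_cc = mcnemar_statistic(b, c, corrected=True)

CRITICAL_1DF_05 = 3.841  # chi-square table value for 1 df, alpha = 0.05
print(stat, stat > CRITICAL_1DF_05)
print(stat_cc, stat_cc > CRITICAL_1DF_05)
```

Only the discordant pairs enter the statistic. Concordant pairs carry no information about asymmetry.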
6
Q

kappa statistic

A
  • when an association between two variables is already expected (e.g. the same variable measured twice), the kappa statistic measures the degree of agreement rather than testing whether an association exists
  • “used in reliability studies to quantify the reproducibility of the same variable measured twice”
  • “a function of the observed and expected concordance rates”

K > 0.75 excellent reproducibility
0.4 ≤ K ≤ 0.75 good reproducibility
0 ≤ K < 0.4 marginal reproducibility
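As an illustration of "a function of the observed and expected concordance rates", here is a small sketch assuming the usual form kappa = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from the table margins. The table is made up, and the label for negative kappa is not part of the scale above.

```python
def kappa_statistic(table):
    """Kappa for a square table: rows = first rating, columns = second rating."""
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed concordance: proportion of subjects on the diagonal
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Expected concordance under chance: sum of products of marginal proportions
    row_props = [sum(row) / n for row in table]
    col_props = [sum(col) / n for col in zip(*table)]
    p_expected = sum(r * c for r, c in zip(row_props, col_props))
    return (p_observed - p_expected) / (1 - p_expected)


def reproducibility(kappa):
    """Map kappa onto the scale given on the card."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.4:
        return "good"
    if kappa >= 0:
        return "marginal"
    return "below chance agreement"  # outside the card's scale


# Hypothetical counts: the same yes/no question asked twice of 100 people
ratings = [[40, 10],
           [10, 40]]
kappa = kappa_statistic(ratings)
print(round(kappa, 3), reproducibility(kappa))
```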

7
Q

Understand the basic structure of a chi-square test statistic

A

Squared difference b/w observed and expected counts divided by the expected count, summed over all categories (cells)
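Written as a formula, with O_i and E_i as assumed symbols for the observed and expected counts in category i, and k for the number of categories (cells):

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

Each term is non-negative, so large values of the statistic indicate poor agreement between the observed and expected counts.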

8
Q

The # expected is derived from:

A

the NULL hypothesis, in the following manner:

# expected = total sample size × proportion specified in the null

the expected counts are your calculations; the observed counts are the data

compare the observed and expected counts to see if they agree
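A short sketch of that calculation, assuming three categories with made-up observed counts and made-up null proportions. The function names are illustrative.

```python
def expected_counts(n, null_proportions):
    """Expected count per category = total sample size * null proportion."""
    return [n * p for p in null_proportions]


def chi_square_statistic(observed, expected):
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 subjects in 3 categories; H0 says 40% / 40% / 20%
observed = [30, 50, 20]
null_proportions = [0.4, 0.4, 0.2]

expected = expected_counts(sum(observed), null_proportions)
statistic = chi_square_statistic(observed, expected)
print(expected, statistic)
```

Here the expected counts work out to 40, 40 and 20, so the first two categories each contribute 10² / 40 = 2.5 and the third contributes nothing.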

9
Q

What do you do once the χ² test statistic has been calculated?

A

once χ² is calculated, we ask how large it is

we compare the observed χ² with the critical value from chi-square tables, using df = # of categories − 1 (e.g. 2 df for 3 categories) and alpha = 0.05

the table value depends on the df and on alpha (the type I error rate)

then you make your decision: reject H0 if the observed χ² is greater than the table value, otherwise fail to reject H0
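The decision step can be sketched as follows for the 2 df case. This relies on a property specific to 2 df: the upper-tail probability of a chi-square variable is exp(−x/2), so the critical value is −2·ln(alpha), about 5.99 at alpha = 0.05. For other df a table or a statistics library is needed. The statistic values used here are hypothetical.

```python
import math


def chi_square_critical_2df(alpha):
    """Critical value for 2 df only, from P(X > x) = exp(-x / 2)."""
    return -2.0 * math.log(alpha)


def decide(statistic, critical):
    """Reject H0 when the observed statistic exceeds the critical value."""
    return "reject H0" if statistic > critical else "fail to reject H0"


alpha = 0.05
critical = chi_square_critical_2df(alpha)

# Hypothetical observed statistics from a 3-category goodness-of-fit test
decision_small = decide(5.0, critical)
decision_large = decide(7.2, critical)
print(round(critical, 3), decision_small, decision_large)
```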
