hypothesis testing chi squared Flashcards Preview


what is a parametric test

a test that estimates at least one population parameter
from sample statistics





what is the assumption made with parametric tests

the variable we have measured in the
sample is normally distributed in the population to
which we plan to generalize our findings


what is a non parametric test 

a test that is distribution free,

no assumption on the distribution of the variable in the population


what does the choice of a statistical test depend on? (5)

the Level of measurement for the dependent and
independent variables

Number of groups or dependent measures

Number of units of observation

Type of distribution

The population parameter of interest (mean, variance,
differences between means and/or variances)


examples of parametric and non parametric tests


define a normality test

measures the goodness of fit of a normal model to the data

  • if the fit is poor, the data are not well modeled by
    a normal distribution; this makes no judgment about
    any underlying variable.


what is a normality test used for

to determine whether a data set is well modeled by a normal distribution, and to compute how likely it is that a random variable underlying the data set is normally distributed.


graphical methods of normality tests 

comparing a histogram of the sample data (the empirical distribution) to a normal probability curve

it should resemble a bell curve


list the tests of univariate normality 

D'Agostino's K-squared test

Jarque–Bera test

Anderson–Darling test

Cramér–von Mises criterion

Lilliefors test

Kolmogorov–Smirnov test

Shapiro–Wilk test


what is the Kolmogorov–Smirnov test

a nonparametric test of the equality of distributions that can be used to

  • compare a sample with a reference distribution (1-sample K–S test)
  • compare 2 samples (2-sample K–S test)

it quantifies the distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution (1-sample case), or between the empirical distribution functions of the 2 samples (2-sample case)

null hypothesis:

  • the sample is drawn from the reference distribution (1-sample case)
  • the samples are drawn from the same distribution (2-sample case)
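
The "distance" in the 2-sample case is the largest vertical gap between the two empirical distribution functions. A minimal sketch of that statistic follows, using only the standard library; the function name `ks_two_sample_statistic` is made up for illustration, and the sketch computes the statistic only, not a p-value.

```python
from bisect import bisect_right

def ks_two_sample_statistic(xs, ys):
    """Largest absolute gap between the empirical CDFs of xs and ys."""
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    # Both empirical CDFs are step functions that only jump at data points,
    # so it is enough to compare them at every pooled observation.
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)  # share of xs that is <= v
        fy = bisect_right(ys, v) / len(ys)  # share of ys that is <= v
        d = max(d, abs(fx - fy))
    return d

print(ks_two_sample_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```

A statistic near 0 means the samples look alike; a statistic near 1 means they barely overlap.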


K-S for testing normality of distributions

  • samples are standardized and compared with a standard normal distribution
  • this is equivalent to setting the mean and variance of the reference
    distribution equal to the sample estimates
  • using the sample estimates to define the reference
    distribution changes the null distribution of the test statistic
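
The standardize-then-compare step can be sketched as below. This is an illustrative sketch, not a full test: the names `normal_cdf` and `ks_normality_statistic` are hypothetical, and only the statistic is computed. Because the mean and standard deviation come from the sample, ordinary K-S critical values do not apply; the Lilliefors test uses adjusted ones.

```python
import math
from statistics import mean, stdev

def normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ks_normality_statistic(sample):
    """K-S distance between the standardized sample and N(0, 1)."""
    m, s = mean(sample), stdev(sample)  # sample estimates define the reference
    z = sorted((x - m) / s for x in sample)
    n = len(z)
    d = 0.0
    for i, zi in enumerate(z):
        f = normal_cdf(zi)
        # The empirical CDF jumps from i/n to (i+1)/n at zi; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d
```

Shifting or rescaling the data leaves the statistic unchanged, since the sample is standardized first.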



what is the chi-squared test

a test used to check for an association
between 2 categorical variables.


  • H0: There is no association between the variables.
  • HA: There is an association between the variables



what does it mean if two categorical variables are associated in a chi-squared test

the chance that an individual falls into a particular category for one variable depends upon the particular category they fall into for the other variable.


assumptions for the chi squared test 

  • A large sample of independent observations

  • All expected counts should be ≥ 1 (no zeros)

  • At least 80% of expected counts should be ≥ 5
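
The two expected-count rules can be checked mechanically. The sketch below assumes the data are a contingency table given as a list of rows of observed counts; the function names are hypothetical. Expected counts are computed as (row total × column total) / grand total, which is what independence implies.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def meets_expected_count_rules(table):
    """True if every expected count is >= 1 and at least 80% are >= 5."""
    flat = [e for row in expected_counts(table) for e in row]
    all_at_least_one = all(e >= 1 for e in flat)
    share_at_least_five = sum(e >= 5 for e in flat) / len(flat)
    return all_at_least_one and share_at_least_five >= 0.8

print(meets_expected_count_rules([[20, 30], [30, 20]]))  # True
print(meets_expected_count_rules([[1, 2], [3, 40]]))     # False
```

The check does not cover the first assumption (independent observations), which depends on how the data were collected.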


define the chi-squared test statistic

a test statistic that measures the difference between the observed and the expected counts, where the expected counts are computed assuming independence.


  • a large chi-squared value is evidence against the null hypothesis, because the observed counts differ from the expected counts
  • the p-value is the probability that the chi-squared statistic is as large as or larger than the value we obtained, if H0 is true.
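
As a worked sketch, the statistic is the sum of (O − E)² / E over all cells, with (rows − 1) × (columns − 1) degrees of freedom. The table values and function names below are invented for illustration. The p-value helper is valid only for 1 degree of freedom, where it reduces to the complementary error function; other degrees of freedom need a chi-squared distribution function from a statistics library.

```python
import math

def chi_squared_statistic(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # assumes independence
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

def p_value_df1(stat):
    """Upper-tail probability of a chi-squared variable with 1 df only."""
    return math.erfc(math.sqrt(stat / 2.0))

stat, df = chi_squared_statistic([[30, 10], [20, 40]])  # made-up counts
print(round(stat, 3), df)  # 16.667 1
print(p_value_df1(stat) < 0.001)  # True
```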


why is association not causation

observed association between two
variables might be due to the action of
a third, unobserved variable.


limitations of chi squared

  • No expected count should be less than 1
  • No more than 1/5 of the expected counts should be less than 5
  • to fix this:
    • collect larger samples
    • combine the categories with small expected counts until each combined expected count is 5 or more


what is the Yates Correction

  • When there is only 1 degree of freedom (e.g. a 2 x 2 table), the regular chi-squared test should not be used

  • Apply the Yates correction by

    • subtracting 0.5 from the absolute value of each calculated O-E term,

    • then continue as usual with the new corrected values
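
The correction described above can be sketched for a 2 x 2 table as follows. The function name and the example counts are invented for illustration, and the code follows the card literally (it subtracts 0.5 from every |O − E|); some implementations additionally cap the adjustment so it never exceeds |O − E|.

```python
def yates_chi_squared(table):
    """Chi-squared statistic with Yates correction for a 2 x 2 table."""
    assert len(table) == 2 and all(len(row) == 2 for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            # Subtract 0.5 from |O - E| before squaring, as described above.
            stat += (abs(table[i][j] - expected) - 0.5) ** 2 / expected
    return stat

print(round(yates_chi_squared([[30, 10], [20, 40]]), 3))  # 15.042
```

For the same made-up table the uncorrected statistic is about 16.667, so the correction makes the test more conservative.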


what is Fisher's exact test

  • computes the exact probability under the null
    of obtaining the current distribution of frequencies across cells, or one that is more uneven.
  • most often applied to 2 x 2 tables; many software packages offer it only for that case.
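
For a 2 x 2 table with fixed margins, the probability of each possible table follows a hypergeometric distribution. The sketch below is one way to turn the description above into code, treating "more uneven" as "no more probable than the observed table" (a common two-sided convention, assumed here rather than taken from the card). The function name and counts are illustrative.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        # when all row and column totals are held fixed.
        return comb(r1, k) * comb(r2, c1 - k) / comb(n, c1)

    p_observed = prob(a)
    lowest, highest = max(0, c1 - r2), min(r1, c1)
    # Add up every table that is at most as probable as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

print(round(fisher_exact_two_sided([[3, 1], [1, 3]]), 4))  # 0.4857
```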


Mann-Whitney test



  • observations from both groups are combined and ranked,

  • with the average rank assigned in the case of ties

  • If the populations are identical in location, the ranks should be
    randomly mixed between the two samples.


null hypothesis of Mann-Whitney = the two sampled populations are equivalent in location (they have the same mean ranks).
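
The combine-and-rank procedure can be sketched as below. The helper names are hypothetical, and the sketch returns only the two U statistics (U1 = R1 − n1(n1 + 1)/2, where R1 is the rank sum of the first group); turning U into a p-value needs tables or a normal approximation, which is left out.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def mann_whitney_u(xs, ys):
    """Return (U1, U2) for two independent samples."""
    ranks = average_ranks(list(xs) + list(ys))  # rank the pooled observations
    rank_sum_x = sum(ranks[:len(xs)])
    u1 = rank_sum_x - len(xs) * (len(xs) + 1) / 2
    u2 = len(xs) * len(ys) - u1
    return u1, u2

print(average_ranks([10, 20, 20, 30]))       # [1.0, 2.5, 2.5, 4.0]
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # (0.0, 9.0)
```

When the ranks are well mixed between the samples, U1 and U2 are both close to n1 × n2 / 2; a value near 0 means one sample sits almost entirely below the other.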


Kruskal-Wallis test for ordinal data independent samples 

  • observations from all groups are combined and ranked,

  • the average rank is assigned in the case of ties.

  • If the populations are identical in location, the ranks
    should be randomly mixed between the K samples

null hypothesis for Kruskal-Wallis = K sampled populations are equivalent in location.
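
A minimal sketch of the Kruskal-Wallis H statistic, H = 12 / (N(N + 1)) × Σ(Rj² / nj) − 3(N + 1), where Rj is the rank sum of group j and N is the total number of observations. The function names are hypothetical, and the tie correction factor that full implementations apply is omitted here.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def kruskal_wallis_h(groups):
    """H statistic for K independent samples (no tie correction)."""
    pooled = [v for group in groups for v in group]
    ranks = average_ranks(pooled)  # rank all observations together
    n = len(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

print(round(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), 3))  # 7.2
print(round(kruskal_wallis_h([[1, 4, 7], [2, 5, 8], [3, 6, 9]]), 3))  # 0.8
```

Fully separated groups give a large H, while well-mixed groups give an H near 0.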


Wilcoxon signed rank test for ordinal data, 2 related samples

  • Two related variables.
  • No assumptions about the shape of distributions of the variables.
  • Takes into account information about the magnitude of differences within pairs
  • gives more weight to pairs that show large differences than to pairs that show small differences.
  • Based on the ranks of the absolute values of the
    differences between the two variables.

null hypothesis for Wilcoxon signed rank = Two variables have the same distribution.
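
The ranking of absolute differences can be sketched as follows. The function name is hypothetical; pairs with a zero difference are dropped, which is a common convention assumed here rather than something stated in the card. The sketch returns the positive and negative rank sums, and the smaller of the two is usually taken as the test statistic.

```python
def wilcoxon_signed_rank(xs, ys):
    """Return (W_plus, W_minus) for paired samples xs and ys."""
    # Within-pair differences; zero differences are dropped (assumption).
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    abs_sorted = sorted(abs(d) for d in diffs)

    def avg_rank(v):
        # Average rank of |difference| v, counting ties.
        first = abs_sorted.index(v) + 1
        return first + (abs_sorted.count(v) - 1) / 2

    w_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus

print(wilcoxon_signed_rank([5, 7, 9, 4], [6, 5, 12, 4]))  # (2.0, 4.0)
```

Larger differences receive larger ranks, which is how the test gives more weight to pairs that differ a lot.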