8. Fitting probability models to frequency data Flashcards

1
Q

What does the chi-squared goodness-of-fit test do?

A

Compares observed frequency counts to the frequencies expected under the probability model stated by the null hypothesis

2
Q

Null hypothesis for the chi-squared test

A

The data come from a specified probability distribution

3
Q

Alternative hypothesis for the chi-squared test

A

The data do NOT come from the specified probability distribution.

4
Q

Test statistic for chi squared test

A

χ² = Σ (Observed - Expected)² / Expected, summed over all categories
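
The formula above can be sketched in a few lines of Python. The function name and the counts are made up for illustration (100 hypothetical observations in 4 categories, with a null hypothesis of equal proportions).

```python
def chi_squared_statistic(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts, not real data: the null hypothesis of equal
# proportions gives an expected frequency of 25 in each category.
observed = [30, 20, 25, 25]
expected = [25.0, 25.0, 25.0, 25.0]
chi_sq = chi_squared_statistic(observed, expected)
print(chi_sq)
```

Note that the expected frequencies are allowed to be fractions, while the observed counts are whole numbers.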

5
Q

Degrees of freedom Definition

A

The number of degrees of freedom of a test statistic specifies which distribution within a family (here, which chi-squared distribution) to use as the null distribution

6
Q

Degrees of freedom for chi squared equation

A

df = (Number of Categories) - (Number of Parameters estimated from the data) - 1
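
As a small sketch of this rule (the function name is hypothetical), with two illustrative cases: a model specified fully in advance, and a model where one parameter, such as a mean, had to be estimated from the data.

```python
def degrees_of_freedom(n_categories, n_estimated_params=0):
    """df = (number of categories) - (parameters estimated from data) - 1."""
    df = n_categories - n_estimated_params - 1
    if df < 1:
        raise ValueError("too few categories for this many estimated parameters")
    return df


# 4 categories, proportions fully specified by the null hypothesis.
df_fixed_model = degrees_of_freedom(4)

# 6 categories, one parameter estimated from the data.
df_fitted_model = degrees_of_freedom(6, n_estimated_params=1)
print(df_fixed_model, df_fitted_model)
```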

7
Q

P value

A

The probability of getting the observed value of the test statistic, or something even more extreme, if the null hypothesis is true

8
Q

What chi-squared value would result from a perfect match between observed and expected?

A

Zero, though a perfect match is very unlikely

9
Q

P value of chi squared in R

A

pchisq(chi_sq, df, lower.tail = FALSE), where chi_sq is the calculated test statistic and df is the degrees of freedom
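
For readers working outside R, the same upper-tail probability can be sketched in Python using only the standard library. The helper name `chi2_upper_tail` is made up here, and the approach (a recurrence on the upper incomplete gamma function) only works for whole-number df; it is an illustration, not a description of how R computes the value.

```python
import math


def chi2_upper_tail(chi_sq, df):
    """P(chi-squared with `df` degrees of freedom >= chi_sq), integer df >= 1."""
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive whole number")
    if chi_sq <= 0:
        return 1.0
    half = chi_sq / 2.0
    if df % 2 == 0:
        a = 1.0                              # closed form for df = 2
        q = math.exp(-half)
    else:
        a = 0.5                              # closed form for df = 1
        q = math.erfc(math.sqrt(half))
    # Recurrence: Q(a + 1, h) = Q(a, h) + h^a * e^(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Hypothetical test statistic of 2.0 with 3 degrees of freedom.
p_value = chi2_upper_tail(2.0, 3)
print(p_value)
```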

10
Q

P value of chi squared using statistical tables

A

The table gives the critical value for a given number of degrees of freedom and a given alpha

If you drew a line on the chi-squared distribution at the critical value for alpha = 0.05, and the null hypothesis were true, there would be a 5% chance that the calculated value falls to the right of the line and a 95% chance that it falls to the left

Find the critical value for your df and alpha

If the calculated value is greater than or equal to the critical value, then P <= alpha and you can reject the null hypothesis
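
The table lookup can be sketched as follows. The dictionary holds the standard critical values for alpha = 0.05 only, for df 1 to 6; the names and the example statistics are hypothetical.

```python
# Upper-tail critical values of the chi-squared distribution, alpha = 0.05.
CRITICAL_VALUES_ALPHA_05 = {
    1: 3.841,
    2: 5.991,
    3: 7.815,
    4: 9.488,
    5: 11.070,
    6: 12.592,
}


def reject_null(chi_sq, df):
    """True if the calculated value reaches the critical value for this df."""
    if df not in CRITICAL_VALUES_ALPHA_05:
        raise KeyError("df not in this small table; consult a full table")
    return chi_sq >= CRITICAL_VALUES_ALPHA_05[df]


# Hypothetical calculated values.
print(reject_null(4.0, 1))   # 4.0 vs 3.841
print(reject_null(2.0, 3))   # 2.0 vs 7.815
```

A table only tells you whether P is above or below alpha, not the exact P-value.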

11
Q

Critical value

A

The value of the test statistic at which P = alpha

The point at which you would start to reject the null hypothesis

12
Q

test statistic

A

A number calculated from the data and the null hypothesis that can be compared to a standard distribution to find the P-value of the test

13
Q

relationship between chi squared and binomial test

A

The chi-squared goodness-of-fit test can be used as an approximation to the binomial test when there are only two categories; it is especially useful when there are LOTS of data points
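
To see how close the approximation is, here is a minimal sketch comparing the two on made-up data (60 successes in 100 trials, null proportion 0.5). It assumes the two-tailed binomial P-value is taken as twice the smaller tail, which is one common convention; the function names are hypothetical.

```python
import math


def binomial_two_tailed_p(successes, n, p0):
    """Exact binomial P-value: twice the smaller tail, capped at 1."""
    pmf = [math.comb(n, k) * p0 ** k * (1 - p0) ** (n - k) for k in range(n + 1)]
    upper = sum(pmf[successes:])        # Pr[X >= successes]
    lower = sum(pmf[:successes + 1])    # Pr[X <= successes]
    return min(1.0, 2 * min(upper, lower))


def chi_squared_p_two_categories(successes, n, p0):
    """Chi-squared goodness-of-fit P-value for two categories (df = 1)."""
    observed = [successes, n - successes]
    expected = [n * p0, n * (1 - p0)]
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With df = 1 the upper tail has the closed form erfc(sqrt(chi_sq / 2)).
    return math.erfc(math.sqrt(chi_sq / 2))


exact_p = binomial_two_tailed_p(60, 100, 0.5)
approx_p = chi_squared_p_two_categories(60, 100, 0.5)
print(exact_p, approx_p)
```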

14
Q

Assumptions of the chi squared test

A

No more than 20% of categories have EXPECTED frequencies less than 5

NO categories have an EXPECTED frequency less than 1

The approximation used in the chi-squared test doesn’t work when expected values are too small

These rules are about EXPECTED frequencies, not observed ones

If expected values are only JUST over 5 and there are two categories, it is probably better to do the binomial test instead
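
A quick way to check these rules, as a sketch with a hypothetical function name. It treats an expected frequency below 1 as a violation; adjust the comparison if your course states the threshold differently.

```python
def check_expected_counts(expected):
    """Check the chi-squared rules of thumb on EXPECTED frequencies."""
    n_below_5 = sum(1 for e in expected if e < 5)
    fraction_below_5 = n_below_5 / len(expected)
    any_below_1 = any(e < 1 for e in expected)
    return {
        "fraction_below_5": fraction_below_5,
        "any_below_1": any_below_1,
        "ok": fraction_below_5 <= 0.20 and not any_below_1,
    }


# Hypothetical expected frequencies for three different models.
print(check_expected_counts([10, 10, 10, 10, 4]))        # one of five below 5
print(check_expected_counts([10, 4, 3]))                 # two of three below 5
print(check_expected_counts([10, 10, 10, 10, 10, 0.5]))  # one value below 1
```

If the check fails, the usual remedy is to combine neighbouring categories and check again.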

15
Q

Discrete distribution

A

probability distribution describing a discrete numerical random variable

16
Q

Poisson distribution

A

Describes the probability that a certain number of events occur in a block of time or space, when those events happen independently of each other and occur with equal probability at every point in time or space

In other words, how many times something happens in one unit of time or space

Ex. divide an area into square meters and count the flowers within each square

17
Q

Assumptions of Poisson distribution

A

Events must happen independently of each other

Events must occur with equal probability at every point in time or space

18
Q

In what ways could the Poisson distribution be interesting?

A

To check whether events are occurring independently and with equal probability at every point in time or space

19
Q

Poisson distribution formula

A

Pr[X] = ( e^(-μ) * μ^X ) / X!

X is the number of events and can be any whole number from 0 to infinity

μ (mu) is the mean number of events per unit of time or space
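
The formula translates directly into code. A minimal sketch, with a hypothetical function name and a made-up mean of 2 events per unit:

```python
import math


def poisson_pmf(x, mu):
    """Pr[X = x] = e^(-mu) * mu^x / x! for a whole number x >= 0."""
    if x < 0 or x != int(x):
        raise ValueError("x must be a whole number >= 0")
    return math.exp(-mu) * mu ** x / math.factorial(x)


# Probabilities of 0 to 5 events when the mean is 2 events per unit.
mu = 2.0
for x in range(6):
    print(x, poisson_pmf(x, mu))
```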

19
Q

How to handle very high values of X (up to infinity) in a Poisson distribution

A

In practice you rarely need the probability of each high value separately. Calculate the probabilities of the specific lower values exactly, then get the probability of any greater value as 1 - (sum of the probabilities of the smaller values)
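
A sketch of that "1 minus the smaller values" trick, with hypothetical function names and a made-up mean:

```python
import math


def poisson_pmf(x, mu):
    """Pr[X = x] for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu ** x / math.factorial(x)


def poisson_at_least(k, mu):
    """Pr[X >= k] = 1 - (Pr[0] + Pr[1] + ... + Pr[k - 1])."""
    return 1.0 - sum(poisson_pmf(x, mu) for x in range(k))


# With a mean of 2 events per unit, lump "5 or more" into one category.
mu = 2.0
p_five_or_more = poisson_at_least(5, mu)
print(p_five_or_more)
```

This keeps the list of categories finite while the probabilities still add up to 1.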

20
Q

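Process of fitting a Poisson distribution with a chi-squared goodness-of-fit test

A
  1. Determine the mean number of events in the given unit of time/space

ex. for number of goals per team per game in the World Cup,

x bar = (Sum of [# of goals * # of times that number of goals was scored]) / (number of games played * 2)

(the * 2 is there because each game contributes two team scores)

  2. Calculate Pr[X] for each value you’re interested in
  3. Multiply each Pr[X] by the total number of observations to get the expected frequency for each category
  4. Check the expected frequencies against the chi-squared assumptions (the thresholds of 1 and 5). If they are violated, combine categories
  5. Calculate (Observed - Expected)^2 / Expected for each category and sum these to get the chi-squared value
  6. Determine the number of degrees of freedom (df = (number of categories) - (number of parameters estimated from the data) - 1); here one parameter, the mean, is estimated from the data
  7. Identify the critical value for that df and alpha
  8. Compare to the critical value: if chi squared is equal to or GREATER than the critical value, can reject the null hypothesis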
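
The whole process can be sketched end to end. Everything below is an assumption for illustration: the counts are made up (not World Cup data), the helper names are hypothetical, the category-combining rule is simplified to merging the top category downward while its expected frequency is below 5, and the last step compares a P-value to alpha rather than looking up a critical value (the two give the same decision).

```python
import math


def poisson_pmf(x, mu):
    return math.exp(-mu) * mu ** x / math.factorial(x)


def chi2_upper_tail(chi_sq, df):
    """Upper-tail chi-squared probability for whole-number df >= 1."""
    if chi_sq <= 0:
        return 1.0
    half = chi_sq / 2.0
    a, q = (1.0, math.exp(-half)) if df % 2 == 0 else (0.5, math.erfc(math.sqrt(half)))
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Hypothetical data: number of events -> how many units had that many events.
counts = {0: 20, 1: 30, 2: 25, 3: 15, 4: 7, 5: 3}
n = sum(counts.values())

# Step 1: mean number of events per unit.
mean = sum(x * c for x, c in counts.items()) / n

# Step 2: Pr[X] for each value; the top category becomes "max or more".
max_x = max(counts)
probs = [poisson_pmf(x, mean) for x in range(max_x)]
probs.append(1.0 - sum(probs))

# Step 3: expected frequencies.
expected = [p * n for p in probs]
observed = [counts[x] for x in range(max_x + 1)]

# Step 4 (simplified): merge the top category into its neighbour while
# its expected frequency is below 5.
while len(expected) > 2 and expected[-1] < 5:
    last_expected = expected.pop()
    last_observed = observed.pop()
    expected[-1] += last_expected
    observed[-1] += last_observed

# Step 5: chi-squared value.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Step 6: one parameter (the mean) was estimated from the data.
df = len(expected) - 1 - 1

# Steps 7 and 8: decision at alpha = 0.05.
alpha = 0.05
p_value = chi2_upper_tail(chi_sq, df)
reject_null = p_value <= alpha
print(mean, df, chi_sq, p_value, reject_null)
```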