8. Fitting probability models to frequency data Flashcards

Question 1

Q

What does the chi-squared goodness-of-fit test do?

Answer

A

Compares observed frequency counts with the counts expected under a probability distribution

Question 2

Q

Null hypothesis for the chi-squared test

Answer

A

The data come from a specified probability distribution

Question 3

Q

Alternative hypothesis for the chi-squared test

Answer

A

The data do NOT come from a specified probability distribution.

Question 4

Q

Test statistic for the chi-squared test

Answer

A

χ² = Σ (Observed − Expected)² / Expected, summed over all categories
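
As a minimal sketch of this formula in Python (the counts are made up for illustration, and `chi_squared_statistic` is just an illustrative name), the statistic is the sum of each category's contribution:

```python
def chi_squared_statistic(observed, expected):
    """Sum (Observed - Expected)^2 / Expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("need one observed and one expected count per category")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: 100 observations, two categories, 1:1 ratio expected.
observed = [60, 40]
expected = [50, 50]
chi2 = chi_squared_statistic(observed, expected)
print(chi2)  # 4.0
```

Each category contributes (10)² / 50 = 2 here, so the two categories together give 4.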

Question 5

Q

Degrees of freedom Definition

Answer

A

The number of degrees of freedom of a test specifies which distribution within a family (for example, which chi-squared distribution) to use as the null distribution

Question 6

Q

Degrees of freedom formula for the chi-squared test

Answer

A

df = (Number of Categories) - (Number of Parameters estimated from the data) - 1

Question 7

Q

P-value

Answer

A

The probability of getting the observed value of the test statistic, or something even more extreme, if the null hypothesis is true

Question 8

Q

What chi-squared value would result from a perfect match between observed and expected counts?

Answer

A

0 (zero), though a perfect match is very unlikely

Question 9

Q

P-value of a chi-squared statistic in R

Answer

A

pchisq(chisq_value, df = degrees_of_freedom, lower.tail = FALSE)
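
For readers without R to hand, here is a rough Python sketch of the quantity that `pchisq(..., lower.tail = FALSE)` returns: the area of the chi-squared distribution to the right of the calculated value. It assumes a whole-number df and uses only the standard library; `chi2_upper_tail` is an illustrative name, not a library function.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability Pr[chi-squared with `df` degrees of freedom >= x].

    Assumes df is a positive whole number. Uses the recurrence
    Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1) with y = x / 2,
    starting from Q(1/2, y) = erfc(sqrt(y)) or Q(1, y) = exp(-y).
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# A chi-squared value of 4.0 with 1 degree of freedom gives P of roughly 0.0455.
print(chi2_upper_tail(4.0, 1))
```

The same inputs in R would be `pchisq(4.0, df = 1, lower.tail = FALSE)`.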

Question 10

Q

P-value of chi-squared using statistical tables

Answer

A

A table does not give the exact P-value; it gives the critical value for a given number of degrees of freedom and a given alpha

Find the critical value for your degrees of freedom and alpha

If you drew a line on the chi-squared distribution at the critical value for alpha = 0.05, then if the null hypothesis were true there would be a 5% chance that the calculated value falls to the right of the line and a 95% chance that it falls to the left

If the calculated value is greater than or equal to the critical value, then P is at most alpha and you can reject the null hypothesis
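
The table lookup can be sketched in Python as below. The dictionary holds the standard alpha = 0.05 critical values for 1 to 5 degrees of freedom, rounded to two decimals as in a printed table; `CRITICAL_05` and `reject_null` are illustrative names.

```python
# Critical values of the chi-squared distribution for alpha = 0.05,
# keyed by degrees of freedom (rounded as in a printed table).
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49, 5: 11.07}


def reject_null(chi2, df):
    """Return True if the calculated value reaches the critical value."""
    critical = CRITICAL_05[df]
    return chi2 >= critical


print(reject_null(4.0, 1))  # True: 4.0 is to the right of 3.84
print(reject_null(4.0, 2))  # False: 4.0 is to the left of 5.99
```

The same calculated value can lead to different conclusions depending on the degrees of freedom, which is why df must be worked out before consulting the table.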

Question 11

Q

Critical value

Answer

A

The value of the test statistic at which P = alpha

It marks where you would start to reject the null hypothesis

Question 12

Q

Test statistic

Answer

A

A number calculated from the data and the null hypothesis that can be compared to a standard distribution to find the P-value of the test

Question 13

Q

Relationship between the chi-squared test and the binomial test

Answer

A

The chi-squared goodness-of-fit test can be used as an approximation of the binomial test. It works even with only two categories, and is especially useful when there are a lot of data points

Question 14

Q

Assumptions of the chi-squared test

Answer

A

No more than 20% of categories may have an EXPECTED count below 5

No category may have an EXPECTED count below 1

The approximation used in the chi-squared test breaks down when expected counts are too small

These rules are about EXPECTED counts, not observed counts

If the expected counts are only JUST over 5 (and there are only two categories), it is probably better to do a binomial test instead
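
A small Python sketch of this check, using the common textbook form of the rules (no expected count below 1, and no more than 20% of expected counts below 5). The function name and the example counts are made up for illustration.

```python
def expected_counts_ok(expected):
    """Check the chi-squared rules of thumb on EXPECTED (not observed) counts."""
    n_categories = len(expected)
    below_five = sum(1 for e in expected if e < 5)
    below_one = sum(1 for e in expected if e < 1)
    return below_one == 0 and below_five / n_categories <= 0.20


print(expected_counts_ok([10, 8, 6, 4.5, 7.5]))   # True: 1 of 5 categories is below 5
print(expected_counts_ok([10, 4, 3, 20]))         # False: 2 of 4 categories are below 5
print(expected_counts_ok([0.5, 20, 30, 40, 50]))  # False: one category is below 1
```

When the check fails, the usual remedy is to combine neighbouring categories and recompute the expected counts.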

Question 15

Q

Discrete distribution

Answer

A

probability distribution describing a discrete numerical random variable

Question 16

Q

Poisson distribution

Answer

A

Describes the probability that a certain number of events occur in a block of time or space, when those events happen independently of each other and occur with equal probability at every point in time or space

In other words: how many times something happens per unit of time or space

Example: divide a field into square meters and count the flowers in each square

Question 17

Q

Assumptions of the Poisson distribution

Answer

A

Events must happen independently of each other

Events must occur with equal probability at every point in time or space

Question 18

Q

In what ways could the Poisson distribution be interesting?

Answer

A

To check whether events are occurring independently and with equal probability at every point in time or space

Question 19

Q

Poisson distribution formula

Answer

A

Pr[X] = (e^(-μ) × μ^X) / X!

X is the number of events; it can be any whole number from 0 to infinity

μ (mu) is the mean number of events per unit of time or space
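
The formula translates directly into Python. This is a minimal sketch using only the standard library; `poisson_probability` is an illustrative name and the mean of 2 events per unit is made up.

```python
import math


def poisson_probability(x, mu):
    """Pr[X = x] for a Poisson distribution with mean mu (x is a whole number >= 0)."""
    return math.exp(-mu) * mu ** x / math.factorial(x)


# Hypothetical mean of 2 events per unit: probabilities of seeing 0 to 4 events.
for x in range(5):
    print(x, round(poisson_probability(x, 2.0), 4))
```

Summed over every possible X, these probabilities add up to 1.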

Question 20

Q

How to account for high values in the Poisson distribution, when X could be infinite or a very high number

Answer

A

In practice very high values are rarely needed. Calculate the probabilities of the specific low values exactly, then get the probability of any greater value as 1 minus the sum of the probabilities of those smaller values
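
A minimal Python sketch of this "1 minus the smaller values" trick, with illustrative function names and a made-up mean:

```python
import math


def poisson_probability(x, mu):
    """Pr[X = x] for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu ** x / math.factorial(x)


def poisson_at_least(k, mu):
    """Pr[X >= k], computed as 1 minus the probabilities of 0, 1, ..., k - 1."""
    return 1.0 - sum(poisson_probability(x, mu) for x in range(k))


# Hypothetical mean of 2 events per unit: probability of 5 or more events.
print(poisson_at_least(5, 2.0))
```

This gives the probability for an open-ended category such as "5 or more" without ever summing toward infinity.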

Question 21

Q

Process of testing the fit of a Poisson distribution with the chi-squared test

Answer

A

Determine the mean number of events per unit of time or space

Example: for the number of goals per team per game in the World Cup,

x bar = (sum of [number of goals × number of times that many goals were scored]) / (number of games played × 2)

Calculate Pr[X] for each value of X you are interested in

Multiply each Pr[X] by the total number of observations to get the expected count for each category

Check the expected counts: if more than 20% of categories are below 5, or any is below 1, combine categories

Calculate (Observed − Expected)² / Expected for each category and sum them to get the chi-squared value

Determine the number of degrees of freedom: df = (number of categories) − (number of parameters estimated from the data) − 1. Estimating the mean from the data counts as one parameter

Identify the critical value for that df and the chosen alpha

Compare with the critical value: if chi-squared is equal to or GREATER than the critical value, you can reject the null hypothesis
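
The whole process can be sketched end to end in Python. Everything below is illustrative: the counts are hypothetical (not the World Cup data), the choice to pool the top categories into "4 or more" is an assumption made so that all expected counts are well above 5, and the critical value 7.81 is the standard table value for df = 3 at alpha = 0.05.

```python
import math

# Hypothetical data: how many units (e.g. quadrats) contained X events.
observed_counts = {0: 14, 1: 27, 2: 27, 3: 18, 4: 9, 5: 4, 6: 1}

# Mean number of events per unit = total events / total units.
n_units = sum(observed_counts.values())
mean = sum(x * f for x, f in observed_counts.items()) / n_units


def poisson_probability(x, mu):
    """Pr[X = x] for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu ** x / math.factorial(x)


# Categories 0, 1, 2, 3 plus a pooled "4 or more" category.
last = 4
probs = [poisson_probability(x, mean) for x in range(last)]
probs.append(1.0 - sum(probs))  # Pr[X >= 4] as 1 minus the smaller values

# Expected count = probability * total number of observations.
expected = [p * n_units for p in probs]
observed = [observed_counts.get(x, 0) for x in range(last)]
observed.append(sum(f for x, f in observed_counts.items() if x >= last))

# Chi-squared: sum of each category's (O - E)^2 / E.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# df = categories - parameters estimated from the data (the mean) - 1.
df = len(observed) - 1 - 1

# Critical value for df = 3 and alpha = 0.05, taken from a statistical table.
critical = 7.81
reject = chi2 >= critical

print("mean:", mean)
print("expected:", [round(e, 2) for e in expected])
print("chi-squared:", round(chi2, 3), "df:", df, "reject null:", reject)
```

With these made-up counts the observed and expected values are very close, so the chi-squared value is far below the critical value and the null hypothesis of a Poisson distribution is not rejected.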

(21 cards)