# MIXED VOCAB FOR WINTER Flashcards

1
Q

Minimum sample size for means?

A

If population is normalish, then there is no minimum sample size. If it is skewed or bimodal or any other non-normal distribution, then n>30.

2
Q

What is a sampling distribution?

A

A pile of statistics taken from many many many samples

3
Q

What are the sample size requirements for inference for both means and proportions?

A

1. [BOTH] You need a random sample. 2. [BOTH] The sample is less than 10% of the population. 3. [DIFFER] PROPS: np>10 and nq>10; MEANS: n>30 unless from a normal population, then no minimum.

4
Q

What is formula for nCr ?

A

n! / ( r! (n-r)! )
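As a quick check of the formula, here is a minimal Python sketch. The helper name `n_choose_r` is made up for this card; `math.comb` is the standard library's built-in version of the same count.

```python
import math

def n_choose_r(n: int, r: int) -> int:
    """Count the ways to choose r items from n using n! / (r! (n-r)!)."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# Cross-check the hand-written formula against the built-in.
print(n_choose_r(5, 2))  # 10
print(math.comb(5, 2))   # 10
```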

5
Q

Describe independence and association with categorical examples.

A

Grade and pizza preference are independent; gender and gaming status are associated.

6
Q

How do you describe an association between two quantitative variables? (scatter plot)

A

DIRECTION (pos/neg) FORM (linear,curved) STRENGTH (strong, moderate, report “r” value)

7
Q

Describe independence and association with quantitative examples.

A

Height and IQ are independent. Height and weight are associated.

8
Q

function to find a percentile in normal model?

A

invNorm
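Away from the calculator, the same lookup can be sketched in Python with `statistics.NormalDist`, whose `inv_cdf` method plays the role of the inverse-normal function. The test-score mean of 500 and SD of 100 are made-up numbers for illustration.

```python
from statistics import NormalDist

# Standard normal model (mean 0, SD 1): z-score at the 90th percentile.
z_90 = NormalDist(mu=0, sigma=1).inv_cdf(0.90)

# Hypothetical test scores with mean 500 and SD 100: score at the 90th percentile.
score_90 = NormalDist(mu=500, sigma=100).inv_cdf(0.90)

print(round(z_90, 3), round(score_90, 1))
```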

9
Q

What is a p-value?

A

The probability of obtaining your statistic, or one more extreme, by chance alone if the null were actually true.

10
Q

What is probability?

A

Long run relative frequency. (the long run percent)

11
Q

Minimum sample size for proportions?

A

You need at least 10 successes, np>10, and 10 failures, nq>10

12
Q

Interpret r^2 ?

A

The percent of variability in Y explained by the model with X

13
Q

What is error?

A

Distance from a statistic to the parameter. How far off your stat is from the truth.

14
Q

What is a Z score?

A

the number of SD a data value is away from the mean
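A minimal Python sketch of that definition. The function name `z_score` and the numbers (a value of 130 with mean 100 and SD 15) are just an illustration.

```python
def z_score(value: float, mean: float, sd: float) -> float:
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / sd

# A value of 130 in a distribution with mean 100 and SD 15.
print(z_score(130, 100, 15))  # 2.0
```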

15
Q

What graphs for CATEGORICAL data?

A

segmented bar, bar, pie, mosaic

16
Q

What does SD of residuals tell us?

A

Typical residual. Average distance to the model. About how far off we expect model to be.

17
Q

What is variance?

A

A measure of spread: the average squared distance to the mean. Variance = SD^2
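To see "average squared distance" in numbers, here is a small Python sketch on a made-up data set. It assumes the population version (divide by n); the sample version divides by n-1 instead, which `statistics.variance` does.

```python
from statistics import mean, pvariance, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values for illustration

# By hand: average of the squared distances to the mean.
center = mean(data)
by_hand = sum((x - center) ** 2 for x in data) / len(data)

# The library's population variance and SD should agree: variance = SD^2.
print(by_hand, pvariance(data), pstdev(data) ** 2)
```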

18
Q

What points are outliers in regression?

A

Those that don’t follow the flow.

19
Q

function to find area under normal curve?

A

normcdf

20
Q

How do you describe the distribution of a single data set? (a histogram)?

A

SHAPE (#modes, skewness), CENTER (measure of center), SPREAD (measure of spread), STRANGE (outliers or gaps)

21
Q

Interpret SLOPE EQUATION: rSy/Sx

A

For every 1 unit of x, there is a change of SLOPE units of y

22
Q

Diff between standard deviation and standard error?

A

Standard deviation is typical distance to mean for a data point, Standard error is typical distance to parameter for a statistic in a sampling distribution.

23
Q

What is the Law of Large Numbers?

A

In the long run, after many many trials, the % of successes approaches the true probability. Think: if you flip a coin twice, you may get 0% heads, 50% heads or 100% heads. If you flip 10,000 times, you probably will have about 50% heads (def not 0 or 100)
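The coin-flip idea can be tried as a simulation. This is only a sketch: the helper name `heads_proportion` and the seed value are arbitrary choices, and the exact proportions printed depend on the seed.

```python
import random

def heads_proportion(flips: int, rng: random.Random) -> float:
    """Simulate fair coin flips and return the fraction that land heads."""
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips

rng = random.Random(2024)  # fixed, arbitrary seed so the run is repeatable
for flips in (2, 100, 10_000, 100_000):
    # Small runs can land far from 0.5; large runs settle close to it.
    print(flips, heads_proportion(flips, rng))
```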

24
Q

Suppose p value = 0.003. How would you interpret?

A

With a p-value this low (0.003 < 0.05), I reject the Ho, there is enough evidence to say [Ha in context]

25
Q

What are the measures of spread we use?

A

standard deviation, variance, range, interquartile range, standard error

26
Q

What points have influence in regression?

A

Those that would change the slope if removed (they are outliers that have leverage)

27
Q

What is margin of error?

A

Distance you reach up and down when making CI. It is CRIT * SE
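A worked sketch of CRIT * SE in Python for a one-proportion interval. The sample values (60% successes out of n = 100) are made up, and the critical value is taken from the normal model for 95% confidence.

```python
from math import sqrt
from statistics import NormalDist

p_hat, n = 0.60, 100                     # hypothetical sample proportion and size

crit = NormalDist().inv_cdf(0.975)       # z* for 95% confidence (about 1.96)
se = sqrt(p_hat * (1 - p_hat) / n)       # standard error of the proportion
margin = crit * se                       # margin of error = CRIT * SE

# Reach up and down from the statistic to build the interval.
interval = (p_hat - margin, p_hat + margin)
print(round(margin, 3), interval)
```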

28
Q

What is alpha?

A

It is the rejection threshold. Reject Ho when p-value is below alpha.

29
Q

What points have leverage in regression?

A

Those far to the left and right from x-bar

30
Q

Interpret y-intercept?

A

When X=0, the model predicts this much Y.

31
Q

What graphs for QUANTITATIVE data?

A

histogram, box/whisker, stemplot, dot plot, ogive, time plot, line graph

32
Q

What are the measures of center we use?

A

mean, median, mode

33
Q

What is a confidence interval?

A

A parameter catcher. It tries to catch the truth.

34
Q

What does “95% confident” mean?

A

If you took 100 samples and made 100 confidence intervals, about 95 would contain the parameter and about 5 would not.

35
Q

What is a test statistic?

A

The number of SE a statistic is away from the hypothesized parameter.
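As an illustration, here is a Python sketch of a one-proportion z test statistic. All numbers are hypothetical, and the one-sided p-value assumes the alternative is "greater than".

```python
from math import sqrt
from statistics import NormalDist

p_hat, p0, n = 0.56, 0.50, 100       # hypothetical sample result, null value, sample size

se = sqrt(p0 * (1 - p0) / n)         # SE computed from the hypothesized proportion
z = (p_hat - p0) / se                # number of SEs the statistic is from the null value
p_value = 1 - NormalDist().cdf(z)    # one-sided p-value, assuming Ha: p > p0

print(round(z, 2), round(p_value, 4))
```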

36
Q

What is the golden sentence?

A

I was curious about a population parameter, but a census was too costly, so instead I took a sample, used the data to calculate a statistic, and then made an inference about the parameter with that statistic.

37
Q

Where are outliers located in a data set ?

A

Outside the fences. Lower fence: Q1 - 1.5*IQR. Upper fence: Q3 + 1.5*IQR.
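The fence rule can be sketched in Python. This assumes quartiles are found as the medians of the lower and upper halves of the sorted data (other quartile conventions give slightly different fences); the helper name `fences` and the data are made up.

```python
from statistics import median

def fences(data):
    """Return (lower fence, upper fence) using Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    xs = sorted(data)
    half = len(xs) // 2
    q1 = median(xs[:half])     # median of the lower half
    q3 = median(xs[-half:])    # median of the upper half
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]   # made-up values; 30 looks suspicious
low, high = fences(data)
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)
```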

38
Q

When we combine random variables, what do we add?

A

Add means and add variances (variances add only when the variables are independent). DO NOT ADD ST DEV. You add variances and take the square root of the sum to find combined SD.
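A small Python sketch of the rule, assuming X and Y are independent. The means and SDs are made-up numbers chosen so the arithmetic is easy to follow.

```python
from math import sqrt

mean_x, sd_x = 10, 3    # hypothetical random variable X
mean_y, sd_y = 20, 4    # hypothetical random variable Y, independent of X

combined_mean = mean_x + mean_y          # means add
combined_var = sd_x ** 2 + sd_y ** 2     # variances add (9 + 16 = 25)
combined_sd = sqrt(combined_var)         # then take the square root

print(combined_mean, combined_sd)  # 30 5.0
print(sd_x + sd_y)                 # 7, the wrong answer from adding SDs directly
```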

39
Q

What does rSy/Sx mean?

A

slope formula. For each SD in X, you go r SD in y

40
Q

Find P( Z > 1.5) ?

A

normcdf( 1.5, 9999)
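The same area can be checked in Python. This sketch uses `statistics.NormalDist`, whose `cdf` gives the area to the left of a value, so the upper tail is one minus that.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area to the right of z = 1.5 is 1 minus the area to the left.
p_upper = 1 - standard_normal.cdf(1.5)
print(round(p_upper, 4))
```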

41
Q

What is a critical value?

A

About 1 for 68% confidence, 2 for 95%, and 3 for 99.7%. It is the number of SE you want to reach out in a confidence interval.

42
Q

What are the two sampling distributions we have discussed?

A

MEANS: N ( mu, sigma/root n) and PROPORTIONS: N ( p, root (pq/n) )
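Both models can be written out in Python as a rough sketch. The helper names and the example numbers (mu = 100, sigma = 15, n = 25 for means; p = 0.5, n = 100 for proportions) are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def sampling_dist_mean(mu: float, sigma: float, n: int) -> NormalDist:
    """Normal model for sample means: N(mu, sigma / root n)."""
    return NormalDist(mu, sigma / sqrt(n))

def sampling_dist_prop(p: float, n: int) -> NormalDist:
    """Normal model for sample proportions: N(p, root(pq / n))."""
    return NormalDist(p, sqrt(p * (1 - p) / n))

means = sampling_dist_mean(mu=100, sigma=15, n=25)
props = sampling_dist_prop(p=0.5, n=100)
print(means.stdev, props.stdev)

# Chance that a sample mean lands above 103 under the means model.
print(round(1 - means.cdf(103), 4))
```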