Chapter 6 - Statistical Data Flashcards

1
Q

We collect data from ___________ and use them to ____________.

A

real world experiences, draw conclusions

2
Q

We choose a ___________ to study, collect measurements from a small representative subset or ________ of that population, then apply our findings back to the larger population to assess if they ____.

A

population, sample, and fit.

3
Q

What is a simple model?

A

An average or a mean

4
Q

What is one way to calculate a simple model?

A

Calculate an average, then look at the difference between each data point and the average value.
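As a minimal sketch of this answer, the snippet below computes the mean of a small sample and the deviation of each data point from it. The `scores` values are made up for illustration and do not come from the chapter.

```python
# Hypothetical sample; the values are invented for illustration.
scores = [2, 4, 4, 5, 7, 8]

# The simple model: a single number (the mean) stands in for every observation.
mean = sum(scores) / len(scores)

# Deviation of each data point from the model's prediction.
deviations = [x - mean for x in scores]

print(mean)
print(deviations)
```

The deviations always sum to zero, which is why they are usually squared before being added up.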

5
Q

How can we observe how well our data fits the model?

A

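Take the sum of the squared differences between each data point and the mean (the sum of squared errors). Dividing that sum by n - 1 gives the variance, and the square root of the variance is the Standard Deviation.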
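A short sketch of how these quantities relate, assuming the usual sample formulas (divide by n - 1). The `scores` values are hypothetical.

```python
import math

# Hypothetical sample; the values are invented for illustration.
scores = [2, 4, 4, 5, 7, 8]
mean = sum(scores) / len(scores)

# Sum of squared differences between each data point and the mean.
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Sample variance: average squared difference, using n - 1 in the denominator.
variance = sum_of_squares / (len(scores) - 1)

# Standard deviation: square root of the variance, back in the original units.
std_dev = math.sqrt(variance)

print(sum_of_squares, variance, std_dev)
```

A smaller standard deviation means the data points sit closer to the mean, so the mean is a better fit.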

6
Q

How do you quantify model accuracy?

A

Outcome = Model + Error. The smaller the error, the more accurately the model represents the data.

7
Q

What does the shape of the frequency distribution tell you about the deviation around the average? There are two main takeaways.

A
  • Flat Distribution = More deviation
  • Skewed Distribution = a few outliers pulling the average up or down
8
Q

What does the Standard Deviation measure?

A

The amount of variation among the individuals you sampled.

9
Q

How does the Standard Deviation measure variation among values?

A

By comparing each measured value to the mean of all values.

10
Q

How do you estimate the Standard Error?

A
  1. Take a bunch of subsamples
  2. Calculate the mean of each subsample
  3. Calculate the standard deviation of the subsample means
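The three steps above can be sketched as a small simulation. The population, the number of subsamples (500), and the subsample size (25) are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 values from a normal distribution.
population = [random.gauss(100, 15) for _ in range(10_000)]

# 1. Take a bunch of subsamples (here, 500 subsamples of 25 values each).
subsamples = [random.sample(population, 25) for _ in range(500)]

# 2. Calculate the mean of each subsample.
subsample_means = [statistics.mean(s) for s in subsamples]

# 3. The standard deviation of the subsample means is the standard error.
standard_error = statistics.stdev(subsample_means)

# Common shortcut when only one sample is available: s / sqrt(n).
one_sample = subsamples[0]
shortcut = statistics.stdev(one_sample) / len(one_sample) ** 0.5

print(standard_error, shortcut)
```

In practice only one sample is usually available, so the shortcut is what gets reported. The simulation shows what that number is estimating.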
11
Q

What is a Confidence Interval?

A

A range, bounded by a lower and an upper limit, calculated so that it captures the true population value (such as the mean) in 95% of samples.

12
Q

What do we use to find the Confidence Intervals?

A

A formula that calculates the range of values expected to contain the true population mean in 95% of samples, typically the sample mean plus or minus about two standard errors.
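One common version of that formula is mean ± 1.96 × standard error. The sketch below assumes a roughly normal sampling distribution, and the `sample` values are hypothetical. With a sample this small, a critical value from the t distribution would normally replace 1.96.

```python
import statistics

# Hypothetical sample; the values are invented for illustration.
sample = [2, 4, 4, 5, 7, 8]
n = len(sample)

mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / n ** 0.5

# 1.96 is the normal-distribution critical value for a 95% interval.
z = 1.96
lower = mean - z * standard_error
upper = mean + z * standard_error

print(lower, upper)
```

A wider interval signals more uncertainty about where the true population mean lies.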

13
Q

What do Confidence Intervals indicate?

A

That we can be 95% confident the true population mean falls within that range.

14
Q

What is the equation of the simplest model?

A

Linear Model: y = mx + b
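As an illustration, the slope m and intercept b can be estimated from data with ordinary least squares. This is one standard way to fit the line, not necessarily the method the chapter has in mind, and the `x` and `y` values are hypothetical.

```python
# Hypothetical data that lies exactly on y = 2x + 1.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Slope: how x and y vary together, divided by how much x varies on its own.
m = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)

# Intercept: forces the line through the point (mean_x, mean_y).
b = mean_y - m * mean_x

print(m, b)  # 2.0 1.0
```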

15
Q

True or False: Not all relationships can be modelled well using a straight line.

A

True

16
Q

What are the stages of constructing a Model?

A
  1. State a problem
  2. Develop a hypothesis about a population
  3. Make a prediction
  4. Collect data from a sample
  5. Fit a model to the data points
  6. Test how well the model represents the data
17
Q

What does it mean if a model does not explain variation (relationships) between data points very well?

A

We have less confidence in that model's ability to predict what's going on in the broader population.

18
Q

What are two ways to compare means?

A
  1. Independent Sample T-Tests
  2. Analysis of Variance (ANOVA)
19
Q

What does ANOVA stand for?

A

Analysis of Variance

20
Q

Which method of comparing means takes “two groups that are independent of each other”?

A

Independent Sample T-Test

21
Q

Which method of comparing means takes “more than two groups”?

A

Analysis of Variance
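A minimal sketch of the statistic behind a one-way ANOVA: the F ratio compares variation between group means with variation within groups. The three groups are hypothetical, and turning F into a p-value requires the F distribution, which is not shown here.

```python
# Hypothetical measurements for three groups.
groups = {
    "a": [4, 5, 6],
    "b": [6, 7, 8],
    "c": [8, 9, 10],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = sum(all_values) / len(all_values)

# Variation of the group means around the grand mean.
ss_between = sum(
    len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
)

# Variation of individual values around their own group mean.
ss_within = sum(
    (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f_stat)  # 12.0
```

A large F means the group means differ by more than the spread inside each group would suggest.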

22
Q

When comparing means, how can you determine whether the difference between the means is “real”?

A

By analyzing the p-value.

23
Q

What does the p-value indicate in comparing differences in means?

A

It incorporates
- the magnitude of difference in means
- the sample size within each group, and
- the variation of values in each group to make its judgement
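The three ingredients listed above all appear in the t statistic that underlies the p-value. The sketch below uses Welch's form of the statistic, and both groups are hypothetical. Converting t into a p-value needs the t distribution, for example through a library routine such as SciPy's `scipy.stats.ttest_ind`, which is not used here.

```python
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent groups."""
    # Magnitude of the difference in means.
    mean_diff = statistics.mean(a) - statistics.mean(b)
    # Variation in each group, scaled by each group's sample size.
    standard_error = (
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    ) ** 0.5
    return mean_diff / standard_error


# Hypothetical measurements for two independent groups.
group_a = [5.0, 6.0, 7.0, 8.0, 9.0]
group_b = [1.0, 2.0, 3.0, 4.0, 5.0]

t = welch_t(group_a, group_b)
print(t)

# Same means and spread, but twice the sample size: t grows.
t_bigger_sample = welch_t(group_a * 2, group_b * 2)
print(t_bigger_sample)
```

A larger absolute t corresponds to a smaller p-value, so bigger differences, bigger samples, and less variation all make a difference look more "real".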

24
Q

The __________ tells you whether the difference between groups is probably just a function of random chance or whether it’s something that likely holds true throughout the population.

A

p-value

25
Q

How can you test whether two numeric variables are significantly related to one another?

A

By using a correlation coefficient or linear regression analysis.
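As a sketch of the first option, the function below computes Pearson's correlation coefficient r, one common correlation coefficient. The `hours` and `score` values are hypothetical, and judging significance additionally requires a p-value for r, which is not computed here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # How the two variables vary together.
    co_variation = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # How much each variable varies on its own.
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return co_variation / math.sqrt(ss_x * ss_y)


# Hypothetical paired measurements.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)
print(r)
```

Values of r near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none.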