WEEK 7 Flashcards

(15 cards)

1
Q

Difference between descriptive statistics and statistical inference

A

Descriptive statistics - THINK what does my data say?
- describes and summarises the main features of a dataset
- focuses on the data you have, no assumptions about the bigger population
- uses measures of central tendency (mean, median, mode) and variation (standard deviation, range)
- examples include bar charts, histograms and pie charts

Statistical inference - THINK what can I deduce about the bigger picture?
- focuses on drawing conclusions or making predictions about an entire population from a sample
- examples include hypothesis testing (p-values etc), confidence intervals, regression analysis

2
Q

Point estimation

A

Using a single value (a “point”) as an estimate for an unknown population parameter.
Example:
You collect a sample of students and find their average test score is 75.
You use 75 as a point estimate for the population mean.
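The example above can be sketched in a few lines of Python. The score values are made up for illustration and are not from the card; they are chosen only so that the sample mean comes out at 75.

```python
from statistics import mean

# Hypothetical sample of test scores (illustrative values only)
scores = [70, 80, 75, 72, 78]

# The sample mean is used as the point estimate of the population mean
point_estimate = mean(scores)
print(point_estimate)
```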

3
Q

Interval estimation (confidence interval)

A

Using a range of values (an interval) that is likely to contain the population parameter.
Example:
“I am 95% confident that the true average score is between 72 and 78.”
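A minimal sketch of how an interval like this could be computed, assuming the population standard deviation is known. The values of `sigma` and `n` are assumptions picked for illustration; the card does not state them.

```python
from math import sqrt
from statistics import NormalDist

# Assumed inputs (illustrative, not from the card)
sample_mean = 75.0   # point estimate of the population mean
sigma = 12.0         # population standard deviation, assumed known
n = 64               # sample size
confidence = 0.95

# Critical z value that leaves (1 - confidence) / 2 in each tail
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error, then the interval bounds around the sample mean
moe = z * sigma / sqrt(n)
lower, upper = sample_mean - moe, sample_mean + moe
print(f"{confidence:.0%} CI: ({lower:.2f}, {upper:.2f})")
```

With these assumed inputs the interval comes out at about (72.06, 77.94), close to the "between 72 and 78" wording in the example.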

4
Q

Hypothesis testing

A

A formal method to test a claim or assumption about a population parameter.

  1. Set up a null hypothesis (H₀) and an alternative hypothesis (H₁)
  2. Use sample data to calculate a test statistic
  3. Decide to reject or not reject H₀, often using a p-value
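The three steps can be sketched as a two-sided one-sample z-test. All numbers here (`mu_0`, `sample_mean`, `sigma`, `n`, `alpha`) are assumptions for illustration, and the z-test is only one possible choice of test.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses. H0: mu = mu_0, H1: mu != mu_0 (assumed values)
mu_0 = 70.0
alpha = 0.05          # significance level

# Assumed sample summary and known population standard deviation
sample_mean = 75.0
sigma = 12.0
n = 36

# Step 2: test statistic = distance from mu_0 in standard-error units
z_stat = (sample_mean - mu_0) / (sigma / sqrt(n))

# Step 3: two-sided p-value, then the decision
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
reject_h0 = p_value < alpha
print(f"z = {z_stat:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```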
5
Q

3 main types of statistical inference

A

point estimation
interval estimation (confidence interval)
hypothesis testing

6
Q

What is the sampling distribution

A

the probability distribution of a statistic (e.g. the sample mean) obtained from repeated samples of the same size drawn from a specific population

7
Q

How do you find the mean and the standard error of the sampling distribution of x̄

A

μx̄ = μ
this means the mean of the sample means is equal to the population mean.

σx̄ = σ/√n
where:
σ = population standard deviation
n = sample size
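Both formulas can be checked with a small simulation. This is only a sketch: the population is assumed to be normal with `mu = 50` and `sigma = 10`, and the sample size and number of repetitions are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population parameters and sampling setup (illustrative)
mu, sigma = 50.0, 10.0
n = 25               # size of each sample
num_samples = 5000   # how many samples to draw

# Draw many samples and record the mean of each one
sample_means = [
    mean([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(num_samples)
]

# Compare the simulated values with the formulas on the card
theoretical_se = sigma / sqrt(n)
print("mean of sample means:", round(mean(sample_means), 2), "vs mu =", mu)
print("sd of sample means:", round(stdev(sample_means), 2), "vs sigma/sqrt(n) =", theoretical_se)
```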

8
Q

What is the central limit theorem

A

for a population with finite variance, the sampling distribution of the mean is approximately normal when the sample size is large enough, whatever the shape of the population distribution
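A rough way to see the theorem at work is to sample from a clearly non-normal population and check that the sample means behave like a normal distribution. The sketch below assumes an exponential population with mean 1 and standard deviation 1; the sample size and repetition count are arbitrary.

```python
import random
from math import sqrt
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable

# Assumed skewed population: exponential with mean 1 and sd 1
pop_mean, pop_sd = 1.0, 1.0
n = 50               # size of each sample
num_samples = 4000   # how many samples to draw

sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

# If the sample means are roughly normal, about 95% of them should
# fall within 1.96 standard errors of the population mean
se = pop_sd / sqrt(n)
within = sum(abs(m - pop_mean) <= 1.96 * se for m in sample_means)
fraction_within = within / num_samples
print(f"fraction within 1.96 SE: {fraction_within:.3f}")
```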

9
Q

What are confidence intervals

A

a range of values that is likely to contain the true population parameter with a certain level of confidence

10
Q

How to write confidence interval

A

typically written as (lower bound, upper bound), with a specified confidence level (e.g., 95%)

11
Q

Assumptions for confidence intervals?

A

  • the data should be a random sample from the population
  • the measured quantity should be normally distributed (or the sample should be large enough for the central limit theorem to apply)

12
Q

What is the margin of error

A

a statistic expressing the amount of random sampling error in the results of a survey

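MOE = Z * (σ / √n)
where:
Z = critical value for the chosen confidence level (e.g. 1.96 for 95%)
σ = population standard deviation
n = sample size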
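A short sketch of the formula, showing how the margin of error shrinks as the sample grows. The helper name `margin_of_error` and the input values are hypothetical, chosen for illustration.

```python
from math import sqrt

def margin_of_error(z, sigma, n):
    """Return Z * (sigma / sqrt(n)) for a known population sd."""
    return z * sigma / sqrt(n)

# Assumed inputs: 95% confidence (z of about 1.96) and sigma = 12
z, sigma = 1.96, 12.0

moe_36 = margin_of_error(z, sigma, 36)
moe_144 = margin_of_error(z, sigma, 144)

# Quadrupling the sample size halves the margin of error
print(moe_36, moe_144)
```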

13
Q

What is significance level

A

The probability of rejecting the null hypothesis when it is actually true (a Type I error) that you are willing to accept
Often represented by α (alpha)

Common values are 0.05 (5%) and 0.01 (1%)
e.g. α = 0.05 means accepting a 5% chance of rejecting H₀ when it is actually true

14
Q

How to choose between using a z-test or a t-test

A

Use a z-test if the population standard deviation is known and the sample size is large (n ≥ 30).

Use a t-test if the population standard deviation is unknown and/or sample size is small (n < 30).
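The rule of thumb on this card can be written as a tiny helper. The function name `choose_test` is hypothetical, and the n ≥ 30 cut-off is the card's convention rather than a strict mathematical boundary.

```python
def choose_test(sigma_known, n):
    """Apply the card's rule: z-test only if sigma is known AND n >= 30."""
    if sigma_known and n >= 30:
        return "z-test"
    # Unknown sigma and/or a small sample: fall back to the t-test
    return "t-test"

print(choose_test(sigma_known=True, n=50))
print(choose_test(sigma_known=False, n=50))
print(choose_test(sigma_known=True, n=12))
```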

15
Q

What is a p-value

A

the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis of the statistical test is true
