Exam 2 Flashcards

(38 cards)

1
Q

Sampling Error

A

Definition: Sampling error is the difference between a sample statistic (like the sample mean) and the actual population parameter (like the population mean).

Cause: It happens because samples are only subsets of the population and may not perfectly represent the population.

Minimizing Sampling Error: Larger sample sizes and random sampling reduce sampling error.
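As an illustration (not from the card), a small seeded simulation can show sampling error shrinking as n grows; the population here (mean 100, SD 15) and the sample sizes are made-up numbers:

```python
import random
import statistics

# Sketch: sampling error is |sample mean - population mean|; made-up population.
random.seed(42)
population = [random.gauss(100, 15) for _ in range(10_000)]
mu = statistics.mean(population)

# Larger random samples tend to land closer to the population mean.
for n in (10, 100, 1000):
    sample = random.sample(population, n)
    error = abs(statistics.mean(sample) - mu)   # the sampling error
    print(f"n={n:5d}  sampling error = {error:.3f}")
```
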

2
Q

What is the Distribution of Sample Means? Properties?

A

Definition: The distribution of sample means consists of all possible sample means that could be obtained from a population.

The Properties are:

Shape: The distribution tends to be normal, especially with larger sample sizes (Central Limit Theorem).

Mean: The mean of the distribution of sample means (μM) equals the population mean (μ):
μM = μ

Standard Error: The standard deviation of the sample means (σM) is called the standard error and measures how much the sample means vary from the population mean:
σM = σ/√n
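The standard-error formula can be checked in a couple of lines of Python (a sketch; the σ = 12, n = 36 values are made up):

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Made-up example: sigma = 12, n = 36 gives 12 / 6 = 2.0
print(standard_error(12, 36))  # -> 2.0
```
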

3
Q
Central Limit Theorem (CLT)
A

Things to remember about the CLT:

1. Larger sample sizes lead to smaller standard errors and more accurate sample means.

2. If the sample size is small and the population is not normal, the distribution of sample means may not be normal.

3. The CLT is especially useful for making inferences about population parameters using sample data.

The CLT states that for any population with a mean (μ) and standard deviation (σ), the distribution of sample means will:

Approach a normal distribution as the sample size (n) increases.

Have a mean equal to the population mean (μ).

Have a standard error calculated as: σM = σ/√n
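A seeded simulation can illustrate these CLT claims; this sketch uses a skewed (exponential) population with made-up parameters, so the sample means should average out near μ = 1.0 and spread out near σ/√n = 1/√50:

```python
import random
import statistics

# Sketch: CLT demo with a skewed (exponential, mean 1.0) population.
random.seed(0)
n = 50                                   # size of each sample
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(n))
    for _ in range(2000)
]

grand_mean = statistics.mean(sample_means)   # near mu = 1.0
spread = statistics.stdev(sample_means)      # near sigma/sqrt(n) = 1/sqrt(50)
print(round(grand_mean, 2), round(spread, 2))
```
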

4
Q

Know how the CLT describes the Distribution of Sample Means

A

Describe Distribution of Sample Means in terms of:

  1. Shape: The distribution of sample means will tend to be normal (bell-shaped) if the sample size is large enough (n ≥ 30), regardless of the shape of the population distribution.
  2. Central Tendency: The distribution of sample means has a mean equal to the population mean: μM = μ
  3. Variability: For the distribution of sample means, we measure variability as the standard error: the average distance from a sample mean (M) to the population mean (μ).

The standard deviation of the distribution of sample means is called the standard error (SE). It is calculated as the population standard deviation (σ) divided by the square root of the sample size (n):
σM = σ/√n

This description holds regardless of the shape, mean, and standard deviation of the population.

5
Q

When will the Distribution of Sample Means be normal?

A

1) The population is normal (the samples are drawn from a normally distributed population),
or
2) The size (n) of each sample is ≥ 30 (with n ≥ 30, the distribution of sample means is approximately normal even if the population is not).

6
Q

Standard Error

A
  1. How much sample means differ from one another:
     σM small: samples have similar means (they cluster)
     σM large: samples differ (they are scattered)
     σM = σ/√n
  2. How well a sample represents the entire distribution.
     Be able to compute it, understand what it measures, and how it changes.

A larger sample size reduces the standard error, making estimates more reliable.

Larger population variability increases the standard error.

A smaller standard error means the sample means are closer to the population mean, indicating more precise estimates.
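These two effects (n shrinks the standard error, σ grows it) can be seen side by side in a short sketch with made-up numbers:

```python
import math

# Sketch: SE = sigma / sqrt(n) falls as n grows and rises with sigma.
for sigma in (5, 10):
    for n in (25, 100):
        se = sigma / math.sqrt(n)
        print(f"sigma={sigma:2d}, n={n:3d}: SE = {se:.2f}")
```
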

7
Q

Null Hypothesis (H0​):
z score

A

-States that there is no effect or no difference between groups.

-It assumes any observed differences are due to chance.

-Two-tailed example: H0: μ1 = μ2 (no difference in means)

-One-tailed: H0: μ ≤ μ0 if the question asks whether scores improve (increase), or H0: μ ≥ μ0 if it asks whether scores worsen (decrease).

-If the z-score falls in the body of the distribution, we fail to reject the null.

8
Q

Alternative Hypothesis (H1): z score

A

-States that there is an effect or a difference.

-Represents what the researcher is trying to prove.

-Two-tailed example: H1: μ1 ≠ μ2 (means are different)

-One-tailed: H1: μ < μ0 for "worsens" or H1: μ > μ0 for "improves"

-If the z-score lands in the tail (critical region), we reject the null hypothesis.

9
Q

One-Tailed Test:

A

-Tests for a directional effect (e.g., “greater than” or “less than”).

Example:
-For "improves": H0: μ ≤ 50 and H1: μ > 50 (tests if the mean is significantly greater)
-For "worsens": H0: μ ≥ 50 and H1: μ < 50 (tests if the mean is significantly lower)

-The critical region is in one tail of the distribution: the left tail when testing whether scores worsen, the right tail when testing whether they improve.

10
Q

Two-Tailed Test:

A

-Tests whether the mean differs in either direction.

-The alpha level is split in half, with one half placed in each tail; the critical z-score corresponds to the proportion left in each tail after splitting alpha in two.

Example:
-H0: μ = 50
-H1: μ ≠ 50 (tests if the mean is significantly different in either direction)

-Critical regions are in both tails of the distribution. If the z-score lands in either tail, we reject the null in favor of the alternative hypothesis.

11
Q

Critical Regions

A

The critical region is the area in the tails of the distribution where the null hypothesis is rejected.

12
Q

Critical Regions of one TAIL TEST

A

-The critical region is in one tail (either left or right).
-Left tail if the question asks whether scores worsen (less than); right tail if it asks whether they improve (greater than).

-The full alpha (0.05 or 0.01) goes in that single tail; the corresponding critical z-value is negative for a left tail and positive for a right tail.

13
Q

Critical Regions OF TWO TAIL TEST

A

The critical region is in both ends, meaning both tails.

-Used when the question asks whether the mean differs (in either direction).

-An alpha of 0.05 becomes 0.025 in each tail (negative critical value on the left, positive on the right), and an alpha of 0.01 becomes 0.005 in each tail.

-The critical value is based on the significance level (α), commonly set at 0.05 or 0.01.
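As a sketch using only Python's standard library, the two-tailed critical z-values for these alphas can be recovered from the normal distribution's inverse CDF:

```python
from statistics import NormalDist

# Sketch: two-tailed critical z-values; alpha is split across the two tails.
z = NormalDist()
for alpha in (0.05, 0.01):
    z_crit = z.inv_cdf(1 - alpha / 2)   # e.g., 1 - 0.025 = 0.975 for alpha = .05
    print(f"alpha={alpha}: reject H0 if z < {-z_crit:.2f} or z > {z_crit:.2f}")
```

This reproduces the familiar table values: ±1.96 for α = .05 and ±2.58 for α = .01.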

14
Q

Steps of Hypothesis Testing

A

1. State the Hypotheses

Formulate both the null (H0) and alternative (H1) hypotheses.

2. Set the Critical Region

Choose a significance level (α) and identify the critical region using a z-table or t-table.

3. Compute the Test Statistic

Find σM = σ/√n, then find z = (M − µ)/σM

4. Make a Decision

Reject the null (H0) if the test statistic falls in the critical region (tail).

Fail to reject the null (H0) if the test statistic does not fall in the critical region (falls in the body).
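The four steps can be walked through with made-up numbers (μ = 50, σ = 10, n = 25, M = 54, two-tailed α = .05):

```python
import math

# Sketch of a z-test with made-up numbers.
mu, sigma, n, M = 50, 10, 25, 54

# Step 1: H0: mu = 50, H1: mu != 50 (two-tailed)
# Step 2: alpha = .05 two-tailed -> critical values z = +/-1.96
z_crit = 1.96

# Step 3: compute the test statistic
sigma_M = sigma / math.sqrt(n)   # 10 / 5 = 2.0
z = (M - mu) / sigma_M           # 4 / 2 = 2.0

# Step 4: decision
decision = "reject H0" if abs(z) > z_crit else "fail to reject H0"
print(z, decision)  # -> 2.0 reject H0
```
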

15
Q

Type I Error (False Positive):

A

-Occurs when you reject the null hypothesis when it is actually true (you should have failed to reject it).

-Probability: Represented by α (alpha), typically set at 0.05 or 0.01.

-Consequence: Concluding that an effect exists when it does not.

Examples and Consequences:
Medical Research: Approving a drug that is ineffective or harmful.

Legal System: Convicting an innocent person.

Scientific Studies: Reporting a false breakthrough, leading to wasted resources.

16
Q

Type II Error (False Negative):

A

-Occurs when you fail to reject the null hypothesis when it is actually false (you should have rejected it).

-Probability: Represented by β (beta).
Consequence: Missing a real effect or relationship.

Examples and Consequences:
Medical Research: Failing to approve a beneficial drug that could save lives.

Legal System: Letting a guilty person go free.

Scientific Studies: Missing a valid discovery, slowing down progress in research.

17
Q

What are Alpha & Beta?

A

Alpha (α):
-The probability of making a Type I error.

-Typically set at 0.05 (5%) or 0.01 (1%). Setting alpha too high (e.g., 0.05 instead of 0.01) increases the risk of a Type I error.

Beta (β):
-The probability of making a Type II error.

-A lower β means greater statistical power (1 − β), making it easier to detect true effects.

-Setting alpha too low (e.g., 0.01 instead of 0.05) increases β, raising the risk of a Type II error.

18
Q

Null Hypothesis (H0​):
t-test

A

There is no difference between the sample mean and the population mean.
-Two-tailed:
H0: μ = 50
-One-tailed:
H0: μ ≤ 50 (right-tailed)
H0: μ ≥ 50 (left-tailed)

19
Q

-Alternative Hypothesis (H1):
t-test

A

The sample mean is different from the population mean.

-For two-tailed tests: H1: μ ≠ 50
-For one-tailed tests:
H1: μ > 50 (right-tailed)
H1: μ < 50 (left-tailed)

20
Q

Sample Standard Error
t-test

A

The standard error measures how much the sample mean is expected to differ from the population mean. It is used when the population standard deviation is unknown.
Instead of the population standard error σM, we use the estimated standard error sM (e.g., sM = 6 instead of σM = 6): sM = √(s²/n)

21
Q

t statistic: Be able to use the t formula

A

t Statistic

-Definition: The t statistic compares the difference between the sample mean and the population mean relative to the standard error.

-Formula:
t = (M − µ)/sM
Where:
M = Sample mean
μ = Population mean
sM = Estimated standard error
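The t formula can be applied end to end with a made-up sample tested against μ = 10:

```python
import math
import statistics

# Sketch: t statistic for a made-up sample, tested against mu = 10.
sample = [12, 9, 11, 13, 10, 11, 12, 10]
mu = 10

M = statistics.mean(sample)        # 11.0
s = statistics.stdev(sample)       # sample SD, computed with n - 1
sM = s / math.sqrt(len(sample))    # estimated standard error
t = (M - mu) / sM
print(round(t, 2))  # -> 2.16
```

Note that `statistics.stdev` already divides by n − 1, matching the sample standard deviation used in the t test.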

22
Q

Why Use Sample Variance?
t-test

A

-When the population standard deviation (σ) is unknown, we estimate it using the sample standard deviation (s).

-The sample variance s2 provides a reliable estimate when using smaller sample sizes.

23
Q

What Affects the t Statistic?

A

Sample Mean: A larger difference between the sample mean and the population mean increases the t value.

-Sample Size: Larger sample sizes reduce the standard error, increasing the t value. Small sample sizes increase the standard error, decreasing the t value.

-Sample Variability: Higher variability (larger s) increases the standard error, reducing the t value. Lower variability decreases the standard error, increasing the t value.

24
Q

Degrees of Freedom
t-test

A

Degrees of Freedom (for the t test):
df = n − 1 (where n = sample size)

One df is lost because the sample mean is used in calculating s².

25
Q

Effect Size
t-test

A

-Effect size measures the strength of the difference between the sample mean and the population mean.

-Formula: d = (M − µ)/s

d = 0.2 small
d = 0.5 medium
d = 0.8 large

-If the number is negative, ignore the sign (use the absolute value).

-If the value falls in between, describe it relative to the nearest benchmarks: for example, 0.7 would be a medium-to-large effect, and 0.3 closer to a small effect.
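Cohen's d is a one-liner; this sketch uses made-up numbers (M = 54, μ = 50, s = 8):

```python
# Sketch: Cohen's d (effect size) with made-up numbers.
def cohens_d(M, mu, s):
    """Effect size; the sign is ignored when judging the size."""
    return abs(M - mu) / s

d = cohens_d(M=54, mu=50, s=8)
print(d)  # -> 0.5 (a medium effect)
```
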
26
Q

Confidence Intervals
t-test

A

-A confidence interval estimates the range in which the population mean is likely to fall.

-The sample mean M will always be at the center, between the two interval endpoints; the population mean µ will not always fall between them.

-Formula: µ = M ± t(sM)
Compute µ = M + t(sM) and µ = M − t(sM), then report the two numbers as [lowest, highest].

t = critical t-value from the t-table based on df and confidence level
sM = estimated standard error

-If the alpha is 0.05, the confidence level is 95%: a 95% CI means we are 95% confident that the true population mean falls within the interval.

-If the alpha is 0.01, the confidence level is 99%: a 99% CI means we are 99% confident that the true population mean falls within the interval.
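A worked sketch with made-up numbers (M = 40, s = 8, n = 16, so df = 15; the critical value 2.131 is the two-tailed α = .05 entry from a t-table):

```python
import math

# Sketch: 95% confidence interval with made-up numbers.
M, s, n = 40, 8, 16
t_crit = 2.131                  # from a t-table: two-tailed, alpha = .05, df = 15
sM = s / math.sqrt(n)           # 8 / 4 = 2.0
ci = [round(M - t_crit * sM, 3), round(M + t_crit * sM, 3)]
print(ci)  # -> [35.738, 44.262]
```
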
27
Q

Independent-measures t test
Understand study design

A

-An independent-measures t-test is used to compare the means of two separate groups.

-Each group is independent, meaning different participants are in each group (e.g., control vs. experimental group).

-Example: Testing if a new therapy reduces anxiety compared to a standard therapy. The independent variable is which therapy group participants are assigned to; the dependent variable is anxiety.

-Formula: t = (M1 − M2)/s(M1−M2)
28
Q

Estimated Standard Error for the independent-measures t test

A

s(M1−M2)

-Two sources of error:
M1 approximates µ1 with some error.
M2 approximates µ2 with some error.

-Formula: s(M1−M2) = √(s²1/n1 + s²2/n2)
29
Q

Null Hypothesis (H0): For an independent-measures t-test

A

-There is no difference between the population means of the two groups.

-Two-tailed test: H0: μ1 = μ2

-One-tailed test:
H0: μ1 ≤ μ2 (right-tailed)
H0: μ1 ≥ μ2 (left-tailed)
30
Q

Alternative Hypothesis (H1): For an independent-measures t-test

A

-There is a difference between the population means.

-For two-tailed tests: H1: μ1 ≠ μ2

-For one-tailed tests:
H1: μ1 > μ2 (right-tailed: Group 1 has a larger mean)
H1: μ1 < μ2 (left-tailed: Group 2 has a larger mean)
31
Q

Pooled Variance: When would you pool variances? Why? For an independent-measures t-test

A

-Pool the variances when n1 ≠ n2, because simply averaging the two sample variances is biased: it treats the variances equally when the samples are not the same size.

-Formula for Pooled Variance:
s²p = (SS1 + SS2)/(df1 + df2)
32
Example of pooled Variance For an independent measures t-test:
-Example Variance Example: n1 = n2 Sample1: SS = 50, n = 5 Sample2: SS = 10, n = 5 We can take the average since n1 = n2 s^2 p = SS1+SS2/df1+df2 s^2 p = 50+10/4+4 s^2 p = 60/8=7.5 BUT Also s^2 1 = SS1/df1= 50/4=12.5 s^2 2 = SS2/df2= 10/4=2.5 12.5+2.5/2=7.5 Since n1=N2 BOTH WAYS ARE THE SAME RESULTS Pooled Variance Example: n1 ≠ n2 Sample1: SS = 20, n = 3 Sample2: SS = 48, n = 9 Pooled Variance s^2 p = SS1+SS2/df1+df2 s^2 p = 20+48/2+8=68/10=6.8 BUT Also s^2 1 = SS1/df1= 20/2=10 s^2 2 = SS2/df2= 48/8= 6 10+6/2=8 When n1 ≠ n2, pooled variance is closer to the variance of the sample with the larger SS2 than the smaller SS1 To calculate Standard Error for poole would bE s(M1−M2)=√(6.8/3+6.8/9) =√2.267+0.756=√1.739 NOW U CAN USE THE t = (M1 −M2 )/s(M1−M2)
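The n1 ≠ n2 example above (SS1 = 20, n1 = 3; SS2 = 48, n2 = 9) can be reproduced in a few lines:

```python
import math

# Sketch: pooled variance and standard error, reproducing the card's numbers.
def pooled_variance(ss1, ss2, n1, n2):
    return (ss1 + ss2) / ((n1 - 1) + (n2 - 1))

sp2 = pooled_variance(20, 48, 3, 9)
print(sp2)  # -> 6.8

# Estimated standard error for the independent-measures t test
s_m1m2 = math.sqrt(sp2 / 3 + sp2 / 9)
print(round(s_m1m2, 2))  # -> 1.74
```
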
33
Q

Degrees of Freedom: For an independent-measures t-test

A

Take into account the df for both samples:
df = n1 + n2 − 2
34
Q

Assumptions of this test: For an independent-measures t-test

A

1. Each value is sampled independently of the others.
2. The two populations are normally distributed.
3. The two populations must have equal variances: σ²1 = σ²2
35
Q

Homogeneity of Variance: For an independent-measures t-test

A

H0: σ²1 = σ²2
H1: σ²1 ≠ σ²2

Here, we do not want to reject H0: we want equal variances.
36
Q

Heterogeneity of Variance: For an independent-measures t-test

A

Samples are drawn from populations with different variances. Occurs when:
1) One variance is 4 times more than the other, OR
2) Variances are unequal and sample sizes are unequal.
37
Q

Formulas

A

z = (M − µ)/σM: z-score formula for a sample mean.

t = (M − µ)/sM: t-score formula for a sample mean.

d = (M − µ)/s: Cohen's d (effect size) formula. Use: to measure the size of the difference between the sample mean and the population mean in terms of standard deviations.

t = (M1 − M2)/s(M1−M2): t-score formula for independent samples.

σM = σ/√n: standard error of the mean (SEM) when the population standard deviation is known.

sM = √(s²/n): estimated standard error of the mean when the population standard deviation is unknown.

µ = M ± t(sM): formula for constructing a confidence interval for the population mean.

s²p = (SS1 + SS2)/(df1 + df2): pooled variance in an independent-samples t-test.

df = n1 + n2 − 2: degrees of freedom for the independent-measures t test.
38
Q

Differences between z and t statistics

A

The main differences between z and t statistics are:

-Population Standard Deviation
Z-statistic: used when the population standard deviation (σ) is known.
T-statistic: used when the population standard deviation is unknown and is estimated using the sample standard deviation (s).

-Sample Size
Z-statistic: typically used for large samples (n ≥ 30) because the sampling distribution tends to be normal.
T-statistic: preferred for small samples (n < 30) since it accounts for more variability.

-Distribution Shape
Z-statistic: follows the standard normal distribution (mean = 0, standard deviation = 1).
T-statistic: follows the t-distribution, which is wider and has heavier tails, especially with smaller sample sizes.

-Degrees of Freedom
Z-statistic: does not rely on degrees of freedom.
T-statistic: uses degrees of freedom (df = n − 1) to adjust for sample variability.

-Application
Z-statistic: often used for large-sample hypothesis testing or constructing confidence intervals when the population parameters are known.
T-statistic: applied in small-sample hypothesis testing or when population parameters are unknown.