Exam 2 Flashcards
(38 cards)
Sampling Error
Definition: Sampling error is the difference between a sample statistic (like the sample mean) and the actual population parameter (like the population mean).
Cause: It happens because samples are only subsets of the population and may not perfectly represent the population.
Minimizing Sampling Error: Larger sample sizes and random sampling reduce sampling error.
What is the Distribution of Sample Means? Properties?
Definition: The distribution of sample means consists of all possible sample means that could be obtained from a population.
The Properties are:
Shape: The distribution tends to be normal, especially with larger sample sizes (Central Limit Theorem).
Mean: The mean of the distribution of sample means (μM) equals the population mean (μ): μM = μ
Standard Error: The standard deviation of the distribution of sample means (σM) is called the standard error. It measures how much the sample means vary from the population mean: σM = σ/√n
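The two properties above (μM = μ and σM = σ/√n) can be checked with a small simulation. This is only a sketch: the skewed population, the sample size of 36, and the number of samples are made-up values chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 scores with a skewed (exponential) shape
population = [random.expovariate(1 / 50) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 36  # size of each sample (assumed for illustration)
sample_means = [
    statistics.fmean(random.choices(population, k=n))  # sampling with replacement
    for _ in range(5_000)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
expected_se = sigma / n ** 0.5

print(f"mu = {mu:.2f}, mean of sample means = {mean_of_means:.2f}")
print(f"sigma/sqrt(n) = {expected_se:.2f}, SD of sample means = {sd_of_means:.2f}")
```

The two numbers on each printed line should come out close to each other, even though the population itself is not normal.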
- Central Limit Theorem (CLT)
Things to remember about the CLT:
1. Larger sample sizes lead to smaller standard errors and more accurate sample means.
2. If the sample size is small and the population is not normal, the distribution of sample means may not be normal.
- The CLT is especially useful for making inferences about population parameters using sample data.
The CLT states that for any population with a mean (μ) and standard deviation (σ), the distribution of sample means will:
Approach a normal distribution as the sample size (n) increases.
Have a mean equal to the population mean (μ).
Have a standard error calculated as: σM = σ/√n
Know how the CLT describes the Distribution of Sample Means
Describe the Distribution of Sample Means in terms of:
- Shape: The distribution of sample means will tend to be normal (bell-shaped) if the sample size is large enough (n ≥ 30), regardless of the shape of the population distribution.
- Central Tendency: The distribution of sample means has the same mean as the population, μ: μM = μ
- Variability: For the distribution of sample means, variability is measured by the standard error, the average distance from a sample mean (M) to the population mean (μ).
The standard deviation of the distribution of sample means is called the standard error (SE). It is calculated as the population standard deviation (σ) divided by the square root of the sample size (n):
σM = σ/√n
Be able to describe it regardless of the shape, mean, and standard deviation of the population.
When will the Distribution of Sample Means be normal?
1) The population is normal (the samples are drawn from a normally distributed population)
or
2) The size (n) of each sample is ≥ 30 (with 30 or more scores per sample, the distribution of sample means is approximately normal even if the population is not)
Standard Error
σM = σ/√n
- How much the sample means differ from one another. σM small: samples have similar means (clustered). σM large: sample means differ (scattered).
- How well a sample represents the entire distribution.
- Be able to compute it, understand what it measures, and explain how it changes:
A larger sample size reduces the standard error, making estimates more reliable.
Larger population variability increases the standard error.
A smaller standard error means the sample means are closer to the population mean, indicating more precise estimates.
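A short sketch of computing the standard error and watching it change with n and σ. The function name standard_error and the values σ = 20 and σ = 40 are made up for illustration.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation of 20: larger n, smaller error
for n in (4, 25, 100):
    print(n, standard_error(20, n))  # 10.0, then 4.0, then 2.0

# Doubling the population variability doubles the standard error
print(standard_error(40, 100))  # 4.0
```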
Null Hypothesis (H0):
z score
-States that there is no effect or no difference between groups.
-It assumes any observed differences are due to chance.
-Two-tailed example: H0: μ1 = μ2 (no difference in means).
-One-tailed: H0: μ ≤ μ1 if the question is whether it improves, or H0: μ ≥ μ1 if the question is whether it worsens.
-If the z-score falls in the body of the distribution, we fail to reject the null hypothesis.
Alternative Hypothesis (H1): z score
-States that there is an effect or a difference.
-Represents what the researcher is trying to show.
-Two-tailed example: H1: μ1 ≠ μ2 (means are different).
-One-tailed: H1: μ < μ1 for worsens, or H1: μ > μ1 for improves.
-If the z-score lands in the tail (the critical region), we reject the null hypothesis.
One-Tailed Test:
-Tests for a directional effect (e.g., “greater than” or “less than”).
Example:
-For improves: H0: μ ≤ 50 and H1: μ > 50 (tests if the mean is significantly greater).
-For worsens: H0: μ ≥ 50 and H1: μ < 50 (tests if the mean is significantly lower).
-The critical region is in one tail of the distribution: the left tail when testing whether it worsens, the right tail when testing whether it improves.
Two-Tailed Test:
-Tests whether the mean differs in either direction.
-Split alpha in half and put one half in each tail. The critical z-score is looked up for the split value (α/2), not for the full alpha.
Example:
-H0: μ = 50
-H1: μ ≠ 50 (tests if the mean is significantly different in either direction)
-Critical regions are in both tails of the distribution. If the test statistic lands in either tail, we reject H0 in favor of the alternative hypothesis.
Critical Regions
The critical region is the area in the tails of the distribution where the null hypothesis is rejected.
Critical Regions of a One-Tailed Test
-The critical region is in one tail (either left or right).
-Left if the question asks whether it worsens or is less than; right if it asks whether it improves or is greater than.
-Alpha is not split: the full 0.05 (or 0.01) goes in the one tail. The critical z-value is negative if the region is on the left and positive if it is on the right (±1.645 for α = 0.05, ±2.33 for α = 0.01).
Critical Regions of a Two-Tailed Test
-The critical region is at both ends, meaning two tails.
-Used when the question asks whether it differs.
-Alpha is split in half: 0.05 becomes 0.025 in each tail and 0.01 becomes 0.005 in each tail, with a negative critical z-value on the left and a positive one on the right (±1.96 for α = 0.05, ±2.58 for α = 0.01).
-The critical value is based on the significance level (α), commonly set at 0.05 or 0.01.
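Instead of reading a z-table, the critical values can be looked up with Python's statistics.NormalDist. This is a sketch only; the helper name critical_z is made up for illustration.

```python
from statistics import NormalDist

def critical_z(alpha: float, tails: int) -> float:
    """Positive critical z-value for a one- or two-tailed test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    tail_area = alpha / tails  # a two-tailed test splits alpha in half
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    one = round(critical_z(alpha, 1), 3)
    two = round(critical_z(alpha, 2), 3)
    print(f"alpha={alpha}: one-tailed ±{one}, two-tailed ±{two}")
# alpha=0.05: one-tailed ±1.645, two-tailed ±1.96
# alpha=0.01: one-tailed ±2.326, two-tailed ±2.576
```

For a left-tailed test, use the negative of the returned value.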
Steps of Hypothesis Testing
1. State the Hypotheses
Formulate both the null (H0) and alternative (H1) hypotheses.
2. Set the Critical Region
Choose a significance level (α) and identify the critical region using a z-table or t-table.
3. Compute the Test Statistic
First find the standard error, σM = σ/√n, then find the test statistic, z = (M − μ)/σM.
4. Make a Decision
Reject the null (H0) if the test statistic falls in the critical region (the tail).
Fail to reject the null (H0) if the test statistic does not fall in the critical region (it falls in the body).
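The four steps above can be sketched as one function. The function name z_test, its tail labels, and the numbers in the example (μ = 50, σ = 10, n = 25, M = 54) are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def z_test(M, mu, sigma, n, alpha=0.05, tail="two"):
    """Return (z, reject_h0) for a one-sample z-test.

    tail: "two" (differs), "right" (improves), or "left" (worsens).
    """
    sigma_M = sigma / math.sqrt(n)   # step 3: standard error
    z = (M - mu) / sigma_M           # step 3: test statistic
    std_normal = NormalDist()
    # Steps 2 and 4: locate the critical region and make the decision
    if tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "left":
        reject = z < std_normal.inv_cdf(alpha)
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return z, reject

# Hypothetical example: H0: mu = 50, H1: mu != 50
z, reject = z_test(M=54, mu=50, sigma=10, n=25, alpha=0.05, tail="two")
print(z, reject)  # 2.0 True (2.0 is beyond 1.96, so it lands in the tail)
```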
Type I Error (False Positive):
-Occurs when you reject the null hypothesis when it is actually true, meaning the correct decision would have been to fail to reject it.
-Probability: Represented by α (alpha), typically set at 0.05 or 0.01.
-Consequence: Concluding that an effect exists when it does not.
Examples and Consequences:
Medical Research: Approving a drug that is ineffective or harmful.
Legal System: Convicting an innocent person.
Scientific Studies: Reporting a false breakthrough, leading to wasted resources.
Type II Error (False Negative):
-Occurs when you fail to reject the null hypothesis when it is actually false, meaning the correct decision would have been to reject it.
-Probability: Represented by β (beta).
-Consequence: Missing a real effect or relationship.
Examples and Consequences:
Medical Research: Failing to approve a beneficial drug that could save lives.
Legal System: Letting a guilty person go free.
Scientific Studies: Missing a valid discovery, slowing down progress in research.
What are Alpha & Beta?
Alpha (α):
-The probability of making a Type I error.
-Typically set at 0.05 (5%) or 0.01 (1%).
-A Type I error is more likely when alpha is set higher (α = 0.05 instead of 0.01).
Beta (β):
-The probability of making a Type II error.
-A lower β means greater statistical power (1 - β), making it easier to detect true effects.
-A Type II error is more likely when alpha is set lower (α = 0.01 instead of 0.05).
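The trade-off between α and β can be seen with a small calculation for a right-tailed z-test. This is a sketch under stated assumptions: the function name beta_right_tailed and the scenario (H0 mean of 50, a true mean of 54, σ = 10, n = 25) are invented for illustration.

```python
import math
from statistics import NormalDist

def beta_right_tailed(mu0, mu_true, sigma, n, alpha):
    """Probability of a Type II error for a right-tailed z-test."""
    sigma_M = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Sample mean that marks the edge of the critical region
    M_crit = mu0 + z_crit * sigma_M
    # Chance that M falls short of that edge when the true mean is mu_true
    return NormalDist(mu_true, sigma_M).cdf(M_crit)

for alpha in (0.05, 0.01):
    beta = beta_right_tailed(mu0=50, mu_true=54, sigma=10, n=25, alpha=alpha)
    print(f"alpha={alpha}: beta={beta:.3f}, power={1 - beta:.3f}")
```

In this made-up scenario, lowering α from 0.05 to 0.01 raises β and lowers power (1 - β).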
Null Hypothesis (H0):
t-test
There is no difference between the sample mean and the population mean.
-Two tails
H0: μ = 50
-One tail
H0: μ ≤ 50 (right-tailed)
H0: μ ≥ 50 (left-tailed)
Alternative Hypothesis (H1):
t-test
The sample mean is different from the population mean.
-For two-tailed tests: H1:μ≠50
-For one-tailed tests:
H1:μ>50(Right-tailed)
H1:μ<50(Left-tailed)
Sample Standard Error
t-test
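The standard error measures how much the sample mean is expected to differ from the population mean. The sample (estimated) standard error is used when the population standard deviation (σ) is unknown.
It is written sM instead of σM (for example, sM = 6 rather than σM = 6) and is computed from the sample standard deviation: sM = s/√n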
t Statistic
Be able to use the t formula.
-Definition: The t statistic compares the difference between the sample mean and the population mean relative to the standard error.
-Formula:
t = (M − μ)/sM
Where:
M = sample mean
μ = population mean
sM = estimated standard error (s/√n)
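A minimal sketch of the t formula applied to raw scores. The function name t_statistic and the four scores are made up for illustration; statistics.stdev gives the sample standard deviation (it divides by n - 1).

```python
import math
import statistics

def t_statistic(sample, mu):
    """One-sample t: (M - mu) / sM, with sM = s / sqrt(n)."""
    n = len(sample)
    M = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample SD, uses n - 1
    sM = s / math.sqrt(n)         # estimated standard error
    return (M - mu) / sM

# Hypothetical sample of n = 4 scores, tested against mu = 50
scores = [50, 54, 54, 54]         # M = 53, s = 2, sM = 1
print(t_statistic(scores, mu=50)) # 3.0
```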
Why Use Sample Variance?
t-test
-When the population standard deviation (σ) is unknown, we estimate it using the sample standard deviation (s).
-The sample variance (s²) provides an unbiased estimate of the population variance, which is what allows the test to be used with smaller sample sizes.
What Affects the t Statistic?
-Sample Mean: A larger difference between the sample mean and the population mean increases the t value.
-Sample Size: Larger sample sizes reduce the standard error, increasing the t value. Smaller sample sizes increase the standard error, decreasing the t value.
-Sample Variability: Higher variability (larger s) increases the standard error, reducing the t value. Lower variability decreases the standard error, increasing the t value.
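The effects listed above can be checked by holding the mean difference fixed and changing n or s. The helper t_from_summary and the numbers (M = 53, μ = 50, s = 6 or 12) are assumptions chosen to give round results.

```python
import math

def t_from_summary(M, mu, s, n):
    """t statistic from summary values: (M - mu) / (s / sqrt(n))."""
    return (M - mu) / (s / math.sqrt(n))

# Same mean difference (3 points) and same s: larger n, larger t
for n in (4, 16, 36):
    print(n, t_from_summary(M=53, mu=50, s=6, n=n))  # 1.0, then 2.0, then 3.0

# Same n, but twice the variability: t is cut in half
print(t_from_summary(M=53, mu=50, s=12, n=16))  # 1.0
```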
Degrees of Freedom
t-test
df = n − 1 (where n = sample size)
One df is lost because the sample mean is used in calculating s² (which divides by n − 1).
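A small sketch showing where n - 1 enters the sample variance. The scores are invented for illustration, and SS is used here as a name for the sum of squared deviations from the sample mean.

```python
import statistics

scores = [50, 54, 54, 54]               # hypothetical sample
n = len(scores)
M = sum(scores) / n                     # sample mean
SS = sum((x - M) ** 2 for x in scores)  # sum of squared deviations from M
df = n - 1                              # one df lost by using M
s2 = SS / df                            # sample variance

print(df, s2)                           # 3 4.0
print(statistics.variance(scores))      # same value from the standard library
```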