22 Sample Size Flashcards
Six Sigma: A Complete Step-by-Step Guide (18 cards)
What’s a
Type I error risk (alpha)
also called & why?
“Producer risk”
* If H0 is true but rejected due to a Type I error, then material that is within specification is rejected
* This causes waste, extra cost, and lower employee morale
What’s a
Type II error risk (Beta β)
also called & why?
“Consumer risk”
* H0 is accepted although it is false (Type II error), so material outside of specification is sent to the consumer
* This causes return costs, dissatisfied customers, and a poor brand reputation
What info is required for
choosing sample size?
- Alpha: typically 0.05
- Beta (β): The beta level can be set by the experimenter and a sample size calculated from that number. If the sample size is fixed, then the experimenter usually sets alpha and calculates the beta risk from the sample size
- Delta: The practical difference the experimenter wants to detect using the test
- Standard deviation: The estimated population standard deviation
- Type of Data: Discrete or continuous
- Type of test: Depends on type of data
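As a rough illustration of how these inputs combine, the sketch below computes a sample size for one specific case: a two-sided, one-sample z-test on a mean with continuous data. The normal-approximation formula and the function name `sample_size_mean` are assumptions made for this example; other data types and tests need different formulas.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(alpha, beta, delta, sigma):
    """Approximate n for a two-sided one-sample z-test on a mean.

    Hypothetical helper: assumes normally distributed data and a known
    (or well-estimated) population standard deviation sigma.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the Type I risk
    z_beta = z.inv_cdf(1 - beta)          # critical value for the Type II risk
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


# Example inputs: alpha = 0.05, beta = 0.10, delta = 0.5 sigma
print(sample_size_mean(alpha=0.05, beta=0.10, delta=0.5, sigma=1.0))  # 43
```

Rounding up with `ceil` keeps the achieved beta risk at or below the requested value.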
What is
Alpha (α)?
- The level of significance
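- Commonly α is set to .05 / 5%, but it can take any value between 0 and 1 (.01 and .10 are also common choices)
- The accepted probability that a significant result is due to chance alone when H0 is true
- Represents the chance of making a Type I Error (rejecting H0 while true)
- The smaller α is, the more “unusual” a result must be before H0 is rejected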
What is the
power of a test?
- The probability of avoiding a Type II error, i.e., of correctly rejecting H0 when it is false
- Power = 1−β
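To make Power = 1−β concrete, here is a minimal sketch that computes the power of a two-sided, one-sample z-test for a fixed sample size. The choice of test, the known-sigma assumption, and the function name `power_mean` are illustrative assumptions, not part of the card.

```python
from math import sqrt
from statistics import NormalDist


def power_mean(n, alpha, delta, sigma):
    """Power of a two-sided one-sample z-test (hypothetical helper).

    delta is the true shift in the mean; sigma is assumed known.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma       # true shift in standard-error units
    # Probability that the test statistic lands in either rejection region
    return z.cdf(shift - z_alpha) + z.cdf(-shift - z_alpha)


power = power_mean(n=43, alpha=0.05, delta=0.5, sigma=1.0)
beta = 1 - power                          # Type II error risk
print(round(power, 3), round(beta, 3))
```

When delta is 0 (H0 is true), the same expression returns α, which is the Type I error risk.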
What needs to be considered
when selecting Alpha (α)?
What costs are associated with an unnecessary change, if the team makes a Type I mistake?
* Costs when rejecting materials that fit specifications
What needs to be considered
when selecting beta?
What costs are associated with a Type II error, if the team doesn’t reject H0?
* What is the potential damage or cost if defective materials are passed to the customer?
* Do costs for lost time or resources occur?
What needs to be considered
when selecting Delta
(aka critical difference)?
What margin of error is tolerable?
* How small must a difference be to be insignificant to the customer?
* What is the smallest delta that provides all benefits without being so small that it is unfeasible? (The smaller the delta, the larger the sample has to be, which comes at a cost.)
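The cost of a small delta can be seen in the usual normal-approximation formula for a two-sided, one-sample test of a mean. This formula is an assumption brought in for illustration; the cards do not say which formula they rely on.

```latex
n \approx \left( \frac{\left( z_{1-\alpha/2} + z_{1-\beta} \right) \sigma}{\delta} \right)^{2}
```

Because delta is squared in the denominator, halving delta roughly quadruples the required sample size.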
Guidelines for testing
means for continuous data
Alpha (α) 0.05
Beta (β) 0.10 or 0.20
Delta as absolute value or as a function of std dev (.5σ , 1σ , 2σ)
Guidelines for testing
variance for continuous data
Alpha (α) 0.05
Beta (β) 0.10 or 0.20
Delta >1 or as a function of std dev (.5σ , 1σ , 2σ)
Guidelines for testing
proportions for discrete / binomial data
Alpha (α) 0.05
Beta (β) 0.10 or 0.20
Delta chosen logically (a practically meaningful change in the proportion) or as a function of std dev (.5σ , 1σ , 2σ)
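For proportions, a rough sketch under the normal approximation to the binomial is shown below. It sizes a two-sided, one-sample test that should detect a change from a baseline proportion `p0` to `p1`, so delta is `p1 - p0`. The formula choice and the name `sample_size_proportion` are assumptions for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_proportion(p0, p1, alpha=0.05, beta=0.10):
    """Approximate n to detect a shift from p0 to p1 (hypothetical helper).

    Uses the normal approximation, so it is unreliable when n * p is small.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(1 - beta)
    numerator = (z_alpha * sqrt(p0 * (1 - p0))
                 + z_beta * sqrt(p1 * (1 - p1)))
    return ceil((numerator / (p1 - p0)) ** 2)


# Example: baseline defect rate of 10%, delta of 5 percentage points
print(sample_size_proportion(p0=0.10, p1=0.15))
```

Discrete data usually needs far larger samples than continuous data for a comparable delta, which is one reason the type of data is a required input.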
What’s an aspect of the
sample variance test (like the F-test)
sample size calculation?
- Comparing variance from two independent samples / sets of data
- For sample sizing it’s important how much variance is expected in each group
What’s an aspect of the
Design of Experiment (DOE)
sample size calculation?
- It’s about the effects of multiple factors / independent x variables on a response variable
- For sample sizing it’s important how many factors are tested and how many levels each factor has
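One way to see why factors and levels matter: in a full factorial design (an assumption here; fractional designs need fewer runs), the number of runs is the product of the levels of all factors, times the number of replicates. The helper name `full_factorial_runs` is hypothetical.

```python
from itertools import product
from math import prod


def full_factorial_runs(levels_per_factor, replicates=1):
    """Number of runs in a full factorial design (hypothetical helper)."""
    return prod(levels_per_factor) * replicates


levels = [2, 2, 2]                        # three factors, two levels each
# Enumerate every combination of factor levels (one tuple per run)
design = list(product(*[range(k) for k in levels]))

print(len(design))                                # 8 distinct combinations
print(full_factorial_runs(levels, replicates=2))  # 16 runs in total
```

Adding one more two-level factor doubles the run count, so the number of factors drives the sample size quickly.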
What assists in setting appropriate
Alpha (α) and Beta (β) values?
- Real-world understanding
- The costs, risks, and implications of errors guide how strict or lenient you should be with α and β.
What concept is key in selecting sample sizes
for various hypothesis tests?
Hypothesis testing error(s)
What’s the
Confidence Interval (CI)
also known as?
- Often loosely equated with the margin of error (ME); strictly, CI = point estimate ± ME
- The CI provides a range of values within which the true population parameter is likely to fall.
- The margin of error defines how wide that range is, based on the sample data and confidence level (e.g., 95%).
What’s the
Confidence Level?
- Think: How sure am I?
- It’s a percentage that expresses how confident I am that the confidence interval contains the true population parameter.
- Common values: 90%, 95%, 99% sure
- It reflects the reliability of the estimation process.
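The relationship between confidence level, margin of error, and confidence interval can be sketched for the simplest case: a mean with a known population standard deviation, using a z-based interval. The data values, the sigma of 0.2, and the function name `z_confidence_interval` are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_confidence_interval(data, sigma, confidence=0.95):
    """Return (lower, upper, margin_of_error) for the mean.

    Hypothetical helper: assumes sigma is known; with an estimated sigma
    and a small sample, a t-based interval would be more appropriate.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(len(data))  # margin of error
    center = mean(data)                   # point estimate
    return center - margin, center + margin, margin


sample = [10.1, 9.8, 10.3, 10.0]          # made-up measurements
low, high, me = z_confidence_interval(sample, sigma=0.2, confidence=0.95)
print(round(low, 3), round(high, 3), round(me, 3))
```

Raising the confidence level to 99% widens the interval, while a larger sample narrows it.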