Stats 216 Flashcards
What is “Bootstrapping”?
Bootstrapping is a method for answering the question:
“How can we estimate sampling variability if we only have one sample?”
Suppose we are answering the question “What proportion of breakups occur on a Monday?” but we only have one sample of 50 breakups as reported on Facebook. In this sample, 26% of the breakups (13 out of 50) happened on a Monday. Since this is just one sample of 50, the true population proportion is very unlikely to be exactly 26%; it should be close to 26%, but we don’t know how far off our estimate might be. We can estimate that sampling variability using the bootstrap method.
We treat our sample of 50 breakups (13 on a Monday) as a stand-in for the population and select breakups from it at random, one at a time, to build a new sample of 50. Each selection is made with replacement (the same breakup can be drawn more than once), which allows us to create as many bootstrap samples as we want. In other words, we randomly re-sample with replacement to create new samples, repeating as many times as desired.
https://umontana.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=29c86e39-c1f7-4fde-bb50-adbb014fd399
https://www.youtube.com/watch?v=4ZLHFSzCmhg
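The resampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration using the 13-out-of-50 Monday-breakup sample; the choice of 10,000 resamples and the random seed are arbitrary.

```python
import random

random.seed(1)

sample = [1] * 13 + [0] * 37   # 1 = breakup on a Monday, 0 = any other day
n_resamples = 10_000

boot_props = []
for _ in range(n_resamples):
    # Draw 50 breakups WITH replacement from the original sample.
    resample = random.choices(sample, k=len(sample))
    boot_props.append(sum(resample) / len(resample))

# The bootstrap proportions cluster around the observed 26%.
mean_prop = sum(boot_props) / len(boot_props)
print(round(mean_prop, 2))
```

The spread of `boot_props` (not its mean) is what the bootstrap is after: it approximates how much the Monday proportion would vary from one sample of 50 to the next.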
What are “marginal” and “conditional” distributions?
Marginal and conditional distributions can both be found from the same two-way table.
Marginal distributions are the totals of the probabilities for one variable alone. They are found in the margins of the table (that’s why they are called “marginal”).
Conditional distributions give the probabilities for one variable given a fixed value of the other. They come from a single row or column of the table, renormalized by that row’s or column’s marginal total.
For example, in a two-way table of probabilities for rolling two dice, the row and column totals in the margins are the marginal distributions.
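The dice example can be made concrete with a short sketch. It assumes the two-way table of two fair dice, where each of the 36 `(die1, die2)` cells has probability 1/36; the row/column choices are illustrative.

```python
from fractions import Fraction

# Two-way table: each (die1, die2) cell has probability 1/36.
table = {(i, j): Fraction(1, 36) for i in range(1, 7) for j in range(1, 7)}

# Marginal distribution of die 1: sum each row across all die-2 values.
marginal_die1 = {i: sum(table[(i, j)] for j in range(1, 7)) for i in range(1, 7)}

# Conditional distribution of die 2 given die 1 == 3:
# take the die1 == 3 row and renormalize by its marginal total.
cond_die2_given_3 = {j: table[(3, j)] / marginal_die1[3] for j in range(1, 7)}

print(marginal_die1[1])      # 1/6
print(cond_die2_given_3[5])  # 1/6 (the two dice are independent)
```

Because the dice are independent, the conditional distribution equals the marginal one here; in a table with dependent variables the two would differ.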
What are the 6 principles regarding p-values from the American Statistical Association?
Principle 1: P-values can indicate how incompatible the data are with a specified statistical model.
Principle 2: P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
Principle 3: Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
Principle 4: Proper inference requires full reporting and transparency.
Principle 5: A p-value does not measure the size of an effect or the importance of a result.
Principle 6: By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
Write a null hypothesis for “Are teens better at math than adults?”
Being a teenager or adult has no effect on mathematical ability.
Write a null hypothesis for “Does taking aspirin every day reduce the chance of having a heart attack?”
Taking aspirin daily does not affect heart attack risk.
Write a null hypothesis for “Do teens use cell phones to access the internet more than adults?”
Being a teenager or an adult has no effect on how often cell phones are used to access the internet.
Write a null hypothesis for “Does the color of cat food affect a cat’s choice of which food to eat?”
Cats express no food preference based on color.
Write a null hypothesis for “Does chewing willow bark relieve pain?”
Chewing willow bark has no effect on pain relief.
There is no difference in pain relief after chewing willow bark versus taking a placebo.
How do you know if you have run enough simulations using a Monte Carlo simulation?
The confidence interval is calculated from the sample’s size and standard deviation and the chosen confidence level (typically 90%, 95%, or 99%).
Running even more samples will narrow the confidence interval.
Too few samples and you get inaccurate outputs and graphs (particularly histograms) that look “scruffy”;
too many samples and the simulation takes a long time to run, and it may take even longer to plot graphs, export, and analyze the data afterwards.
If the output of greatest interest is graphical, you will need a plot that would not change to any meaningful degree by running more samples (i.e. it is stable).
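The “narrowing confidence interval” check above can be sketched as follows. This is a minimal illustration using a Monte Carlo estimate of the probability that two dice sum to 7 (a hypothetical target whose true value, 6/36 ≈ 0.1667, is known); the margin of error uses roughly two standard errors, matching the margin-of-error formula elsewhere in these cards.

```python
import math
import random

random.seed(1)

def mc_estimate(n_samples):
    """Estimate P(two dice sum to 7) and a ~95% margin of error."""
    hits = sum((random.randint(1, 6) + random.randint(1, 6)) == 7
               for _ in range(n_samples))
    p_hat = hits / n_samples
    se = math.sqrt(p_hat * (1 - p_hat) / n_samples)  # SD of the sampling dist.
    return p_hat, 2 * se                              # estimate, margin of error

for n in (100, 10_000, 100_000):
    p_hat, moe = mc_estimate(n)
    print(f"n={n:>7}: {p_hat:.4f} +/- {moe:.4f}")

# The margin of error shrinks roughly as 1/sqrt(n); stop adding samples
# once the interval (and any plot of the outputs) is stable enough.
```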
What is a “Bernoulli Trial”?
A Bernoulli trial is a random experiment with exactly two possible outcomes (“success” and “failure”) in which the probability of success is the same every time the trial is run. A single coin flip, with “heads” counted as success, is a Bernoulli trial.
What is a “Binomial Random Variable”?
A binomial random variable counts the number of successes in a fixed number n of independent Bernoulli trials, each with the same success probability p. For example, the number of heads in 10 coin flips.
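These two ideas can be tied together in a short sketch: each simulated coin flip is a Bernoulli trial (two outcomes, fixed success probability), and the count of heads in 10 flips is a binomial random variable. The function names and the 0.5 probability are illustrative choices.

```python
import random

random.seed(1)

def bernoulli_trial(p=0.5):
    """One trial: 1 ("success") with probability p, else 0 ("failure")."""
    return 1 if random.random() < p else 0

def binomial(n=10, p=0.5):
    """Number of successes in n independent Bernoulli trials."""
    return sum(bernoulli_trial(p) for _ in range(n))

heads = binomial(10, 0.5)
print(heads)  # an integer between 0 and 10
```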
What is a “parameter”? What is a “statistic”?
A “parameter” is a characteristic of a population.
A “statistic” is any measurement we make from a sample.
What is a “population”? What is a “sample”?
A “population” is the entire group we want to draw conclusions about. A “sample” is the subset of the population that we actually observe or measure.
Why and when do we use bootstrapping?
We use bootstrapping to estimate sampling variability. We need to do this when we only have one sample.
For example, suppose we only had one sample of 50 breakups reported on Facebook and wanted to generalize to all of the breakups reported on Facebook. We could use bootstrapping to determine the variability (get a measure of reliability) of our estimate based on that single sample of 50 breakups.
Suppose you made an estimate about a population parameter using a statistic from a sample of the population. How would you quantify the reliability of your statistic?
Compute a margin of error for the statistic (for example, using the standard deviation of a bootstrapped sampling distribution) and report a compatibility interval around the estimate.
What does a margin of error quantify?
A margin of error quantifies the uncertainty in the estimate. It tells us the amount of “give or take” around the sample estimate that is reasonable.
Margin of Error = 2 × SD of the sampling distribution
What is a “compatibility interval”?
A compatibility interval is the range of values within which we can be roughly 95% confident that the population parameter falls.
Compatibility Interval = Sample Estimate ± Margin of Error
= Sample Estimate ± 2 × SD of the sampling distribution
Note: While statisticians and polling organizations tend to use two SDs to compute the margin of error, this is a somewhat arbitrary choice. Some researchers choose one or three SDs of the sampling distribution.
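Putting the two formulas together, a compatibility interval can be computed from a bootstrapped sampling distribution. This is a minimal sketch reusing the 13-out-of-50 Monday-breakup sample from the earlier card; the 10,000 resamples and the factor of 2 SDs follow the (somewhat arbitrary) conventions noted above.

```python
import random
import statistics

random.seed(1)

sample = [1] * 13 + [0] * 37   # 1 = breakup on a Monday

# Bootstrap the sampling distribution of the Monday proportion.
boot_props = [
    sum(random.choices(sample, k=50)) / 50
    for _ in range(10_000)
]

estimate = 13 / 50                                   # sample estimate
margin_of_error = 2 * statistics.stdev(boot_props)   # 2 x SD of sampling dist.
interval = (estimate - margin_of_error, estimate + margin_of_error)

print(f"{estimate:.2f} +/- {margin_of_error:.2f}")
print(f"compatibility interval: ({interval[0]:.2f}, {interval[1]:.2f})")
```

Swapping the 2 for 1 or 3 in `margin_of_error` gives the narrower or wider intervals some researchers prefer.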
In any statistical estimate, we are concerned with two things:
The estimate and the uncertainty.
To evaluate the external validity evidence, we need to consider…
… representativeness and uncertainty.
Representativeness. Is the sample representative of the population?
Uncertainty: Did the researchers account for uncertainty in the estimate?
In the case of the average global temperatures lesson:
Yes. This study used a random sample of points on the Earth. This is an unbiased sampling method, which means that the sample is a representative sample.
Yes. In this case the uncertainty comes from sampling variability. We accounted for this uncertainty with the margin of error and compatibility interval.
Notes:
When evaluating external validity, make sure that your response attends to both representativeness and uncertainty.
For representativeness, you should focus on whether the sampling method is biased. This study uses random sampling, which is an unbiased method.
For uncertainty, you should consider sampling variability. In this case we estimated sampling variability using the bootstrap model and we accounted for the uncertainty from sampling variability with our margin of error and compatibility interval.
Also, please be sure that your evaluation of external validity does not attend to extraneous factors other than representativeness and uncertainty.
Write Null Hypothesis
Are teens better at math than adults?
Age has no effect on mathematical ability.
Null Hypothesis Practice
Do teens use cellphones more to access the internet than adults?
Being a teenager or an adult does not affect how often cell phones are used to access the internet.
To attribute a causal relationship, there are three criteria a researcher needs to establish:
- Association of the Cause and Effect: There needs to be an association between the cause and effect. (We do this via Hypothesis Testing)
- Timing: The cause needs to happen BEFORE the effect.
- No Plausible Alternative Explanations: ALL other possible explanations for the effect need to be ruled out. Random assignment removes any systematic differences between the groups (other than the treatment), and thus helps to rule out plausible alternative explanations.
What are “sampling” and “assignment”?
Sampling refers to how participants were selected from the population.
Assignment refers to how the selected participants (those in the sample) are assigned to comparison groups, for example a treatment group or a control group.