8 The replication crisis and the open science movement (flashcards)
Where did the idea of a replication crisis come from?
Large-scale replication studies found that:
- the mean effect size (r) of the replication effects (M_r = 0.197, SD = 0.257) was half the magnitude of the mean effect size of the original effects (M_r = 0.403, SD = 0.188), representing a substantial decline
- ninety-seven percent of the original studies had significant results (p < .05), but only thirty-six percent of the replications did
Why do we have a replication crisis?
problematic practices: selective reporting, selective analysis, insufficient specification of the conditions necessary or sufficient to obtain the results
publication bias, …
⇒ understanding is achieved through multiple, diverse investigations
a replication by itself only provides evidence for the reliability of a result
alternative explanations, … can account for diminished reproducibility
⇒ cultural practices in scientific communication
low-power research designs
publication bias
⇒ Reproducibility is not well understood because the incentives for individual scientists prioritize novelty over replication
What is predictive of replication success?
the strength of the initial evidence, rather than characteristics of the team conducting the research
What is “evaluating replication effect against null hypothesis of no effect”?
does the replication show a statistically significant effect in the same direction as the original study?
treating the 0.05 threshold as a bright-line criterion between replication success and failure is a key weakness of this method
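A minimal Python sketch of this criterion; all numbers (r_orig, r_rep, n_rep) are made-up illustration values, not data from any actual study:

```python
# Hypothetical numbers only: did the replication reach p < .05 in the same
# direction as the original effect?
import numpy as np
from scipy import stats

r_orig = 0.40            # original correlation
r_rep, n_rep = 0.15, 80  # replication correlation and sample size

# t-test of the replication correlation against r = 0
t = r_rep * np.sqrt((n_rep - 2) / (1 - r_rep**2))
p = 2 * stats.t.sf(abs(t), df=n_rep - 2)

same_direction = np.sign(r_rep) == np.sign(r_orig)
print("replicated by this criterion:", (p < 0.05) and same_direction)
```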
What is done if you evaluate the replication effect against the original effect size?
is the original effect size within the 95% CI of the effect size estimate from the replication?
-> precision of effect, not only direction
-> size, not only direction
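A short sketch of this check with hypothetical numbers, using the Fisher z transform to build a 95% CI around the replication correlation:

```python
# Hypothetical values: is the original r inside the replication's 95% CI?
import numpy as np

r_orig = 0.40
r_rep, n_rep = 0.15, 80

z = np.arctanh(r_rep)            # Fisher z of the replication estimate
se = 1 / np.sqrt(n_rep - 3)      # standard error of z
lo, hi = np.tanh(z - 1.96 * se), np.tanh(z + 1.96 * se)

print(f"replication 95% CI: [{lo:.2f}, {hi:.2f}]")
print("original effect inside CI:", lo <= r_orig <= hi)
```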
What is done if you compare original and replication effect sizes for cumulative evidence?
descriptive comparison of effect sizes - does not provide info about the precision of either estimate or resolution of the cumulative evidence for the effect
→ solution: computing a meta-analytic estimate that combines the original and replication results
One qualification about this result is the possibility that the original studies have inflated effect sizes due to publication, selection, reporting, or other biases
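A sketch of a simple fixed-effect meta-analytic estimate combining an original and a replication correlation (inverse-variance weighting on the Fisher z scale; all numbers are illustrative):

```python
# Combine two correlations into one cumulative estimate (hypothetical values).
import numpy as np

effects = [(0.40, 50), (0.15, 80)]            # (r, n) for original and replication

zs = np.arctanh([r for r, _ in effects])      # Fisher z transform of each r
ws = np.array([n - 3 for _, n in effects])    # inverse-variance weights, var(z) = 1/(n - 3)

z_meta = np.sum(ws * zs) / np.sum(ws)
se_meta = 1 / np.sqrt(np.sum(ws))
lo, hi = np.tanh(z_meta - 1.96 * se_meta), np.tanh(z_meta + 1.96 * se_meta)

print(f"meta-analytic r = {np.tanh(z_meta):.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```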
Is replication the real problem?
meta-analyses show that most findings are replicated
The real problem is not a lack of replication; it is the distortion of our research literatures caused by publication bias and questionable research practices.
What do the researchers argue for? What is the real problem in psychological research?
(a) studies in most areas are replicated;
(b) failure to replicate a study is usually not evidence against the initial study’s conclusions;
(c) an initial study with a nonsignificant finding requires replication;
(d) a single study can never answer a scientific question;
(e) the widely used sequential study research program model does not work;
(f) randomization does not work when sample sizes are small.
What different types of replication exist?
(a) literal replication—the same researcher conducts a new study in exactly the same way as in the original study;
(b) operational replication—a different researcher attempts to duplicate the original study using exactly the same procedures (also called direct replication); and
(c) systematic replication—a different researcher conducts a study in which many features of the original study are maintained but some aspects (e.g., type of subjects or measures used) are changed (also called conceptual replication)
What are common errors in thinking about replication?
- believing that a replication should be interpreted in a stand-alone manner
→ this ignores statistical power
average statistical power in psychological literatures ranges from .40 to .50
(the likelihood that a test will detect an effect of a certain size if there is one)
Note that if confidence intervals (CIs) were used instead of significance tests, there would be far fewer "failures to replicate", because the CIs would often overlap, indicating no conflict between the two studies (see the sketch after this card). Research in meta-analysis has shown that no single study can answer any question.
sampling error = the difference between an estimate of a population parameter and the actual value of the population parameter that the sample is intended to estimate; other sources of error include measurement error, range variation, imperfect construct validity of measures, artificial dichotomization of continuous measures, and others
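A tiny sketch of the CI point above: two hypothetical studies of the same mean difference, one "significant" and one not, whose 95% CIs nonetheless overlap (normal approximation, made-up numbers):

```python
import numpy as np

def mean_diff_ci(diff, sd, n_per_group):
    """Approximate 95% CI for a two-group mean difference (normal approximation)."""
    se = sd * np.sqrt(2 / n_per_group)
    return diff - 1.96 * se, diff + 1.96 * se

print("original:   ", mean_diff_ci(0.50, sd=1.0, n_per_group=80))  # excludes 0 -> "significant"
print("replication:", mean_diff_ci(0.30, sd=1.0, n_per_group=40))  # includes 0 -> "nonsignificant"
# The two intervals overlap heavily, so the studies do not actually conflict.
```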
What about replicability of non-significant findings?
= a nonsignificant finding is usually interpreted as the absence of a relationship
→ this interpretation is unjustified
→ do nonsignificant findings not need replication?
⇒ should be followed up with additional studies
In fact, given typical levels of statistical power, a relation that shows consistent nonsignificant findings may be real.
Richard (2003) - average effect size in social psychology is d = .40
(based on >300 meta-analyses)
median sample size in psychology is only 40
-> with such power, about half of the studies should report significant and half nonsignificant findings
-> not the pattern we see
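A sketch of why this follows, using statsmodels to compute the power of a two-sample t-test for d = .40 at small sample sizes (the per-group n values are assumptions, since the "40" in the note may refer to total N):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n_per_group in (20, 40, 64):
    power = analysis.power(effect_size=0.40, nobs1=n_per_group, alpha=0.05)
    print(f"n per group = {n_per_group:3d} -> power ~ {power:.2f}")
# With power in the .40-.50 range, roughly half of studies of a real effect
# will come out nonsignificant purely because of sampling error.
```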
What are biases in the published literature?
- research fraud
- publication bias, source bias
- biasing effects of questionable research practices -> QRPs (most severe in laboratory experimental studies)
highest admission rate in social psychology (~40%)
(a) adding subjects one by one until the result is significant, then stopping;
(b) dropping studies or measures that are not significant;
(c) conducting multiple significance tests on a relation and reporting only those that show significance (cherry picking);
(d) deciding whether to include data after looking to see the effect on statistical significance;
(e) hypothesizing after the results are known (HARKing); and
(f) running a lab experiment over until you get the “right” results.
- limitations of random assignment
(claimed superiority of experimental studies)
randomization does not work if samples are not large, which is extremely rare
small randomized sample sizes produce neither equivalent groups nor groups representative of the population of interest
What approach should be taken to detect QRPs?
The frequency of statistical significance in some literatures is suspiciously high given the level of statistical power in the component studies
statistical power has not increased since Cohen first pointed it out in 1962
low power → nonsignificant findings → difficult to publish
→ researchers avoid this consequence by using QRPs
⇒ the result is an upward bias in mean effect sizes and a downward bias in the variability across effect sizes, due to the unavailability of low-effect-size studies
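A small simulation sketch of the bias just described (illustrative parameters only): many low-powered studies of a modest true effect are run, only the significant ones are "published", and the published effects come out inflated and less variable than the full set.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_d, n, n_studies = 0.20, 30, 5000   # illustrative values

all_d, published_d = [], []
for _ in range(n_studies):
    a = rng.normal(true_d, 1, n)        # "treatment" group
    b = rng.normal(0.0, 1, n)           # "control" group
    d = (a.mean() - b.mean()) / np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    all_d.append(d)
    if stats.ttest_ind(a, b).pvalue < 0.05:   # only significant studies get "published"
        published_d.append(d)

print(f"all studies:    mean d = {np.mean(all_d):.2f}, SD = {np.std(all_d):.2f}")
print(f"published only: mean d = {np.mean(published_d):.2f}, SD = {np.std(published_d):.2f}")
```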
What is false-positive psychology research practice?
despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings (≤ .05),
flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates
false positive = incorrect rejection of a true null hypothesis (detecting an effect or difference when there is none)
What are researcher degrees of freedom?
- it is common for researchers to explore various analytic alternatives and report only “what worked”
ambiguity in how to best make a decision
desire to find statistically significant results
→ self-serving justifications
(highly subjective, variable across replications)
- flexibility in choosing among dependent variables
- choosing sample size
- using covariates
- reporting subsets of experimental conditions
What can be said about the influence of this flexibility on false-positive rates?
⇒ flexibility in analyzing two dependent variables (correlated at r = .50) nearly doubles the probability of obtaining a false-positive finding (see the sketch after this card)
⇒ adding 10 more observations when the findings are not yet significant doubles the probability as well
⇒ controlling for gender or for the interaction of gender with treatment produces a false-positive rate of 11.7%
⇒ a combination of all of these practices would lead to a false-positive rate of 61%
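A Monte Carlo sketch of the first result above (not the original authors' code; parameters are illustrative): under a true null, two dependent variables correlated at r = .50 are both tested, and a "finding" is counted if either test reaches p < .05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, sims = 20, 10000
cov = [[1.0, 0.5], [0.5, 1.0]]          # two DVs correlated at r = .50

false_pos = 0
for _ in range(sims):
    a = rng.multivariate_normal([0, 0], cov, n)   # "treatment" group, no true effect
    b = rng.multivariate_normal([0, 0], cov, n)   # "control" group
    p1 = stats.ttest_ind(a[:, 0], b[:, 0]).pvalue
    p2 = stats.ttest_ind(a[:, 1], b[:, 1]).pvalue
    false_pos += (p1 < 0.05) or (p2 < 0.05)

print(f"false-positive rate with two DVs: {false_pos / sims:.3f} (nominal rate: .05)")
```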
What is the main problem of this flexibility?
researchers often decide when to stop data collection on the basis of interim data analysis
effects that are significant in a small sample are not necessarily significant in a larger one
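A sketch of this problem under a true null (batch size and cap are illustrative assumptions): start with 10 observations per group, test, and keep adding 10 more whenever the result is not yet significant, up to 50 per group.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sims, step, max_n = 10000, 10, 50

false_pos = 0
for _ in range(sims):
    a, b = rng.normal(size=step), rng.normal(size=step)   # no true effect
    while True:
        p = stats.ttest_ind(a, b).pvalue
        if p < 0.05 or len(a) >= max_n:
            break
        a = np.concatenate([a, rng.normal(size=step)])    # peek, then collect more
        b = np.concatenate([b, rng.normal(size=step)])
    false_pos += p < 0.05

print(f"false-positive rate with interim peeking: {false_pos / sims:.3f}")
```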
What requirements for authors do the researchers suggest?
- must decide the rule for terminating data collection before it begins
- at least 20 observations per cell
- list all variables collected in a study
- report all experimental conditions, including failed manipulations
- if observations are eliminated, authors must report what the statistical results would be if those observations were included
- if an analysis includes a covariate, report the results with and without the covariate
What guidelines for reviewers do the researchers suggest?
- ensure the authors follow the requirements
- be more tolerant of imperfections in results
- demonstrate that results do not hinge on arbitrary analytic decisions
- require the authors to conduct an exact replication if justifications of data collection or analysis are not compelling
What is the open science movement?
if all elements of an experiment are completely accessible and clearly documented, then this
(1) increases the level to which exact replications can be conducted, and
(2) reduces the likelihood of researchers using questionable practices in their research, for example, it reduces the likelihood of p-hacking.
a collection of several research practices emphasizing openness, transparency, rigor, reproducibility, replicability, and accumulation of knowledge
What is p-hacking?
Data dredging (also known as data snooping or p-hacking) is the misuse of data analysis to find patterns in data that can be presented as statistically significant, thus dramatically increasing the risk of false positives while understating it
e.g. Selective reporting of significant results from a series of hypothesis tests with different dependent variables
What are some guidelines to ensure open science practice?
“good enough practice in scientific computing”
preregistration - arising from the need to promote purely confirmatory research and transparently demarcate exploratory research
→ counters cognitive biases, particularly confirmation and hindsight bias, and the pressure to publish large quantities of predominantly positive results
“almost no psychological research is conducted in a purely confirmatory fashion”
The solution lies in preregistration: researchers committing to the hypotheses, study design, and analyses before the data are accessible. In their paper, Wagenmakers et al. present an exemplary preregistered replication as an illustration of this practice.
making replication mainstream - a finding needs to be repeatable to count as a scientific discovery
teaching open science
What else is important to ensure good research practice in psychology?
correct statistical knowledge and correct reporting of models, hypotheses, and tests
What is the current understanding of a statistical model?
statistical model = a mathematical representation of data variability; a complex web of assumptions
- often contains unrealistic or unjustified assumptions
- defining the scope of a model: it should be a good representation of the observed data and of hypothetical alternative data that might have been observed
- the model is usually presented in highly compressed and abstract form
one assumption in the model is a hypothesis that a particular effect has a specific size and has been targeted for statistical analysis → the study hypothesis
Much statistical teaching and practice has developed a strong (and unhealthy) focus on the idea that the main aim of a study should be to test null hypotheses