Replication Flashcards
(16 cards)
replication
issues to consider, what logic may apply when thinking about type 1 error rate when performing multiple experiments
Karl Popper
“The Logic of Scientific Discovery”
- can never prove something to be true
- construct null hypotheses then falsify them, “at least that can’t be true”
- do not take observations seriously until they have been repeated and tested; they must be intersubjectively testable
- Daryl Bem, “Feeling the Future: Experimental Evidence for Anomalous Retroactive Influences on Cognition and Affect”
evidence for ESP
- the claimed ability to predict the future at better than chance, “retroactive causality”
participants observed two windows on screen and knew an image was coming up in one or the other; half the images were erotic, half were neutral
- participants were able to tell where an erotic picture would appear more often than a neutral one, even though the positions were completely random
- shown over five experiments in the Journal of Personality and Social Psychology, a leading journal
- this was a sign that something was going wrong in the conduct and publication of this research
Embodied Metaphors and Creative “Acts”, Leung et al.
“if we embody creative metaphors we become more creative”
- put box in room, get people to sit in or out of the box (like the metaphor)
- people outside the box performed better on creativity tasks (generating words with similar letters, finding a new use for the same object)
- dubious claim, but again shown over multiple experiments
Attempts to replicate Bem’s study
- an independent research group attempted, unsuccessfully, to replicate Bem’s recall effect
- couldn’t replicate, suggests there was something going on in the lab
Many Labs Project
120 different labs took published research findings and replicated them in their own labs
- 97% were significant in original studies
- 36% were significant in replicated studies, drastic drop
- systematic trend: most replication effect sizes fall below the line expected if they matched the originals
- original effect sizes were estimated to be large, but replications found much smaller effect sizes
- the larger the sample size (n), the more precise the estimate of effect size
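The link between sample size and precision can be made concrete with the standard error of a mean, which shrinks with the square root of n. This is a minimal sketch, assuming a known standard deviation of 1; the function name `ci_half_width` and the sample sizes are illustrative and not taken from the notes.

```python
import math

def ci_half_width(sd: float, n: int, z: float = 1.96) -> float:
    """Half-width of an approximate 95% confidence interval for a mean.

    Assumes the standard deviation is known; z = 1.96 is the usual
    two-sided 95% critical value of the normal distribution.
    """
    return z * sd / math.sqrt(n)

# Each fourfold increase in n halves the width of the interval.
for n in (20, 80, 320):
    print(n, ci_half_width(1.0, n))
```

Because precision grows only with the square root of n, halving the uncertainty around an effect size takes four times as many participants.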
How could experiments replicate poorly?
publication bias, multiple comparisons
publication bias
- journals only publish significant results (p-value <.05)
- it is important for researchers to publish, and to publish well
- this is problematic, but unlikely to be the whole story: 20 experiments for 1 payoff (5%) seems very inefficient
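One way to see why publication bias matters is to simulate many small studies of a modest true effect and "publish" only the significant ones. The sketch below is illustrative only: the true effect of 0.2, the sample size of 20, the known standard deviation of 1 and the function name `simulate_published_effects` are all assumptions, not values from the notes.

```python
import random
import statistics

def simulate_published_effects(true_effect=0.2, n=20, studies=2000, seed=1):
    """Simulate one-sample studies and 'publish' only those with z > 1.96.

    Assumes normally distributed scores with known sd = 1, so a z-test
    applies. All parameter values are illustrative.
    """
    rng = random.Random(seed)
    all_effects, published = [], []
    for _ in range(studies):
        sample = [rng.gauss(true_effect, 1.0) for _ in range(n)]
        mean = statistics.fmean(sample)
        z = mean * n ** 0.5          # z = mean / (sd / sqrt(n)) with sd = 1
        all_effects.append(mean)
        if z > 1.96:                 # significant in the predicted direction
            published.append(mean)
    return statistics.fmean(all_effects), statistics.fmean(published)

overall, published = simulate_published_effects()
print(f"mean effect, all studies:       {overall:.2f}")
print(f"mean effect, published studies: {published:.2f}")
```

Under these assumptions, a study can only reach significance if its observed effect is well above the true effect, so the published average overestimates it. That is consistent with replications finding smaller effect sizes than the originals.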
multiple comparisons
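- type 1 error rate = probability of rejecting H0 when H0 is true
- decision-wise error rate (for a single test) is 5%
- if there is more than one test, the experiment-wise error rate depends on the number of tests
- for k independent tests it is 1 - (1 - 0.05)^k, e.g. 1 - (1 - 0.05)^2 = 0.0975 for two tests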
highly likely you will reject hypotheses erroneously if this isn’t considered
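The experiment-wise error rate can be tabulated for different numbers of tests. This is a small sketch that assumes the tests are independent; the function name `experimentwise_error_rate` is illustrative.

```python
def experimentwise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Probability of at least one false rejection across k independent tests.

    Each test avoids a type 1 error with probability (1 - alpha), so all k
    avoid one with probability (1 - alpha) ** k.
    """
    return 1 - (1 - alpha) ** k

for k in (1, 2, 5, 10, 20):
    print(k, experimentwise_error_rate(k))
```

With two tests the rate is already close to 10%, and with 20 tests it is about 64%, so some erroneous rejection becomes more likely than not.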
Researcher Degrees of Freedom
“hidden multiple comparisons”
- researchers were making more decisions than they reported in the published findings; the type 1 error rate shot up, but readers were left unaware
- performing extra tests or making extra decisions that increase the type 1 error rate without telling readers is called researcher degrees of freedom: decisions left out of the final paper
Simmons et al. (2011)
were able to “show” that people who listened to one of two given songs were made younger than those who listened to the other
- bonkers result, produced by hidden researcher degrees of freedom
- only presented the analysis using father’s age to control for variation (when they had tested a variety of other controls)
- the only successful control is the one they used to present their research
The likelihood of obtaining a false-positive result increases consistently the more researcher degrees of freedom you add to a study, at different p-levels
at p < .05
- risk of rejecting H0 when H0 is true is at least 9-12%
- when several are combined, the chance of rejecting H0 erroneously can be as high as 60%
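The inflation from a single hidden decision can be illustrated with a small simulation. This is not a reproduction of Simmons et al.'s simulations: it is a simplified sketch that uses a z-test with a known standard deviation (to stay within the Python standard library), two outcome measures correlated at 0.5, and 20 participants per group. All of these values, and the function names, are assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean

def z_test_p(a, b, sd=1.0):
    """Two-sided p-value for a difference in means, assuming known sd."""
    n = len(a)  # assumes equal group sizes
    z = (fmean(a) - fmean(b)) / (sd * (2 / n) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def draw_group(rng, n):
    """Return two correlated outcome measures (r = 0.5, sd = 1) for n people."""
    dv1, dv2 = [], []
    for _ in range(n):
        shared = rng.gauss(0, 0.5 ** 0.5)       # component common to both
        dv1.append(shared + rng.gauss(0, 0.5 ** 0.5))
        dv2.append(shared + rng.gauss(0, 0.5 ** 0.5))
    return dv1, dv2

def false_positive_rates(sims=5000, n=20, alpha=0.05, seed=1):
    """Compare one planned test with reporting whichever measure 'worked'."""
    rng = random.Random(seed)
    planned = flexible = 0
    for _ in range(sims):
        a1, a2 = draw_group(rng, n)   # both groups come from the same
        b1, b2 = draw_group(rng, n)   # population, so H0 is true
        p1, p2 = z_test_p(a1, b1), z_test_p(a2, b2)
        planned += p1 < alpha               # only the pre-chosen measure
        flexible += min(p1, p2) < alpha     # report whichever was significant
    return planned / sims, flexible / sims

planned, flexible = false_positive_rates()
print(f"one planned test:     {planned:.3f}")
print(f"best of two measures: {flexible:.3f}")
```

Under these assumptions the planned test stays near the nominal 5%, while choosing between two measures after seeing the data pushes the false-positive rate noticeably higher, even though each individual test looks legitimate on its own.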
Reviewing degrees of freedom
- not bad by themselves
- simply the use of exploratory data analysis tools
the researcher simply attempts to find some ‘truth’ in a data set
- not malicious, it is common practice, small violations
Issue of non-confirmatory analysis with researcher degrees of freedom
confirmatory: analysis is planned before data collected
- only tests that are directly related to a pre-specified test are performed
- can pre-register these analyses to protect against researcher degrees of freedom
Solution to researcher degrees of freedom
if you keep to the pre-registered process, so that the analysis is confirmatory, the type 1 error rate will stay at 5%
Exploratory
- unplanned analyses on data set
- usually many tests are performed
- researcher degrees of freedom are used
- error rate (type 1) not maintained at 5%