Statistical Workflow Flashcards

1
Q

Problems With P

A
  • A p-value is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct.
  • University Graduates IQ vs. Primary School Students IQ
  • University Graduates IQ vs. Secondary School Students IQ
  • Depression in left-handed people vs. depression in right-handed people
2
Q

• But less probable results are still possible!

A
  • You could roll a 20 on a 20-sided die
  • You could toss a coin 10 times and get 10 tails
  • You could measure significantly more depression in left-handed than in right-handed people
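The arithmetic behind the first two examples can be checked directly. This is a minimal sketch in base R, assuming a fair die and a fair coin (the card does not state this explicitly).

```r
# Probability of rolling a 20 on a fair 20-sided die
p_roll_20 <- 1 / 20

# Probability of 10 tails in 10 tosses of a fair coin
p_ten_tails <- 0.5^10

# The same coin result taken from the binomial distribution
p_ten_tails_binom <- dbinom(10, size = 10, prob = 0.5)

print(c(roll_20 = p_roll_20, ten_tails = p_ten_tails))
```

Both events are unlikely on any single attempt, yet both will happen sooner or later if enough attempts are made. The same logic applies to a "significant" difference between left-handed and right-handed people when no real difference exists.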
3
Q

• Publication bias

A
  • Less probable results are often more interesting than probable results
    • Unusual findings might open new areas of investigation
    • Challenges to existing theory
  • Significant results are more interesting/respected than non-significant results
  • File drawer problem
  • Positive-results bias, a type of publication bias, occurs when authors are more likely to submit, or editors are more likely to accept, positive results than negative or inconclusive results.
  • Lots of non-significant studies never get published, so the literature does not necessarily show a balanced picture
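The file drawer problem can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the number of studies, the group size, the true effect and the "publish only positive, significant results" rule are all hypothetical values chosen for illustration.

```r
set.seed(1)            # arbitrary seed, for reproducibility only
n_studies <- 2000      # hypothetical number of studies run
n_per_group <- 20      # hypothetical participants per group
true_effect <- 0.2     # hypothetical true difference, in SD units

# Simulate one two-group study; return its observed effect and p-value
run_one_study <- function() {
  control <- rnorm(n_per_group, mean = 0, sd = 1)
  treatment <- rnorm(n_per_group, mean = true_effect, sd = 1)
  c(effect = mean(treatment) - mean(control),
    p = t.test(treatment, control)$p.value)
}

results <- as.data.frame(t(replicate(n_studies, run_one_study())))

# Assumed publication rule: only positive, significant results get published
published <- results[results$p < 0.05 & results$effect > 0, ]

mean_all <- mean(results$effect)
mean_published <- mean(published$effect)
print(c(all_studies = mean_all, published_only = mean_published))
```

Across all simulated studies the average effect sits close to the true value, while the "published" subset shows a clearly larger average, which is the unbalanced picture described above.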
4
Q

Bad science:

A
  • A researcher could keep running the same experiment until they get a significant result (see bonus video)
  • A researcher could measure so many things that some might be significant by chance
  • fMRI voxels
  • EEG electrodes
  • Lots of conditions in an experiment
  • A researcher could conduct different types of analyses on the same data
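Why many measurements (or many repeats of the same study) produce significant results by chance can be shown with one line of arithmetic. This sketch assumes the tests are independent and that every null hypothesis is true. Both are simplifications: voxels and electrodes are usually correlated.

```r
alpha <- 0.05                     # conventional significance threshold
n_tests <- c(1, 5, 20, 100)       # hypothetical numbers of tests or repeats

# Chance that at least one test is significant when no effect exists
p_any_false_positive <- 1 - (1 - alpha)^n_tests

print(data.frame(n_tests, p_any_false_positive = round(p_any_false_positive, 3)))
```

The chance of at least one false positive rises quickly with the number of tests, which is why the number of measures and analyses should be fixed in advance.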
5
Q

Pre-Registration

A

• You determine many things about your study before running it, and register this with a specially made tool

Benefits
  • Scientific
    • Cannot fiddle with data or hypotheses once data have been collected
    • You will be spotted if you keep running the same study
    • Journals can accept a publication based on pre-registration before the data are collected, avoiding the file drawer problem
  • Organisational
    • We know exactly what analysis to do: data collection might take months or years, so it's easy to forget
    • You really understand all elements of your study and can then make sure it is the best it can be
    • Read Andrews & Justice ‘Replication crisis’ chapter in Essential Psychology textbook for more info.
6
Q

Other methods

A
  • Grant-funded research
    • Studies are evaluated and reviewed by experts at the grant proposal stage
    • Many similar elements to pre-registration, such as choosing analyses and sample sizes
    • Higher budgets
      • More reliable results, as there are more participants or more time to spend developing measures
  • Multiple-experiment papers
    • Get an ‘interesting’ result in Experiment 1, then repeat and develop it in Experiments 2, 3, etc.
    • Bonus: often deeper theoretical insight through replication of a slightly adapted Experiment 1
  • Publishing data sets
7
Q

Pre-registration steps

A
  • Hypothesis/es
  • DV: what we are measuring and how we will measure it
  • Conditions: how many and how participants will be assigned
  • What model (stats) will you use?
  • How might we handle outliers and what exclusion criteria might we use?
  • Sample size
  • Any other secondary or exploratory analyses
8
Q

Hypothesis/es

A
  • What model (stats) will you use
  • Sample size
  • Other considerations
  • How will you deal with outliers?
  • Will you explore other parts of the data without a hypothesis?
  • Before you even start thinking about data analysis, you need two things:
  • Clear research questions
  • Clear statements about how the manipulations in your experiment will affect the measure (hypotheses)
  • Without these, you won’t know what you are studying or why
9
Q

Determine the appropriate model

A
  • Before we even collect our data, we should have a clear idea of what statistical test we are going to run
  • We should not be deciding this once the data are collected
  • What if no model is suitable for the data?
  • What if the data are not in the correct format, or there are not enough conditions, etc.?
  • Can lead to ‘fishing’ around in the data to find results
10
Q

Run a sample size estimation

A
  • How many people should you test?
  • You learned about power and sample size a few weeks ago
  • Need to use sample size estimation to determine how many people we need to detect the effect size we are interested in
  • With too few people, we might not detect an effect that exists
  • E.g., We might only have power to detect the largest effects – what if our effect is small?
  • Waste of resources (time, money, effort)
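A sample size estimation for a simple two-group design can be sketched with `power.t.test()` from base R's `stats` package. The effect sizes (in SD units, labelled with the conventional large/medium/small benchmarks), the 5% significance level and the 80% power target are assumptions for illustration, not values from the course.

```r
# Hypothetical effect sizes of interest (Cohen's d)
effect_sizes <- c(large = 0.8, medium = 0.5, small = 0.2)

# Participants needed per group for a two-sample, two-sided t-test
n_per_group <- sapply(effect_sizes, function(d) {
  plan <- power.t.test(delta = d, sd = 1, sig.level = 0.05, power = 0.80,
                       type = "two.sample", alternative = "two.sided")
  ceiling(plan$n)   # round up: you cannot test a fraction of a person
})

print(n_per_group)
```

The required sample grows sharply as the effect of interest shrinks, so a study sized for a large effect has little chance of detecting a small one.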
11
Q

Other considerations

A
  • Define how incorrect responses and outliers will be determined
  • What would lead to exclusion of a participant?
    • X incorrect responses
  • What would lead to exclusion of a trial?
    • 2.5 SD higher/lower than the mean
    • Extra fast/slow responses
  • What other things might you do with the data? If exploratory analyses are planned, flag them in the pre-registration to avoid cherry-picking
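A trial-level exclusion rule such as "2.5 SD higher/lower than the mean" is easy to write down in advance. The sketch below uses made-up reaction times and hypothetical variable names. The point is that the rule is fixed before looking at the results.

```r
# Hypothetical reaction times in ms; the last trial is unusually slow
rt <- c(510, 495, 530, 480, 505, 520, 490, 515, 500, 525, 485, 1900)

sd_cutoff <- 2.5                       # pre-registered cutoff in SD units

# Distance of each trial from the mean, in SD units
z <- (rt - mean(rt)) / sd(rt)

excluded <- rt[abs(z) > sd_cutoff]     # trials removed by the rule
kept <- rt[abs(z) <= sd_cutoff]        # trials that go into the analysis

print(list(excluded = excluded, n_kept = length(kept)))
```

Reporting both the rule and the number of excluded trials lets readers see that exclusions were not chosen to suit the result.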
12
Q

Pre-Registration and Organisation

A
  • Many factors make running projects complicated
  • Researchers usually have many projects running in parallel
  • Small Projects
  • Grants
  • PhD Students
  • Summer Projects/Volunteer Projects
  • 3rd Year Projects/Masters Projects
  • Projects run for multiple years (e.g., 3 months to get reviewed by a journal, 3-year PhDs)
  • All projects similar due to area of expertise
  • Multiple people working on each project
  • Crucial to organise materials, data and analyses so that:
  • They can be revisited months/years down the line
  • Different team members can understand the data and analyses
  • Pre-registration can act as a large part of this organisation
  • Publishing data can also help keep it organised
13
Q

Overview

A
  • Scientific Idea
  • Pre-register hypotheses, methods and planned analyses
  • Organise your materials, data and analyses
  • Run the study as planned, report any changes from pre-registered plans
  • Write up (another story, see other aspects of the course)
  • Publish anonymised data where possible
14
Q

Get descriptive stats (RStudio)

A

describe(data = data, mean = mean(dataset), stdev = sd(dataset))

15
Q

^^ arrange the descriptive stats

A

describe(data = data, mean = mean(dataset), stdev = sd(dataset), by = dataset)

16
Q

^^ by condition

A

describe(data = data, mean_dataset = mean(dataset), SD_dataset = sd(dataset), max_dataset = max(dataset), min_dataset = min(dataset), by = Condition)

17
Q

Example (RStudio)

A

describe(data = tetris, mean_intrusion = mean(Intrusion), SD_Intrusion = sd(Intrusion), max_Intrusion = max(Intrusion), min_Intrusion = min(Intrusion), by = Condition)
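The `describe()` call above appears to come from a course-specific package that these cards do not name. As a cross-check, the same by-condition summary can be sketched in base R. The `tetris` data frame below is made up for illustration and is not the course data set.

```r
# Made-up stand-in for the course data: intrusion counts in two conditions
tetris <- data.frame(
  Condition = rep(c("Control", "Tetris"), each = 4),
  Intrusion = c(5, 7, 6, 4, 2, 3, 1, 2)
)

# Split Intrusion by Condition and compute the same four statistics per group
stats_by_condition <- do.call(rbind, lapply(
  split(tetris$Intrusion, tetris$Condition),
  function(x) c(mean = mean(x), SD = sd(x), max = max(x), min = min(x))
))

print(stats_by_condition)
```

Each row of the result is one condition and each column is one statistic, mirroring the named arguments in the `describe()` call.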