Statistics And Data Analysis 21061 Flashcards

1
Q

What is the difference between categorical level data and continuous data

Week 1

A

Categorical data is nominal only (names/categories, e.g. gender), whereas continuous data can be placed on a continuous scale

2
Q

What two descriptive statistics do we typically use

Week 1

A

Central tendency & spread

3
Q

What is the difference between how independent variables and dependent variables are measured

Week 1

A

The IV is ALWAYS measured on a categorical scale
The DV is IDEALLY measured on a discrete/continuous scale

4
Q

What is the benefit of measuring the DV on a continuous scale

Week 1

A

So that we can use parametric statistics

5
Q

What is the difference between a true-experimental vs a quasi-experimental design

Week 1

A

We actively manipulate the IVs in a true-experimental design, whereas the IVs in a quasi-experimental design reflect fixed characteristics

6
Q

Is handedness a quasi or true experimental IV

Week 1

A

Quasi - it is a fixed characteristic

7
Q

What are the 3 main types of subject design

Week 1

A

Between subjects, within subjects, mixed design

8
Q

What is a 2 * 3 mixed design

Week 1

A

Has two IVs, one between, one within.
Between IV has two levels, within IV has 3 levels
(e.g males and females preferences to horror, action and romance movies)

9
Q

What does normally distributed data allow us to do

Week 1

A

Use parametric stats

10
Q

What are the properties of normally distributed data

Week 1

A

Symmetrical about the mean
Bell shaped - mesokurtic

11
Q

What is Platykurtic data

Week 1

A

Data which has more variations/spread than normally distributed data
(-ve kurtosis value)

12
Q

What is leptokurtic data

Week 1

A

Data which has less variation/spread than normally distributed data (+ve kurtosis value)

13
Q

What type of skew does normal data have

Week 1

A

normally distributed data has no skew

14
Q

What is sampling error

Week 1

A

degree to which sample statistics differ from underlying population parameters

15
Q

What are Z scores

Week 1

A

Scores from a normally distributed population converted into standard deviation units (i.e. how many SDs a score lies from the mean)
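A standard way to write the conversion (added here for reference; X is a raw score, μ and σ are the population mean and standard deviation):

$$ z = \frac{X - \mu}{\sigma} $$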

16
Q

What is sampling distribution

Week 1

A

Distribution of a stat across an infinite number of samples

17
Q

What is the sampling distribution of the mean

Week 1

A

Distribution of all possible sample means.

18
Q

What are standard error (SE) and estimated standard error (ESE)

Week 1

A

Standard deviation of sampling distribution

ESE is simply an estimate of the standard error based on our sample
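Standard formulations of both, added for reference (σ = population SD, s = sample SD, N = sample size):

$$ SE = \frac{\sigma}{\sqrt{N}}, \qquad ESE = \frac{s}{\sqrt{N}} $$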

19
Q

What do we use sample statistics for

Week 1

A

to estimate the population parameters

20
Q

What is a T-test

Week 2

A

An inferential statistic used when we have 1 IV with 2 levels (and 1 DV); it estimates whether the population means under the 2 IV levels are different

21
Q

What contributes to variance between IV levels in an independent t-test

Week 2

A
  • manipulation of IV (treatment effects)
  • individual differences
  • experimental error
    * random error
    * constant error
22
Q

what contributes to variance within IV levels in an independent t-test

week 2

A

individual differences
random experimental error

23
Q

What would happen if we continued to determine the mean of the difference for infinite samples

Week 2

A

it would essentially be like calculating the population mean difference

24
Q

What is the null hypothesis when talking about sampling distribution of differences

Week 2

A

The sampling distribution of differences will have a mean of 0, as there is no difference between the population means under the 2 IV levels

25
Q

Why do we use estimated standard error instead of standard deviation in T-distribution

Week 2

A

Because it is a sampling distribution, instead of s.d we use s.e. This is because standard error is used to express the extent an individual sample mean difference deviates from 0

As we do not have all of the possible samples to calculate the standard error, we estimate the standard error , hence why we use e.s.e

26
Q

What is the equation for t in an independent design

Week 2

A

Xd/ESEd
AKA
Mean of the difference / estimated standard error of the difference
AKA
variance between IV levels/variance within IV levels
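A minimal sketch of that ratio in code (the group values are made-up illustration data, not from the module), checked against SciPy's built-in independent t-test:

```python
import numpy as np
from scipy import stats

# made-up scores under the two IV levels
group_a = np.array([5.1, 6.3, 5.8, 7.0, 6.1])
group_b = np.array([4.2, 5.0, 4.8, 5.5, 4.6])

mean_diff = group_a.mean() - group_b.mean()                    # variance between IV levels
pooled_var = (group_a.var(ddof=1) + group_b.var(ddof=1)) / 2   # equal-n pooled variance
ese_diff = np.sqrt(pooled_var * (1/len(group_a) + 1/len(group_b)))  # ESE of the difference

t_by_hand = mean_diff / ese_diff
t_scipy, p_scipy = stats.ttest_ind(group_a, group_b)           # should give the same t
print(t_by_hand, t_scipy, p_scipy)
```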

27
Q

What does the distance to 0 of the t value indicate?

Week 2

A

If t value is closer to 0, smaller variance between IV levels relative to within

If t value is further from 0 , large variance between IV levels relative to within IV levels

28
Q

What does it mean if the null hypothesis is true for t-dist

Think CI

week 2

A

If the null hypothesis is true - 95% of sampled t-values will fall within the 95% bounds of the t-dist

If the null hypothesis is true, only 5% of sampled t-values will fall outside the 95% bounds

29
Q

What are degrees of freedom and how are they calculated

Week 2

A

The difference between the number of measurements made (sample size) & the number of parameters estimated (usually one, the mean)

(Sample size - # of parameters)
N-2 for independent t-test
n-1 for paired t-test

30
Q

What happens to the critical t-value as the degrees of freedom get larger

Week 2

A

It tends towards 1.96, the critical value of the normal (z) distribution

31
Q

What are some of the assumptions we make for an independent t-test

Week 2

A
  • Normality: the DV should be normally distributed under each level of the IV
  • Homogeneity of variance: The variance in the DV, under each level of the IV should be reasonably equivalent
  • Equivalent sample size: sample size under each level of IV should be roughly equal ( matters more with smaller samples)
    * Independence of observations: scores under each level of the IV should be independent
32
Q

What test do we use when the assumptions for the independent t-test are violated

Week 2

A

We use the non-parametric equivalent: the Mann-Whitney U test

33
Q

What is Levene's test

Week 2

A

A test for equality of variance –> homogeneity of variances

34
Q

What does Levene's test tell us and what does it not tell us

Week 2

A

Tells us: whether there's a difference in variances under the IV levels
Doesn't tell us: whether our means are different, or anything about the effect of the IV manipulation

35
Q

What is the null hypothesis of levenes test

Week 2

A

no diff between the variance under each level of the IV (i.e homogeneity in variance)

36
Q

If we reject Levene’s test, what does this mean

Week 2

A

There is heterogeneity of variance - the way the data varies under each IV level is different

37
Q

What assumptions do we want when it comes to variance between IV levels?

Week 2

A

Equal variance under each IV level, i.e. homogeneity of variance

38
Q

What contributes to variance between IV levels in a paired t test

Week 2

A
  • Manipulation of IV (treatment effects)
  • Experimental error
39
Q

what contributes to variance within IV levels in a paired t test

A

Experimental error

(RM designs - can discount the variance due to individual differences (leaving only variance due to error))

40
Q

What assumptions do we make during a paired t-test

A
  • Normality - distribution of difference scores between the IV levels should be approximately normal
    * Assume ok if n> 30
  • Sample size - sample size under each IV level should be roughly equal
41
Q

What do we do when our assumptions are violated during a paired t-test

Week 2

A

We use the non-parametric equivalent - Wilcoxon test

42
Q

How do we interpret 95% Confidence intervals for repeated measure designs

Week 2

A

We can't determine whether a result is likely to be significant by looking at the 95% CI plot; instead we need to look at the influence of the IV in terms of the size & consistency of the effect

43
Q

For a repeated measures design, what would happen if the confidence intervals cross 0 (lower value is negative and higher value is positive)

Week 2

A

you cannot reject the null hypothesis as you cannot conclude that the true population mean difference is different from 0

44
Q

What is Cohen’s D

Week 2

A

The magnitude of difference between two IV level means, expressed in s.d units
I.e - a standardised value expressing the diff between the IV level means

45
Q

What are the values for effect size of Cohen’s d

week 2

A

Effect size d
Small 0.2
Medium 0.5
Large 0.8

46
Q

How does cohen’s d differ from T? Define both.

week 2

A

D = magnitude of difference between two IV level means, expressed in s.d units
T = magnitude of diff between two IV level means, expressed in ESE units

T takes sample size into account - qualifies the size of the effect in the context of the sample size .

47
Q

When do we use a One way anova

Week 3

A

When we have 1 IV with more than 2 levels

48
Q

What does a one way anova do?

Week 3

A

Estimates whether the population means under the different levels of the IV are different

49
Q

What is an ANOVA like (think of t-tests)

Week 3

A

an extension of the t-test –> if you conducted a one-way anova on an IV w/ 2 levels, you’d obtain the same result (F = t^2)

50
Q

Why do we use ANOVA instead of running multiple t-tests

Week 3

A

The more tests we run on the same data, the more likely we are to encounter a Type I error and reject the null hypothesis, even if it is true

51
Q

What is the familywise error rate and what does amending it provide

Week 3

A

Probability that at least one of a ‘family’ of comparisons run on the same data, will result in a type I error

Provides a corrected significance level (a) reducing the probability of making a type I error

52
Q

How do calculate the familywise error rate ?

Week 3

A

a’ = 1 - (1- a)^c

where c is the number of comparisons

e.g for 3 IV levels (3 comparisons) (ab ac bc)
1 - (1 - 0.05) ^3 = .143 = 14% chance of type I error

for 4 IV levels (6 comparisons (ab ac ad bc bd cd) )
1 - (1 - 0.05)^6 = .264 = 26% chance of type 1 error
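A quick sketch of the same calculation in code (added, not from the deck), using a = .05:

```python
alpha = 0.05
for comparisons in (1, 3, 6):
    familywise = 1 - (1 - alpha) ** comparisons
    print(comparisons, round(familywise, 3))  # 1 -> 0.05, 3 -> ~0.143, 6 -> ~0.265
```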

53
Q

Why do we use omnibus tests?

Week 3

A

To control familywise error rate

54
Q

What is the null hypothesis of the F ratio/ANOVA?

Week 3

A

There is no difference between the population means under the different levels of the IV

H0: μ1 = μ2 = μ3

55
Q

what is the ratio for the F value.

Week 3

A

Variance between IV levels/ Variance within IV levels

56
Q

What does the closeness of the F value to 0 indicate

Week 3

A

F value close to 0 = small variance between IV levels relative to within IV levels
F Value further from 0 = large variance between IV levels relative to within IV levels

57
Q

What assumptions do we make for an independent one way ANOVA

Week 3

A

Same as those for independent T-test
Normality: DV should be normally distributed, under each level of the IV
Homogeneity of variance : Variance in the DV, under each level of the IV, should be (reasonably) equivalent
Equivalent sample size : sample size under each level of the IV should be roughly equal
Independence of observations : scores under each level of the IV should be independent

58
Q

What do we do when the assumptions of the independent one-way anova aren’t met?

Week 3

A

We use the non-parametric equivalent, the Kruskal Wallis test

59
Q

What is the model sum of squares?

Equation

Week 3

A

Model Sum of Squares (SSM): sum of squared differences between IV level means and grand mean (i.e. between IV level variance)

60
Q

What is the residual sum of squares?

Week 3

A

Residual Sum of Squares (SSR): sum of squared differences between individual values and corresponding IV level mean (i.e. within IV level variance)

61
Q

What is SSt and how is it calculated

Week 3

A

Sum of squares total
= SSm( Sum of squares model ) + SSr (Sum of squares residual)

62
Q

What is the mean square value and how is it calculated? What are the two types?

Week 3

A

MS = SS/df (Sum of squares/ degrees of freedom)
MSm = model Mean square value
MSr = residual mean square value

63
Q

What do we use mean square values for?

Week 3

A

To calculate the F statistic

64
Q

How do we calculate the F statistic

mean square values

Week 3

A

MSm/MSr
aka
model mean square value / residual mean square value
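A minimal sketch of the whole route from sums of squares to F (made-up illustration data, not module data), checked against SciPy's one-way ANOVA:

```python
import numpy as np
from scipy import stats

# made-up scores under three IV levels
groups = [np.array([4.0, 5.0, 6.0]),
          np.array([6.0, 7.0, 8.0]),
          np.array([8.0, 9.0, 10.0])]

grand_mean = np.concatenate(groups).mean()
ss_model = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # SSm
ss_residual = sum(((g - g.mean()) ** 2).sum() for g in groups)         # SSr

k = len(groups)                            # number of IV levels
n_total = sum(len(g) for g in groups)      # total sample size
ms_model = ss_model / (k - 1)              # MSm = SSm / dfM
ms_residual = ss_residual / (n_total - k)  # MSr = SSr / dfR

f_by_hand = ms_model / ms_residual
f_scipy, p_scipy = stats.f_oneway(*groups)  # should give the same F
print(f_by_hand, f_scipy, p_scipy)
```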

65
Q

What do we do when the assumption of homogeneity is violated in an independent 1-way ANOVA

Week 3

A

We report Welch’s F instead of ANOVA F

66
Q

What happens to the degrees of freedom when we use Welch’s F?

Week 3

A

The degrees of freedom are adjusted (to make the test more conservative)

67
Q

How is the ANOVA F value reported

Week 3

A

F(dfm,dfr)=F-value, p =p-value

68
Q

How do we calculate degrees of freedom for an independent 1 way ANOVA

Week 3

A

find the difference between the number of measurements and the number of parameters estimated

i.e. no. of measurements – no. parameters estimated

69
Q

How do we calculate df for between IV level (model) variance where N is total sample size and k is number of IV levels

Week 3

A

k - 1

70
Q

How do we calculate df for within IV level (residual) variance where N is total sample size and k is number of IV levels

Week 3

A

N-k

71
Q

What are post hoc tests

Week 3

A

Secondary analyses used to assess which IV level mean pairs differ

72
Q

When do we use post-hoc tests

Week 3

A

only when the F-value is significant

73
Q

How do we run post-hoc tests?

Week 3

A

As t-tests, but we include correction for multiple comparisons

74
Q

What are the 3 types of post-hoc test

Week 3

A
  • Bonferroni
  • least significant difference (LSD)
  • Tukey honestly significant difference (HSD)
75
Q

Which post hoc test has a very low Type I error risk, very high type II error risk and is classified as ‘very conservative’

week 3

A

Bonferroni

76
Q

Which post-hoc test has a high type I error risk, a low type II error risk and is classified as ‘liberal’

A

Least significant difference (LSD)

77
Q

Which post-hoc test has a low type I error risk , a high type II error risk and is classified as ‘reasonably conservative’

week 3

A

Tukey Honestly significant difference (HSD)

78
Q

What are the three levels of effect size for partial eta^2 for ANOVA

week 3

A

0.01 is small
0.06 is medium
0.14 is large

79
Q

what is effect size measured in for ANOVA

A

Calculated in 2 ways: Cohen's d and partial eta squared

80
Q

How do you calculate partial eta squared

week 3

A

Model sum of squares/ (model sum of squares + residual sum of squares)
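In symbols, with made-up numbers as a quick worked example:

$$ \eta_p^2 = \frac{SS_M}{SS_M + SS_R}, \quad \text{e.g. } \frac{10}{10 + 40} = .20 $$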

81
Q

In a repeated measures design for a one way ANOVA, what contributes to variance between IV levels

Week 4

A
  • Manipulation of IV (treatment effects)
  • Experimental error (random & potentially constant error)
82
Q

In a repeated measures design for one way ANOVA, what contributes to variance within IV levels

Week 4

A

Experimental error (random error)

83
Q

how do we calculate total variance?

Week 4

A

Total variance = model variance (variance between IV levels) + residual variance (variance within IV levels); in independent designs the residual variance also includes variance due to individual differences

84
Q

what is the t/F ratio and how do we calculate it?

Week 4

A

variance between IV levels/ variance within IV levels (excluding variance due to individual diffs WHEN IN RM design)

85
Q

How is the F ratio calculated in terms of mean square values

Week 4

A

Mean sum of squares model/ mean sum of squares residual

86
Q

What are the 3 assumptions made in a repeated measures 1-way ANOVA

Week 4

A
  • Normality - distribution of difference scores under each IV level pair should be normally distributed
  • Sphericity (homogeneity of covariance) - the variance in difference scores under each IV level pair should be reasonably equivalent (this assumption is unique to the RM 1-way ANOVA)
  • Equivalent sample size - sample size under each level of the IV should be roughly the same
87
Q

What corrects for the sphericity assumption.

Week 4

A

Greenhouse-Geisser

88
Q

What test do we do to check for sphericity and what is its respective value?

Week 4

A

Mauchly’s test & the W value

89
Q

What is the null hypothesis of the assumption of sphericity in the repeated measures ANOVA

Week 4

A

There is no difference between the covariances under each IV level pair (i.e homogeneity)
If p ≤ .05 we reject null hypothesis (i.e heterogeneity)

90
Q

What do we do if our data seriously violates the assumptions of a repeated measures One-way ANOVA

Week 4

A

we should use the non-parametric equivalent - Friedman test

91
Q

If Mauchly's test is significant, what do we use in the SPSS output

Week 4

A

The row labelled Greenhouse-Geisser, as sphericity cannot be assumed

92
Q

If Mauchly's test is not significant, which row do we use in the SPSS output?

Week 4

A

The row labelled sphericity assumed

93
Q

How do we report the F statistic in repeated measures ANOVA

Week 4

A

F(dfM, dfR) = F-value, p = p-value (Greenhouse-Geisser / sphericity assumed)

94
Q

How do we calculate the degrees of freedom for RM 1-way anova

for model and residual

week 4

A

dfM = k - 1 (where k = number of IV levels/parameters)
dfR = dfM x (n - 1) (where n = number of participants)
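A quick made-up worked example: with k = 3 IV levels and n = 10 participants, dfM = 3 - 1 = 2 and dfR = 2 x (10 - 1) = 18, so the result would be reported as F(2, 18).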

95
Q

Which post Hoc test do we use for RM 1 way anova

week 4

A

bonferroni

96
Q

why is recruitment an advantage of repeated measures designs

Week 4

A

Needs fewer participants to gain the same number of measurements

97
Q

How does the model of repeated measures design cause error variance to be reduced and why is this advantageous

Week 4

A

Remove variance due to individual differences from error variances –> leading to less variance within IV levels

98
Q

Apart from recruitment and reduction in error variance,what is another advantage of repeated measures designs

A

There is more power with the same number of participants
* It's easier to find a significant difference (and avoid a Type II error)

99
Q

what are order effects and what effect can they have on repeated measure designs

Week 4

A

They are the effects of participants going through the same task under different conditions and becoming habituated to it in a variety of ways.

They introduce confounds - error introduced systematically between IV levels

100
Q

What are the 4 types of order effects

Week 4

A
  • Practice effects
  • fatigue
  • sensitisation
  • carry-over effects
101
Q

What are practice effects in terms of order effects

Week 4

A

P’s get better at the task which positively skews how they do in subsequent IV levels

102
Q

What is fatigue in terms of order effects

Week 4

A

Participants get bored/ tired of engaging which negatively skews how they do in subsequent tasks

103
Q

What is sensitisation in terms of order effects

Week 4

A

P’s start behaving in a particular way to please or annoy the experimenter due to understanding IV manipulation

104
Q

What are carry over effects in terms of order effects

Week 4

A
  • effect of taking part in one IV level effects how one acts on subsequent IV levels
105
Q

What is counterbalancing and how is it used to minimise order effects

Week 4

A

Counterbalancing the order in which participants go through the IV levels must be done to ensure as much randomness as possible; this does not get rid of order effects, but spreads their impact

106
Q

What are alternatives for each type of order effect when counterbalancing is not possible (4)

week 4

A

– Practice - extensive pre-study practise
– Fatigue - short experiments
– Sensitisation - intervals between exposure to IV levels
– Carry-over effects - include a control group

107
Q

When do we use factorial ANOVAs

Week 5

A

To test for differences when we have more than one IV, each with at least 2 levels

108
Q

What are the 3 broad factorial ANOVA designs

Week 5

A
  • all IVs are between-subjects (independent)
  • all IVs are within-subjects (repeated measures)
  • a mixture of between-subjects and within-subjects IVs (mixed)
109
Q

what would a 2 * 2 ANOVA mean

Week 5

A

2 IVs/factors, each with 2 levels

110
Q

what would a 2 * 4 ANOVA mean

week 5

A

2 IVs/factors, one with 2 levels and one with 4 levels

111
Q

What are the three effects we would be looking for in a 2 * 3 ANOVA design if the primary IV is gender (male, female) and the secondary IV is colour (red, white and blue)

A
  • Is there a significant main effect of gender?
  • Is there a significant main effect of colour?
  • Is there a significant interaction between gender and colour?
112
Q

If we are doing a study to see whether there is a difference between how much men and women like chocolate, and we are also looking at whether the texture of the chocolate (chunks vs tablets) has an effect: what is the primary IV, what is the secondary IV, why are they respectively so, and what do these terms mean?

Week 5

A

The primary IV is gender; the secondary IV is texture. Gender is the primary IV as it is the main IV we are looking for an effect of. Texture is the secondary IV as we are looking to see whether adding this variable also creates an effect; it is secondary because it is not the focus.

113
Q

In a between subjects 2 * 3 ANOVA, how many possible conditions are there?

Week 5

A

6

114
Q

What is the null hypothesis for a factorial ANOVA and how many are there?

Week 5

A

There is one per IV and one for each possible interaction IV pair.
e.g in 2 * 2 ANOVA , there is a null hypothesis of no difference in means for IV one, one for IV two and one for the interaction between IV one and IV two

115
Q

What does a significant interaction indicate in ANOVA?

Week 5

A

that the effect of manipulating one IV depends on the level of the other IV

116
Q

What is an interaction in terms of ANOVA.

Week 5

A

The combined effects of multiple IVs/factors on the DV

117
Q

What are Marginal means used for in ANOVA

Week 5

A

to determine if there is significant effect for either IV

118
Q

In an ANOVA line chart, what does it mean if the lines for the IVs are parallel

Week 5

A

There is no interaction of the two IVs

119
Q

What does it mean if the marginal mean of one of the IVs is at roughly the same level as the means for both populations

Week 5

A

there is no main effect

120
Q

What are the assumptions made in an independent factorial (two way) ANOVA (5)

Week 5

A

Normality: DV should be normally distributed, under each level of the IV
Homogeneity of variance : Variance in the DV, under each level of the IV, should be (reasonably) equivalent
Levene's - DON'T want a significant result
NO correction
Equivalent sample size : sample size under each level of the IV should be roughly equal
Independence of observations : scores under each level of the IV should be independent

121
Q

What is the non-parametric equivalent for the Independent factorial ANOVA

Week 5

A

There is no non-parametric equivalent for factorial ANOVA
If our data seriously violate these assumptions we can attempt a ‘fix’ or we can simplify the design

122
Q

How many F statistics do we report in factorial ANOVA

Week 5

A

One for each IV (the main effect of each IV), plus one for each interaction

123
Q

What is the difference between classical eta squared and partial eta squared

Week 5

A

Classical eta^2 : proportion of total variance attributable to factor

Partial eta^2: Only takes into account variance from one IV at a time
(Proportion of total variance attributable to the factor, partialling out/excluding variance due to other factors)

124
Q

when do we use Post Hoc tests

Week 5

A

If the main effect of at least one of the IVs is significant, then we reject the null hypothesis.
**Only relevant when:**
* the main effect of the IV is significant & the IV has more than 2 levels

125
Q

For one-way ANOVA what do we report alongside post hoc results

Week 5

A

Cohens D

126
Q

For factorial ANOVA what do we report alongside post hoc results

Week 5

A

Nothing; we don't report Cohen's d

127
Q

What are simple effects in terms of interaction effects and how do we check them

Week 5

A

The effect of an IV at a single level of another IV
* We check them by doing comparisons of cell mean conditions (i.e. t-tests)

128
Q

For an IV with a between subjects design, how do we check for simple effects

Week 5

A

we do independent t-test for each comparison

129
Q

What is the Bonferroni correction and what calculation does it perform?

Week 5

A

A correction that divides the required alpha level by the number of comparisons (e.g. for 6 comparisons, .05/6 = .008)

130
Q

How can ANOVAs in general be described?

Week 7

A

a flexible and powerful technique appropriate for many experimental designs

131
Q

What questions are necessary to ask before collecting any data and performing an ANOVA

Week 7

A

*Do I have a clear research question?
*Do I know what analyses I will need to conduct to answer this?
*Will I be able to carry out and interpret the results of these analyses?
*Have I considered and controlled for potential confounds?
*Will I understand the answer I get?

132
Q

What does our choice of statistical test depend on

Week 1

A
  • Scale of measurement
  • Research aim
    * Descriptive only
    * Relational (relationships)
    * Experimental (differences)
  • Experimental design
    * Subject design: between/within
    *Number of IV’s
    * Number of IV levels
  • Properties of dependent/outcome variable
    *Normally distributed: parametric
    * Not normally distributed: non parametric
133
Q

What do descriptive statistics not allow us to do

Week 1

A

Make predictions or infer causality

134
Q

What does a 95% confidence interval mean

Week 1

A

95% of all sampled means will fall within the 95% bound of the population mean

135
Q

When writing proportions (such as partial eta squared) what is the correct notation for them?

General

A

you drop the leading zero and report it to 3dp

136
Q

What can relationships vary in

Week 8

A

Form, direction, magnitude/strength

137
Q

What are the two types of form a relationship can take

Week 8

A

linear or curvilinear

138
Q

What are the two directions a relationship can go in

week 8

A

positive or negative

139
Q

What is the magnitude/strength of a relationship measured in

Week 8

A

The R value

140
Q

What R values are indicative of a perfect positive relationship, a perfect negative relationship and no relationship

Week 8

A

1, -1 & 0

141
Q

What does an r value of 0 look like on a scatter graph

Week 8

A

The dots are random and there is no systematic relationship

142
Q

What are the values for weak, moderate and strong correlation

Week 8

A

± 0.1 - 0.39 = weak correlation
± 0.4 - 0.69 = moderate correlation
± 0.7 - 0.99 = strong correlation

143
Q

What is meant by non-linear correlation?

Week 8

A

The idea that some DV’s peak at a certain point of an IV

(e.g confidence in ability to pass course, too low = do worse, too high = do worse, at optimum = do best)

144
Q

what does bivariate linear correlation involve

Week 8

A

Linear correlation involves measuring relationship between 2 variables measured in a sample
We use sample stats to estimate population parameters -whole logic of inferential statistical testing

145
Q

What is the null hypothesis when doing a bivariate linear correlation?

Week 8

A

no relationship between population variables

146
Q

What parametric assumptions do we have when doing a bivariate linear correlation? (4)

Week 8

A
  • Both variables should be continuous
  • Related pairs: each P (or observation) should have a pair of values (one for each axis/IV)
  • absence of outliers: outliers skew results, we can usually just remove them
  • linearity: points in scatterplot should be best explained w/ a straight line
147
Q

Apart from the parametric assumptions, what other things are important to consider in regards to Correlation and correlation coefficients

Week 8

A

They are sensitive to range restrictions
* E.g. floor and ceiling effects - floor effect = clustering of scores at the bottom of the scale, ceiling effect = clustering at the top of the scale
* Can be hard to see the relationship between variables as you don't see how far they stretch, due to the cap

There is debate over Likert scales:
if you have 6-7 points you can get away with parametric tests; with fewer, it is best to use non-parametric

148
Q

What happens if our data seriously violates our parametric assumptions for a correlation coefficient test?

Week 8

A

Use the non-parametric equivalent: Spearman's rho (or Kendall's tau if fewer than 20 cases)

149
Q

What does Pearson's correlation coefficient do, and what does its outcome show?

Week 8

A
  • Investigates relationship between 2 quantitative continuous variables
  • Resulting correlation coefficient (r) is a measure of the strength of association between the two variables
150
Q

What is covariance

Week 8

A

The variance shared between the X and Y variables

151
Q

How do you calculate Covariance? (we will never have to do this by hand but good practice to know)

The process

Week 8

A
  1. For each datapoint, calculate diff from mean of X and difference from mean of Y
  2. Multiply the differences
  3. Sum the multiplied differences
  4. Divide by N-1
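A small sketch of those four steps in code (made-up data), checked against NumPy's covariance function, which also divides by N - 1:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 7.0])

diffs_x = x - x.mean()                      # step 1: differences from the mean of X
diffs_y = y - y.mean()                      #         and from the mean of Y
products = diffs_x * diffs_y                # step 2: multiply the differences
covariance = products.sum() / (len(x) - 1)  # steps 3-4: sum, then divide by N - 1

print(covariance, np.cov(x, y)[0, 1])       # the two values should match
```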
152
Q

What does the correlation coefficient of pearson’s provide us with and what actually is it?

Week 8

A

a measure of variance shared between our X and Y variables
it is a ratio of covariance (the shared variance) to separate variances
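One standard way to write that ratio (added for reference; cov(X, Y) is the covariance, s_X and s_Y are the separate standard deviations):

$$ r = \frac{\mathrm{cov}(X, Y)}{s_X \, s_Y} $$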

153
Q

What does the distance of the r value from 0 mean in correlation?

Covariance and variances

Week 8

A

If covariance is large relative to separate variances - r will be further from 0
If covariance is small relative to the separate variances - r will be closer to 0

If the things (variables) tend to go up and down together a lot (large covariance), the correlation (r) will be far from 0, indicating a strong relationship.

If the things don’t move together much (small covariance), the correlation will be closer to 0, indicating a weaker relationship.

154
Q

What does R tell us in terms of a scatter graph? - How does the spread of the data points relate to R?

Week 8

A

how well a straight line fits the data points (i.e strength of correlation → strength is about how tightly your data points fit on the straight line )
If data points cluster closely around the line, r will be further from 0
If data points are scattered some distance from the line, r will be closer to 0

155
Q

What difference reflects sampling error?

Week 8

A

The fact that if you took two samples from the same population you would likely get two different R values.

156
Q

If we plotted the sampling distribution of correlation coefficients, what would it look like

Week 8

A

If we plotted the R values, the majority would cluster around a common point: the true population mean.

157
Q

What would the null hypothesis be for the sampling distribution of correlation coefficients

Week 8

A

The mean would be 0 thus most R values would cluster close to 0

158
Q

What is the r-distribution, what does it tell us and what is its mean value?

Week 8

A
  • It is the extent to which an individual sampled correlation coefficient (r) deviates from 0 which can be expressed in standard error units
  • we can determine the probability of obtaining an r-value of a given magnitude when the null hypothesis is true (p-value)
  • the mean is 0
159
Q

What is the relationship between the R-value and the population

Week 8

A

the obtained r-value is a point estimate of the underlying population r-value

160
Q

When is linear regression used, and what is its purpose?

Week 9

A
  • Similarly to linear correlation, it is used when the relationship between variables x & y can be described with a straight line
  • by proposing a model of the relationship between x & y, regression allows us to estimate how much y will change as a result of given change in x
161
Q

What is the Y variable in linear regression?

Week 9

A

The variable that is being predicted –> the outcome variable

162
Q

What is variable X in linear regression and what is special about it

A

The variable that is being used to predict –> the predictor variable
**We can have multiple predictor variables**

163
Q

What is regression used for? (3)

Week 9

A
  • Investigating strength of effect x has on y
  • Estimating how much y will change as a result of a given change in x
  • Predicting a value of y, based on a known value of x
164
Q

What assumption is made in regression that is not done in correlation and what does this mean in regards to what evidence can be obtained from regression?

Week 9

A

Regression assumes that Y (to some extent) is dependent on X, this dependence may or may not reflect causal dependency.
This therefore means regression does not provide direct evidence of causality

165
Q

Does a significant regression infer causality?

Week 9

A

No - factors other than the predictor variables we used may be at play, so we can't suggest causality.

166
Q

What are the 3 stages of performing a linear regression?

Week 9

A
  1. analysing the relationship between variables
  2. proposing a model to explain the relationship
  3. evaluating the model
167
Q

What does ‘ analysing the relationship between variables’ mean as a stage during linear regression?

Week 9

A

Determining the strength & direction of the relationship

168
Q

What kind of model is being proposed in linear regression and what is expected of this model?

Week 9

A

a line of best fit where the distance between the line and the individual datapoints is minimised as much as possible

169
Q

Ideally, for a line of best fit, where should the datapoints be relative to it

Week 9

A
  • half above, half below line
  • clustered as close as possible to line (signifies strong relationship)
  • distance is minimised as much as possible
170
Q

What are the 2 properties of a regression line?

Week 9

A
  • The intercept: value of y when x is 0 (typically the baseline) (a value)
  • The slope: how much y changes as a result of a 1 Unit increase in x (the gradient) (b value)
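A minimal sketch of reading both properties off a fitted line (made-up illustration data, not from the module):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

fit = stats.linregress(x, y)
print("intercept (a):", fit.intercept)                         # value of y when x = 0
print("slope (b):", fit.slope)                                 # change in y per 1-unit increase in x
print("predicted y at x = 6:", fit.intercept + fit.slope * 6)  # using the model to predict
```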
171
Q

When ‘evaluating the model’ , what are we doing and how do we do this

Week 9

A

Assessing the goodness of fit of our model (best model/line of best fit) vs the simplest model (b=0, comparing data points to the mean of y)

172
Q

What is the simplest model?

Week 9

A
  • Using the average Y value (mean) to estimate what Y might be
  • **Assumes no relationship between x and y (b = 0)**
173
Q

What is the ‘best model’? (What is it based on, what functions can it serve?)

Week 9

A
  • based on the relationship between x & y
  • uses the regression line (line of best fit) to determine what the value of Y would be at a particular value of X
  • allows for better prediction
174
Q

When calculating the goodness of fit of your model, what is the first thing you do? What does this provide?

Week 9

A

First check how much variance remains when using the simplest model (the mean of y) to predict Y.

This provides the sum of squares total (the difference between each data point & the mean of y, squared and summed)

175
Q

How do you calculate the variance not explained by the regression line and what does this give you?

A

calculate difference between each data point and point on the line it matches up to (score that would be predicted), square these differences and then add them together
This gives you the sum of squares of the residuals

176
Q

What does more clustering around the regression line indicate for the model?

Week 9

A

The model is providing a better fit, meaning there is smaller error variance and the model is more accurate (more of the variance is due to the variable in question).

177
Q

What is the sum of squares total in relation to regression?

Week 9

A

The difference between the observed values of y and the mean of y

i.e. the variance in y not explained by the simplest model (b = 0)

178
Q

What best matches the description ‘the difference between the observed values of y and those predicted by the regression line
i.e. the variance in y not explained by the regression model

Week 9

A

Sum of squares residual

179
Q

What is reflective of the improvement in prediction using the regression model when compared to the simplest model?

Week 9

A

The difference between the sum of squares total and the sum of squares residual; in other words, the model sum of squares
SST - SSR = SSM
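A short sketch of those goodness-of-fit quantities in code (the same kind of made-up data as the earlier regression sketch):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

fit = stats.linregress(x, y)
predicted = fit.intercept + fit.slope * x

ss_total = ((y - y.mean()) ** 2).sum()      # SST: variance not explained by the simplest model
ss_residual = ((y - predicted) ** 2).sum()  # SSR: variance not explained by the regression line
ss_model = ss_total - ss_residual           # SSM: improvement due to the model

print(ss_model, ss_total, ss_model / ss_total)  # the last value is R^2
print(fit.rvalue ** 2)                          # should match
```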

180
Q

What does a large model sum of squares value indicate in regression?

Week 9

A

A large(r) improvement in the prediction using the regression model over the simplest model

181
Q

What can we use F tests (what we call an ANOVA in cases of regression to avoid confusion) to evaluate and what is this reported as?

Week 9

A

the improvement due to the model (SSM) relative to the variance the model does not explain ( SSR)

It is reported as the F-ratio

182
Q

What does the F ratio do in goodness of fit tests and how do you calculate it?

Week 9

A
  • provides a measure of how much the model has improved the prediction of y, relative to level of inaccuracy of the model
  • F = Model mean squares / residual mean squares
183
Q

What would you expect to see in terms of model mean squares (MSM) and residual means squares (MSR) if the regression model is good at predicting y?

Week 9

A

the improvement in prediction due to the model (MSM) will be large, while the level of inaccuracy of the model (MSR ) will be small

184
Q

What are the assumptions we make for simple linear regression? (5)

Week 9

A
  • Linearity: x and y must be linearly related
  • Absence of outliers
  • Normality
  • homoscedasticity
  • Independence of residuals
185
Q

How do we check for the assumption of normality in regression models and what would we expect to see (idk if this’ll be on the exam but just know it init)

Week 9

A

Using a normal P-P plot of regression standardised residual
* Ideally data points will lie in a reasonably straight diagonal line from bottom left to top right - this would suggest no major deviations from normality

186
Q

How do we check for the assumption of Homoscedasticity in regression models and what would we expect to see

A

Using the scatterplot of the regression standardised residual
* Ideally, residuals will be roughly rectangularly distributed, with most scores concentrated in the centre (around 0)

187
Q

What do the values of R, R^2 and adjusted R^2 each tell you about regression in the SPSS output

Week 9

A
  • R - strength of relationship between x and Y
  • R^2- proportion of variance explained by the model
  • Adjusted R^2 - R^2 adjusted to account for the degrees of freedom (number of participants and number of parameters being estimated)
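A common formulation of that adjustment, added for reference (not from the slides; N = sample size, k = number of predictors):

$$ R^2_{adj} = 1 - (1 - R^2)\,\frac{N - 1}{N - k - 1} $$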
188
Q

Why would we use adjusted R^2

Week 9

A

If we wanted to use the regression model to generalise the results of our sample to the population, R^2 is too optimistic

189
Q

What are the key values identifying when evaluating the regression model and what do they mean? (3 values)

Week 9

A
  • a - constant, also the intercept where the line intersects Y
  • b - gradient of slope
  • beta - slope converted to a standardised score
190
Q

If there is only one predictor variable, what does this mean for the beta coefficient?

Week 9

A

Beta coefficient and R are the same value

191
Q

Why would we use a T-test in a regression model

Week 9

A
  • t-value: equivalent to √F (when we only have 1 predictor variable)
  • i.e. it does the same job as the F-test when we have just one predictor variable
192
Q

What additional info do we have regarding the b value in regression models?

Week 9

A

The b value has 95% confidence intervals

193
Q

What else can R^2 be interpreted as

Week 9

A

the amount of variance in y explained by the model (SSM), relative to the total variance in y (SST)

194
Q

In what ways can we express R^2

Week 9

A

as a proportion or as a percentage

195
Q

What is the fundamental difference between correlation and regression

Week 9

A

Correlation shows what variance is shared; regression explains the variance by showing that a certain amount of it can be explained by the model

196
Q

What does multiple regression allow us to do

Week 9

A

to assess the influence of several predictor variables (e.g. x1, x2, x3 etc…) on the outcome variable (y)

197
Q

How does multiple regression work (basic description)/what do you need to do in order to conduct it?

Week 9

A

Need to combine both predictor variables to see the joint effect on the outcome variable

198
Q

Why do we have to use a plane of best fit when proposing a model in multiple regression

Week 9

A

Because you're looking at 3 things - the outcome variable & predictor variables one and two - so it will be the best model in 3 dimensions instead of two; that's why we look at a plane instead of a line

199
Q

What are some of the assumptions being made in multiple regression? (4)

A
  • Sufficient sample size
  • Linearity - Predictor variables should be linearly related to the outcome variable
  • Absence of outliers
  • Multicollinearity - *Ideally, predictor variables will be correlated with the outcome variable but not with one another
200
Q

What does a violation of the assumption of multicollinearity mean? What is a way to tell if it has been violated?

Week 9

A
  • There is some overlap in the variables you are measuring for (the predictor variables might be one thing in two different terms - e.g., confidence and self-esteem are basically the same)
  • Predictor variables which are highly correlated with one another (r = .9 and above) are measuring much the same thing
201
Q

if a multiple regression model is significant what does this mean

Week 9

A
  • The regression model provides a better fit (explains more variance) than the simplest model
    * I.e at least one of the slopes is not 0 (without specifying which)
202
Q

What does hierarchical regression involve and what does this allow us to see?

Week 10

A

Hierarchical regression involves entering predictor variables in a specified order of ‘steps’ based on theoretical grounds.

This allows us to see the relative contribution of each ‘step’ (set of predictor variables) in making the prediction stronger.

203
Q

Why do we use hierarchical regression

Week 10

A
  • Examine influence of predictor variable(s) on an outcome variable after ‘controlling for’ (i.e partialling out) the influence of other variables
204
Q

When doing a hierarchical regression what is the difference between step one and two.

A

Step 1: the variables you want to partial out (control for)
Step 2: the predictor variable you want to measure (e.g. optimism)

205
Q

When looking at hierarchical regression in SPSS, what are we looking at?

Week 11

A

The row labelled Model 2. Particularly the R square change, F Change and Sig F change values. (Check SPSS, this will make sense)

206
Q

What does the sig f change column tell us in Hierarchical regression?

Week 11

A

Whether this predictor variable alone explains a significant proportion of the variance of the outcome variable

207
Q

What type of non-parametric tests are there and what are their parametric equivalents

Week 11

A
  • Between P's - Independent t-test → Mann-Whitney U test
  • Within P's - Paired t-test → Wilcoxon test
  • Between P's - 1-way independent ANOVA → Kruskal-Wallis test
  • Within P's - 1-way repeated measures ANOVA → Friedman test
208
Q

What are the non parametric tests for factorial Designs

Week 11

A

Factorial designs do not have a non parametric equivalent and either need to have a simplified design or have adjustments made

209
Q

What is the non-parametric equivalent of Pearson's correlation coefficient when N > 20

Week 11

A

Spearman's rho

210
Q

What is the non-parametric equivalent of Pearson's correlation coefficient when N < 20?

week11

A

Kendall’s tau

211
Q

What types of nonparametric test exist for tests of relationships? (2)

week 11

A

Spearman's rho and Kendall's tau are both non-parametric equivalents of Pearson's correlation coefficient

212
Q

What is the non parametric equivalent of partial correlation

week 11

A

Partial correlation has no non-parametric equivalent

213
Q

what is the non parametric equivalent for regression

A

Regression has no non-parametric equivalent

214
Q

What types of test do we use when analysing categorical data

Week 11

A

Chi-square (one variable or test of independence)

215
Q

What type of test is a chi-square test

Week 11

A

non-parametric

216
Q

What are the parametric equivalents of the one-variable Chi-square (a.k.a. goodness of fit test) and
the Chi-square test of independence (two variables)

A

Neither of them has a parametric equivalent; they are non-parametric only

217
Q

What is an example of an Omnibus test?

Week 3

A

An ANOVA (because they control for familywise error rate)

218
Q

How do you calculate the number of comparisons for an IV with n levels

Week 3

A

n(n - 1)/2

e.g. n = 3:
3(3 - 1)/2 = 6/2 = 3
e.g. n = 6:
6(6 - 1)/2 = 30/2 = 15