Exam Revision Flashcards

(1120 cards)

2
Q

A study’s design imposes constraints on obtaining valuable information to test a hypothesis, such as - (2)

A

duration of the study
how many people you can recruit

3
Q

What is a sample?

A

A sample is the specific group that you will collect data from.

4
Q

What is a population?

A

A population is the entire group that you want to draw conclusions about.

5
Q

Example of population vs sample (2)

A

Population : Advertisements for IT jobs in the UK
Sample: The top 50 search results for advertisements for IT jobs in the UK on 1 May 2020

6
Q

What is inferential statistics?

A

Inferential statistics allow you to test a hypothesis or assess whether your data is generalisable to the broader population.

7
Q

Why are parametric tests favoured over other tests in research? - (3)

A
  • they are more rigorous, powerful and sensitive than non-parametric tests to answer your question
  • This means that they have a higher chance of detecting a true effect or difference if it exists.
  • They also allow you to make generalizations and predictions about the population based on the sample data.
8
Q

We can obtain multiple outcomes from the

A

same people

9
Q

We can obtain outcomes under

A

different conditions, groups or both

10
Q

What are the 4 types of outcomes we measure? (4)

A
  1. Ratio
  2. Interval
  3. Ordinal
  4. Nominal
11
Q

What is a continuous variable? - (2)

A

There is an infinite number of possible values these variables can take on

entities get a distinct score

12
Q

2 examples of continuous variables (2)

A
  • Interval
  • Ratio
13
Q

What is an interval variable?

A

Equal intervals on the variable represent equal differences in the property being measured

14
Q

Examples of interval variables - (2)

A

e.g. the difference between 600ms and 800ms is equivalent to the difference between 1300ms and 1500ms. (reaction time)

temperature (Fahrenheit), temperature (Celsius), pH, SAT score (200-800), credit score (300-850)

15
Q

What is a ratio variable?

A

The same as an interval variable and also has a clear definition of 0.0.

16
Q

Examples of ratio variable - (3)

A

E.g. Participant height or weight
(can have 0 height or weight)

temp in Kelvin (0.0 Kelvin really does mean “no heat”)

dose amount, reaction rate, flow rate, concentration,

17
Q

What is a categorical variable? (2)

A

A variable that cannot take on all values within the limits of the variable

    • entities are divided into distinct categories
18
Q

What are 2 examples of categorical variables? (2)

A
  • Nominal
  • Ordinal
19
Q

What is a nominal variable? - (2)

A

a variable with categories that do not have a natural order or ranking

Has two or more categories

20
Q

Examples of nominal variable - (2)

A

genotype, blood type, zip code, gender, race, eye color, political party

e.g. whether someone is an omnivore, vegetarian, vegan, or fruitarian.

21
Q

What are ordinal variables?

A

categories have a logical, incremental order

22
Q

Examples of ordinal variables - (3)

A

e.g. whether people got a fail, a pass, a merit or a distinction in their exam

socio economic status (“low income”,”middle income”,”high income”),

satisfaction rating [Likert Scale] (“extremely dislike”, “dislike”, “neutral”, “like”, “extremely like”).

23
Q

We use the term ‘variables’ for both continuous and categorical measures because - (2)

A

both outcome and predictor are variables

We will see later on that not only the type of outcome but also type of predictor influences our choice of stats test

24
Q

A Likert scale is an ordinal variable, but outcomes measured on a Likert scale are sometimes treated as - (3)

A

continuous after inspection of the distribution of the data, arguing that the divisions on the scale are equal

(i.e., treated as interval if distribution is normal)

gives greater sensitivity in parametric tests

25
What is measurement error?
The discrepancy between the actual value we’re trying to measure, and the number we use to represent that value.
26
In reducing measurement error in outcomes, the
values have to have the same meaning over time and across situations
27
Validity means that the (2)
instrument measures what it set out to measure; refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure)
28
Reliability means the
ability of the measure to produce the same results under the same conditions
29
Test-retest reliability is the ability of a
measure to produce consistent results when the same entities are tested at two different points in time
30
3 types of variation (3)
* Systematic variation * Unsystematic variation * Randomisation
31
What is systematic variation - (2)
Differences in performance created by a specific experimental manipulation. This is what we want
32
What is Unsystematic variation (3)
Differences in performance created by unknown factors: age, gender, IQ, time of day, measurement error, etc. These differences can of course be controlled (e.g., inclusion/exclusion criteria setting an age range of 18-25)
33
Randomisation (other approaches) minimises - (2)
the effects of unsystematic variation; it does not remove unsystematic variation
34
What is the independent variable (Factors)? ( 3)
* The hypothesised cause * A predictor variable * A manipulated variable (in experiments)
35
What is the dependent variable (measures)? - (3)
* The proposed effect, change in DV * An outcome variable * Measured not manipulated (in experiments)
36
In all experiments we have two hypotheses which is (2)
* Null hypothesis * Alternative hypothesis
37
What is null hypothesis?
that there is no effect of the predictor variable on the outcome variable
38
What is alternative hypothesis?
that there is an effect of the predictor variable on the outcome variable
39
Null Hypothesis Significance Testing computes the (2)
probability of obtaining a statistic at least as extreme as the one observed by chance alone if the null hypothesis were true; this probability is referred to as the p-value
40
The NHST does not compute the probability of the
null hypothesis
41
There can be directional and non-directional hypothesis of
an alternate hypothesis
42
A non-directional alternative hypothesis is..
The alternative hypothesis is that there is an effect of the group on the outcome variable (in either direction)
43
Directional alternate hypothesis is...
The alternative hypothesis is that the mean of the outcome variable for group 1 is larger than the mean for group 2
44
Example of directional alternate hypothesis
There would be far greater engagement in stats lectures if they were held at 4 PM and not 9 AM
45
For a non-directional hypothesis you will need to split your alpha value between
the two tails of the normal distribution
46
The 3 misconceptions of Null Hypothesis Significance Testing (NHST) - (3)
1. A significant result means the effect is important 2. A non-significant result means the null hypothesis is true 3. A significant result means the null hypothesis is false (NHST just gives the probability of the data given the null hypothesis; it does not provide evidence that the null hypothesis is categorically false)
47
P-hacking and HARKing are another issue with
NHST
48
P-hacking and HARKing are - (2)
researcher degrees of freedom that change after the results are in and some analysis has been done
49
P-hacking refers to a
selective reporting of significant results
50
Harking is
Hypothesising After the Results are Known
51
P-hacking and HARKING are often used in
combination
52
What does EMBERS stand for? (5)
1. Effect Sizes 2. Meta-analysis 3. Bayesian Estimation 4. Registration 5. Sense
53
EMBERS can reduce issues of
NHST
54
Uses of Effect sizes and Types of Effect Size (3)
* There are quite a few measures of effect size * Get used to using them and understanding how studies can be compared on the basis of effect size * A brief example: Cohen’s d
55
Meaning of Effect Size (2)
Effect size is a quantitative measure of the magnitude of the experimental effect. The larger the effect size the stronger the relationship between two variables.
56
Formula of Cohen's d
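The answer here appears to have been an image; as a sketch, the standard two-group form of Cohen’s d is:

```latex
d = \frac{\bar{x}_1 - \bar{x}_2}{s_{\text{pooled}}},
\qquad
s_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}
```

By common convention, d ≈ 0.2 is a small effect, 0.5 medium and 0.8 large.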
57
What is meta-analysis?
Meta-analysis is a study design used to systematically assess previous research studies to derive conclusions about that body of research
58
Meta-analysis brings together.. and assesses (2)
* Bringing together multiple studies to get a more realistic idea of the effect * Can assess effect sizes that are averaged across studies
59
Funnel plots in meta-analysis can be made to... and value studies by... (2)
investigate publication bias and other biases in meta-analysis; they value studies by their sample size so bias can be observed
60
Bayesian approaches capture
probabilities of the data given the hypothesis and null hypothesis
61
Bayes factor is now often computed and stated alongside
conventional NHST analysis (and effect sizes)
62
Registration is where (5)
* Telling people what you are doing before you do it * Tell people how you intend to analyze the data * Largely limits researcher degrees of freedom (HARKING p-hacking) * A peer reviewed registered study can be published whatever the outcome * The scientific record is therefore less biased to positive findings
63
Sense is where (4)
* Knowing what you have done in the context of NHST * Knowing misconceptions of NHST * Understanding the outcomes * Adopting measures to reduce researcher degrees of freedom (like preregistration etc..)
64
most of the statistical tests in this book rely on having data measured
at interval level
65
To say that data are interval, we must be certain that equal intervals on the scale represent
equal differences in the property being measured.
66
The distinction between continous and discrete variables can often be blurred - 2 examples- (2)
Continuous variables can be measured in discrete terms: when we measure age we rarely use nanoseconds but use years (or possibly years and months), and in doing so we turn a continuous variable into a discrete one. We also treat discrete variables as if they were continuous: e.g., the number of boyfriends/girlfriends that you have had is a discrete variable, yet you might read a magazine that says ‘the average number of boyfriends that women in their 20s have has increased from 4.6 to 8.9’
67
a device for measuring sperm motility that actually measures sperm count is not
valid
68
Criterion validity is whether the
instrument is measuring what it claims to measure (does your lecturers’ helpfulness rating scale actually measure lecturers’ helpfulness?).
69
The two sources of variation that is always present in independent and repeated measures design is
unsystematic variation and systematic variation
70
effect of our experimental manipulation is likely to be more apparent in a repeated-measures design than in a
between-group design,
71
the effect of experimental manipulation is more apparent in a repeated-measures design than an independent design since, in an independent design,
differences between the characteristics of the people allocated to each of the groups is likely to create considerable random variation both within each condition and between them
72
This means that, other things being equal, repeated-measures designs have more power to
detect effects than independent designs
73
We can use randomization in two different ways depending on whether we have an
an independent or a repeated-measures design
74
Two sources of systematic variation in repeated design measure - (2)
* Practice effects * Boredom effects
75
What is practice effects?
Participants may perform differently in the second condition because of familiarity with the experimental situation and/or the measures being used.
76
What is boredom effects?
Participants may perform differently in the second condition because they are tired or bored from having completed the first condition.
77
We can ensure no systematic variation between conditions in repeated measure is produced by practice and boredom effects by
counterbalancing the order in which a person participates in a condition
78
Example of counterbalancing
we randomly determine whether a participant completes condition 1 before condition 2, or condition 2 before condition 1
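This randomisation can be sketched in Python (the function name and condition labels are illustrative, not from the source):

```python
import random

def assign_orders(participants, seed=None):
    """Randomly counterbalance: each participant completes the two
    conditions in the order 1 -> 2 or 2 -> 1, decided at random."""
    rng = random.Random(seed)
    orders = [("condition 1", "condition 2"), ("condition 2", "condition 1")]
    return {p: rng.choice(orders) for p in participants}
```

With enough participants, roughly half receive each order, so practice and boredom effects are spread evenly across conditions rather than systematically favouring one.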
79
What distribution is needed for parametric tests?
A normal distribution
80
The normal distribution curve is also referred as the
bell curve
81
Normal distribution is symmetrical meaning
This means that the distribution curve can be divided in the middle to produce two equal halves
82
The bell curve can be described using two parameters called (2)
1. Mean (central tendency) 2. Standard deviation (dispersion)
83
μ is
mean
84
σ is
standard deviation
85
Diagram shows:
e.g., if we move 1σ to the right then it contains 34.1% of the values
86
Many statistical tests (parametric) cannot be used if the data are not
normally distributed
87
The mean is the sum of
scores divided by the number of scores
88
Mean is a good measure of
central tendency for roughly symmetric distributions
89
The mean can be a misleading measure of central tendency in skewed distributions as
it can be greatly influenced by scores in tail e.g., extreme values
90
Aside from the mean, what are the 2 other measured of central tendency? - (2)
1. Median 2. Mode
91
The median is where (2)
the middle score when scores are ordered. the middle of a distribution: half the scores are above the median and half are below the median.
92
The median is relatively unaffected by ... and can be used with... (2)
* extreme scores or skewed distribution * can be used with ordinal, interval and ratio data.
93
The mode is the most
frequently occurring score in a distribution, a score that actually occurred
94
The mode is the only measure of central tendency that can be used with
nominal data
95
The mode is greatly subject to
sample fluctuations and is therefore not recommended to be used as the only measure of central tendency
96
Many distributions have more than one
mode
97
The mean, median and mode are identical in
symmetric distributions
98
For positive skewed distribution, the
mean is greater than the median, which is greater than the mode
99
For negative skewed distribution
usually the mode is greater than the median, which is greater than the mean
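These relationships can be checked with Python’s statistics module on an illustrative positively skewed sample (the data values are made up for demonstration):

```python
from statistics import mean, median, mode

# A small positively skewed sample: a long tail of high values
scores = [1, 2, 2, 2, 3, 3, 4, 5, 10]

print(mode(scores))             # 2
print(median(scores))           # 3
print(round(mean(scores), 2))   # 3.56 -> mean > median > mode
```

The single extreme value (10) pulls the mean above the median, while the mode stays at the most common score, matching the positive-skew ordering described above.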
100
Kurtosis in greek means
a bulge or bend
101
What is central tendency?
the tendency for the values of a random variable to cluster round its mean, mode, or median.
102
Diagram of normal kurtosis, positive excess kurtosis (leptokurtic) and negative excess kurtosis (platykurtic)
103
What does lepto mean?
prefix meaning thin
104
What is platy
a prefix meaning flat or wide (think Plateau)
105
Tests of normality (2)
Kolmogorov-Smirnov test; Shapiro-Wilk test
106
Tests of normality are dependent on
sample size
107
If you have a massive sample size then you will find these normality tests often come out as .... even when your data visually can look .... - (2)
significant; normally distributed
108
If you got a small sample size, then the normality tests may look non-siginificant, even when data is normally distributed, due to
lack of power in the test to detect a significant effect
109
There is no hard-and-fast rule for
determining whether data is normally distributed or not
110
Plot your data because this helps inform on what decisions you want to make with respect to
normality
111
Even if the normality test is significant but the data look visually normally distributed, then still do
parametric tests
112
A frequency distribution or a histogram is a plot of how many times
each score occurs
113
2 main ways a distribution can deviate from the normal - (2)
1. Lack of symmetry (called skew) 2. Pointyness (called kurtosis)
114
In a normal distribution the values of skew and kurtosis are 0 meaning...
tails of the distribution are as they should be
115
Is age nominal or continuous?
Continuous
116
Is gender continuous or nominal?
Nominal
117
Is height continuous or nominal?
Continuous
118
Which of the following best describes a confounding variable? A. A variable that affects the outcome being measured as well as, or instead of, the independent variable B. A variable that is manipulated by the experimenter C. A variable that has been measured using an unreliable scale D. A variable that is made up only of categories
A
119
If a test is valid, what does it mean? A. The test measures what it claims to measure. B. The test will give consistent results (reliability) C. The test has internal consistency (measures correlations between different items on the same test to see if it measures the same construct) D. The test measures a useful construct or variable = a test can measure something useful but not be valid
A
120
A variable that measures the effect that manipulating another variable has is known as: A. DV B. A confounding variable C. Predictor variable D. IV
A
121
The discrepancy between the numbers used to represent something that we are trying to measure and the actual value of what we are measuring is called: A. Measurement error B. Reliability C. The 'fit' of the model D. Variance
A
122
A frequency distribution in which low scores are most frequent (i.e. bars on the graph are highest on the left-hand side) is said to be: A. Positively skewed B. Leptokurtic = distribution with positive kurtosis C. Platykurtic = negative kurtosis D. Negatively skewed = frequent high scores
A
123
Which of the following is designed to compensate for practice effects? A. Counterbalancing B. Repeated measures design = practice effects are an issue in repeated measures C. Giving participants a break between tasks = this compensates for boredom effects D. A control condition = provides a reference point
A
124
Variation due to variables that have not been measured is A. Unsystematic variation B. Homogeneous variance = assumption that variance in each population is equal C. Systematic variation = due to experimental manipulation D. Residual variance = confirms how well the constructed regression line fits the actual data
A
125
Purpose of control condition is to A. Allow inferences about cause B. Control for participants' characteristics = randomisation C. Show up relationship between predictor variables D. Rule out tertium quid
A Allow inferences of cause
126
If the scores on a test have a mean of 26 and a standard deviation of 4, what is the z-score for a score of 18? A. -2 B. 11 C. 2 D. -1.41
A (18-26) = -8/4 = -2
127
The standard deviation is the square root of the A. Variance B. Coefficient of determination = r squared C. Sum of squares = sum of squared deviances D. Range = largest − smallest
A
128
Complete the following sentence: A large standard deviation (relative to the value of the mean itself) A. Indicates data points are distant from the mean (i.e., poor fit of data) B. Indicates the data points are close to the mean C. Indicates that the mean is a good fit of the data D. Indicates that you should analyse data with parametric tests
A
129
The probability is p = 0.80 that a patient with a certain disease will be successfully treated with a new medical treatment. Suppose that the treatment is used on 40 patients. What is the "expected value" of the number of patients who are successfully treated? A. 32 B. 20 C. 8 D. 40
A = 80% of 40 is 32 (0.80 * 40)
130
Imagine a test for a certain disease. Suppose the probability of a positive test result is .95 if someone has the disease, but the probability is only .08 that someone has the disease if his or her test result was positive. A patient receives a positive test, and the doctor tells him that he is very likely to have the disease. The doctor's response is: A. Confusion of the inverse B. Law of small numbers C. Gambler's fallacy D. Correct, because test is 95% accurate when someone has the disease = incorrect as the doctor based the assumption on the incorrect inverse probability
A
131
Which of these variables would be considered not to have met the assumptions of parametric tests based on the normal distribution? (Hint: many statistical tests rely on data measured at the interval level) A. Gender B. Reaction time C. Temperature D. Heart rate
A
132
The test statistics we use to assess a linear model are usually _______ based on the normal distribution (Hint: these tests are used when all of the assumptions of a normal distribution have been met) A. Parametric B. Non-parametric C. Robust D. Not
A
133
Which of the following is not an assumption of the general linear model? A. Dependence B. Additivity C. Linearity D. Normally distributed residuals
A = independence is an assumption of parametric tests, not dependence
134
Looking at the table below, which of the following statements is the most accurate? Hint: The further the values of skewness and kurtosis are from zero, the more likely it is that the data are not normally distributed A. For the number of hours spent practising, there is not an issue of kurtosis B. For level of musical skill, data are heavily negatively skewed C. For number of hours spent practising there is an issue of kurtosis D. For the number of hours spent practising, the data are fairly positively skewed
A - correct. B. Incorrect, as the value of skewness is –0.079, which suggests that the data are only very slightly negatively skewed because the value is close to zero. C. Incorrect, as the value of kurtosis is 0.098, which is fairly close to zero, suggesting that kurtosis was not a problem for these data. D. Incorrect, as the value of skewness for the number of hours spent practising is –0.322, suggesting that the data are only slightly negatively skewed
135
Diagram of skewness
136
In SPSS output, if the value of skewness is between -1 and 1 then
all good
137
In SPSS output, if the value of skewness is below -1 or above 1 then
data is skewed
138
In SPSS output, if the value of skewness is below -1 then
negatively skewed
139
In SPSS output, if the value of skewness is above 1 then
positively skewed
140
Diagram of leptokurtic, platykurtic and mesokurtic (normal) distributions
141
What does kurtosis tell you?
how much our data lies around the ends/tails of our histogram which helps us to identify when outliers may be present in the data.
142
A distribution with positive kurtosis, so much of the data is in the tails, will be
pointy or leptokurtic
143
A distribution with negative kurtosis, so the data lies more in the middle, will be more
be more sloped or platykurtic
144
Kurtosis is the sharpness of the
peak of a frequency-distribution curve
145
If our Kurtosis value is 0, then the result is a
normal distribution
146
If the kurtosis value in SPSS is between -2 and 2 then
all good! = normal distribution
147
If the kurtosis value in SPSS is less than -2 then
platykurtic
148
If the kurtosis value in SPSS is greater than 2 then
leptokurtic
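The rules of thumb in the cards above can be sketched as small helper functions (the function names are illustrative, not part of SPSS):

```python
def check_skewness(skew):
    """Rule of thumb from the cards: skewness between -1 and 1 is fine."""
    if skew > 1:
        return "positively skewed"
    if skew < -1:
        return "negatively skewed"
    return "all good"

def check_kurtosis(kurt):
    """Rule of thumb from the cards: kurtosis between -2 and 2 is fine."""
    if kurt > 2:
        return "leptokurtic"
    if kurt < -2:
        return "platykurtic"
    return "all good"

print(check_skewness(0.5))    # all good
print(check_kurtosis(2.68))   # leptokurtic
```

These are screening heuristics, not substitutes for plotting the data.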
149
Are we good for skewness and kurtosis in this SPSS output?
Good, because the skewness is between -1 and 1 and the kurtosis values are between -2 and 2.
150
Are we good for skewness and kurtosis in this SPSS output?
Bad, because although the skewness is between -1 and 1, we have a problem with kurtosis: a value of 2.68, which is outside the range -2 to 2
151
Correlational research doesn’t allow us to rule out the presence of a
third variable = confounding variable. E.g., if we find that drownings and ice cream sales are correlated and conclude that ice cream sales cause drowning, are we correct? Maybe both are due to the weather
152
The tertium quid is a variable that you may not have considered that could be
influencing your results, e.g., the weather in the ice cream and drowning example
153
How to rule out tertium quid? - (2)
Use of RCTs. Randomized Controlled Trials allow us to even out the confounding variables between the groups
154
Correlation does not mean
causation
155
To infer causation,
we need to actively manipulate the variable we are interested in, and control against a group (condition) where this variable was not manipulated.
156
Correlation does not mean causation as according to Andy
causality between two variables cannot be assumed because there may be other measured or unmeasured variables affecting the results
157
Aside from checking of kurotsis and skewness assumptions in data also check if it has
linearity or less commonly additivity
158
Additivity refers to the combined
effect of many predictors
159
What does this diagram show in terms of additivity/linearity? - (5)
There is a linear effect when the data increase at a steady rate, like the graph on the left: your cost increases steadily as the number of chocolate bars increases. The graph on the right shows a non-linear effect, where there is not this steady increase but rather a sharp change in your data. So you might feel OK if you eat a few chocolate bars, but after that the risk of a stomach ache increases quite rapidly the more chocolate you eat. This effect is super important to check, or your statistical analysis will be wrong even if your other assumptions are correct, because a lot of statistical tests are based on linear models.
160
Discrepancy between a measurement and the actual value in the population is... and not...
measurement error and NOT variance
161
Measurement error can happen across all psychological experiments from.. to ..
recording instrument failure to human error
162
What are the 2 types of measurement errors? - (2)
1. Systematic 2. Random
163
What is systematic measurement error?
Predictable, typically constant or proportional to the true value, and always affects the results of an experiment in a predictable direction
164
Example of systematic measurement error
for example, if I know I am 5ft2 and when I go to get measured I’m told I’m 6ft, this is a systematic error and pretty identifiable - these usually happen when there is a problem with your experiment
165
What is random measurement error?
measurable values being inconsistent when repeated measures of a constant attribute or quantity are taken.
166
Example of random measurement error
for example my height is 5ft2 when I measure it in the morning but its 5ft when I measure myself in the evening. This is because my measurements were taken at different times so there would be some variability – for those of you who believe you shrink throughout the day.
167
What is variance?
Average squared deviation of each number from its mean.
168
Variability is an inherent part of
things being measured and of the measurement process
169
Diagram of variance formula
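As a sketch of the formula behind the diagram, the (population) variance is the average squared deviation from the mean, which can be computed directly (the data values are made up for demonstration):

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)                                # 5.0
variance = sum((x - mean) ** 2 for x in data) / len(data)   # 4.0
print(variance)
```

This is the population variance (dividing by n); a sample estimate would divide by n − 1 instead. The standard deviation is its square root (here 2.0), as card 127 above notes.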
170
In central limit theorem - (2)
states that the sampling distribution of the mean approaches a normal distribution, as the sample size increases. This fact holds especially true for sample sizes over 30. Therefore, as a sample size increases, the sample mean and standard deviation will be closer in value to the population mean μ and standard deviation σ .
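A quick simulation illustrates the theorem (made-up uniform population, sample size 30; all names are illustrative): the means of many samples cluster tightly around the population mean.

```python
import random

rng = random.Random(0)  # fixed seed so the run is reproducible
population = [rng.uniform(0, 10) for _ in range(10_000)]
pop_mean = sum(population) / len(population)

# Draw 1,000 samples of size 30 and record each sample mean
sample_means = [sum(rng.sample(population, 30)) / 30 for _ in range(1_000)]
mean_of_means = sum(sample_means) / len(sample_means)

# mean_of_means lands very close to pop_mean, and a histogram of
# sample_means would look approximately normal even though the
# population itself is uniform.
```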
171
What does histogram look at? - (2)
Frequency of scores; look at distribution of data, skewness, kurtosis
172
What does boxplot look at? - (2)
To identify outliers; shows median rather than mean (good for non-normally distributed data)
173
What do line graphs are?
simply bar charts with lines instead of bars
174
Bar charts are a good way to display
display means (and standard errors)
175
What do scatterplot illustrates? - (2)
a relationship between two variables, e.g. correlation or regression. Only use regression lines for regressions!
176
What are matrix scatterplots? - (2)
A particular kind of scatterplot that can be used instead of the 3-D scatterplot; clearer to read
177
Using data provided how would you summarise skew? A. The data has an issue with positive skew B.The data has an issue with negative skew C.The data is normally distributed
B
178
What is the median number of bullets shot at a partner by females?
67.00
179
What descriptive statistics does the red arrow represents? A. Inter quartile range B. Median C. Mean D. Range
A
180
What is the mean of males and females SD? - (2)
Males M = 27.29 Females SD = 12.20
181
What is the respective standard error of the mean for females and males?
3.26 & 3.42
182
Answering the question ‘Meets assumptions of parametric tests?’ will determine whether our continuous data can be tested
with parametric or non-parametric tests
183
A normal distribution is a distribution with the same general shape which is a
bell shape
184
A normal distribution curve is symmetric around
the mean μ
185
A normal distribution is defined by two parameters - (2)
the mean (μ) and the standard deviation (σ).
186
Many statistical tests (parametric) cannot be used if the data is not
normally distributed
187
What does this diagram show? - (2)
μ = 0 is the peak of the distribution. The blocked areas under the curve give us insight into the way data are distributed and the probability of certain scores occurring if they belong to a normal distribution, e.g., 34.1% of values lie within one SD below the mean
188
A z score in standard normal distribution will reflect the number of
SDs a particular score lies above or below the mean
189
How to calculate a z score?
Take a participant’s value (e.g., 56 years old), subtract the mean of the distribution (e.g., mean class age of 23), and divide by the SD (e.g., class SD of 2)
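The calculation can be sketched in Python, using the illustrative numbers from the card:

```python
def z_score(x, mean, sd):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

print(z_score(56, 23, 2))  # 16.5 -> this participant is 16.5 SDs above the class mean
print(z_score(18, 26, 4))  # -2.0 -> matches the earlier quiz card (score 18, mean 26, SD 4)
```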
190
If a person scored a 70 on a test with a mean of 50 and a standard deviation of 10 Converting the test scores to z scores, an X of 70 would be... What the result means.... - (2)
a z score of 2 means the original score was 2 standard deviations above the mean
191
We can convert our z scores to
percentiles
192
Example: What is the percentile rank of a person receiving a score of 90 on the test? - (3) Mean = 80, SD = 5
First calculate the z score: z = (90 - 80)/5 = 2. The graph shows that most people scored below 90, since 90 is 2 standard deviations above the mean. The z score can be converted to a percentile using a table: a z score of 2 is equivalent to the 97.7th percentile. The proportion of people scoring below 90 is thus .977, and the proportion scoring above 90 is 2.3% (1 - 0.977)
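The table lookup can be reproduced with Python’s standard library; this sketch uses `statistics.NormalDist` (the function name is illustrative):

```python
from statistics import NormalDist

def percentile_rank(score, mean, sd):
    """Percentage of a normal population scoring below `score`."""
    z = (score - mean) / sd
    return NormalDist().cdf(z) * 100

print(round(percentile_rank(90, 80, 5), 1))  # 97.7 (z = 2)
```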
193
What is the sample mean?
an unbiased estimate of the population mean.
194
How can we know how that our sample mean estimate is representative of the population mean?
Via computing the standard error of the mean (SEM) - the smaller the SEM, the better
195
Standard deviation is used as a measure of how
representative the mean was of the observed data.
196
Small standard deviations represented a scenario in which most data points were
most data points were close to the mean
197
Large standard deviation represented a situation in which data points were
widely spread from the mean.
198
How to calculate the standard error of mean?
computed by dividing the standard deviation of the sample by the square root of the number in the sample
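That formula as a one-liner (the numbers are illustrative):

```python
import math

def sem(sd, n):
    """Standard error of the mean: sample SD over the square root of n."""
    return sd / math.sqrt(n)

print(sem(10, 25))   # 2.0
print(sem(10, 100))  # 1.0 -> a larger sample gives a smaller SEM
```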
199
The larger the sample the smaller the - (2)
standard error of the mean more confident we can be that the sample mean is representative of the population.
200
The central limit theorem proposes that
as samples get large (usually defined as greater than 30), the sampling distribution of the mean has a normal distribution with a mean equal to the population mean and SD equal to the SEM
201
The standard deviation of sample means is known as the
SEM (standard error of the mean)
202
A different approach to assess accuracy of sample mean as estimate of - population mean, aside from SE, is to - (2)
calculate boundaries within which we believe the true value of the population mean will fall. Such boundaries are called confidence intervals.
203
Confidence intervals are created by
samples
204
A 95% confidence interval is constructed such that

95% of these intervals (created from repeated samples) will contain the population mean
205
95% Confidence interval for 100 samples (CI constructed for each) would mean
95 of these samples, the confidence intervals we constructed would contain the true value of the mean in the population.
206
Diagram shows- (4)
* Dots show the means for each sample
* Lines sticking out represent the CI for each sample mean
* A vertical line drawn down would represent the population mean
* If confidence intervals don't overlap, this shows a significant difference between the sample means
207
In fact, for a specific confidence interval, the probability that it contains the population value is either - (2)
0 (it does not contain it) or 1 (it does contain it). You have no way of knowing which it is.
208
if our sample means were normally distributed with a mean of 0 and a standard error of 1, then the limits of our confidence interval
would be -1.96 and +1.96
209
95% of z scores fall between
-1.96 and 1.96
210
Confidence intervals can be constructed for any estimated parameter, not just
μ - mean
211
. If the mean represents the true mean well, then the confidence interval of that mean should be
small
212
if the confidence interval is very wide then the sample mean could be
very different from the true mean, indicating that it is a bad representation of the population
213
Remember that the standard error of the mean gets smaller with the number of observations and thus our confidence interval also gets
smaller - this makes sense: the more we measure, the more certain we are that the sample mean is close to the population mean
214
Calculating Confidence Intervals for sample means - rearranging in z formula
LB = Mean - (1.96 * SEM) UB = Mean + (1.96 * SEM)
215
The standard deviation of SAT verbal scores in a school system is known to be 100. A researcher wishes to estimate the mean SAT score and compute a 95% confidence interval from a random sample of 10 scores. The 10 scores are: 320, 380, 400, 420, 500, 520, 600, 660, 720, and 780. Calculate CI
* M = 530, N = 10, SEM = 100/√10 = 31.62
* The value of z for a 95% CI is the number of SDs one must go from the mean (in both directions) to contain 0.95 of the scores
* The value 1.96 is found in a z-table: since each tail is to contain 0.025 of the scores, you find the value of z below which 1 - 0.025 = 0.975 of the scores fall
* 95% of z-scores lie between -1.96 and +1.96
* Lower limit = 530 - (1.96)(31.62) = 468.02
* Upper limit = 530 + (1.96)(31.62) = 591.98
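The SAT example above can be checked in Python (a sketch of the same arithmetic):

```python
import math

scores = [320, 380, 400, 420, 500, 520, 600, 660, 720, 780]
mean = sum(scores) / len(scores)    # 530.0
sem = 100 / math.sqrt(len(scores))  # known population SD / sqrt(N) = 31.62
lower = mean - 1.96 * sem           # lower bound of the 95% CI
upper = mean + 1.96 * sem           # upper bound of the 95% CI
print(round(lower, 2), round(upper, 2))  # 468.02 591.98
```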
216
Think of test statistic capturing
signal/noise
217
A test statistic is a statistic for which the frequency of particular values is known (t, F, chi-square), and thus we can calculate the
probability of obtaining a certain value or p value.
218
To test whether the model fits the data or whether our hypothesis is a good explanation of the data, we compare
systematic variation against unsystematic
219
If the probability (p-value) is less than or equal to the significance level, then
the null hypothesis is rejected; When the null hypothesis is rejected, the outcome is said to be “statistically significant”
220
If the probability (p-value) is greater than the significance level, the
null hypothesis is not rejected.
221
What is a type 1 error in terms of variance? - (2)
We think the variance accounted for by the model is larger than the variance unaccounted for by the model (i.e., there appears to be a statistically significant effect, but in reality there isn't)
222
Type 1 is a false
positive
223
What is a Type II error in terms of variance?
think there was too much variance unaccounted for by the model (i.e. there is no statistically significant effect but in reality there is)
224
Type II error is false
negative
225
Example of Type I and Type II error
226
Type I and Type II errors are mistakes we can make when testing the
fit of the model
227
Type I errors occur when we believe there is a genuine effect in the
population, when in fact there isn’t.
228
Acceptable level of type I error is usually
the α-level (usually 0.05)
229
Type II error occurs when we believe there is no effect in the
population when, in reality, there is.
230
Acceptable level of Type II error is the
β-level (often 0.2)
231
An effect size is a standardised measure of
the size of an effect
232
Properities of effect size (3)
* Standardized = comparable across studies
* Not (as) reliant on the sample size
* Allows people to objectively evaluate the size of an observed effect
233
# Effect Size Measures r = 0.1, d = 0.2 (small effect):
the effect explains 1% of the total variance.
234
# Effect size measures r = 0.3, d = 0.5 (medium effect) means
the effect accounts for 9% of the total variance.
235
# Effect size measures r = 0.5, d = 0.8 (large effect)
effect accounts for 25% of the variance
236
Beware of the 'canned' effect sizes (e.g., r = 0.5, d = 0.8 and rest) since the size of
effect should be placed within the research context.
237
We should aim to achieve a power of
.8, or an 80% chance of detecting an effect if one genuinely exists.
238
When we fail to reject the null hypothesis, it is either that there truly are no difference to be found, OR
it may be because we do not have enough statistical power
239
Power is the probability of
correctly rejecting a false H0, OR the ability of the test to find an effect, assuming there is one in the population
240
Power is calculated by
1 - β, where β is the probability of making a Type II error
241
To increase statistical power of study you can increase
your sample size
242
Factors affecting the power of the test: (4):
1. Probability of a Type I error, the α-level (the level at which we decide an effect is significant): a bigger (more lenient) alpha means more power
2. The true alternative hypothesis H1 (effect size): less overlap between the H0 and H1 distributions means more power - if you find a large effect in the literature, you have a better chance of detecting something
3. The sample size (N): the bigger the sample, the less the noise and the more power
4. The particular tests to be employed: parametric tests have greater power to detect a significant effect since they are more sensitive
243
How to calculate the number of pps they need for reasonable chance of correctly rejecting null hypothesis?
Sample size calculation at a desired level of power (usually power set to 0.8 in formula)
244
With power, we can do 2 things - (2)
* Calculate the power of a test
* Calculate the sample size necessary to detect a decent effect size and achieve a certain level of power, based on past research
245
Diagram of Type I error, Type II error, power - (4) and making correct decisions
* Type I error: p = α
* Type II error: p = β
* Correctly accepting the null hypothesis: p = 1 - α
* Correctly accepting the alternative hypothesis: p = 1 - β (power)
246
If there is less overlap between the H0 and H1 distributions, then

the bigger difference means higher power and a greater chance of correctly rejecting the null hypothesis than with distributions that overlap more
247
If distribution between h0 and h1 are narrower then
This means that the overlap in distributions is smaller and the power is therefore greater, but this time because of a smaller standard error of our estimate of the means.
248
Most people want to assess how many participants they need to test to have a reasonable chance of correctly rejecting the null hypothesis (the Power). This formula shows - (2)
us how. We usually set the power to 0.8.
249
What is z scores? - (2)
A measure of variability: the number of standard deviations a particular data point is from the population mean. Z-scores are a standardised measure, hence they ignore measurement units
250
Why should we care about z scores? - (2)
Z-scores allow researchers to calculate the probability of a score occurring within a standard normal distribution Enables us to compare two scores that are from different samples (which may have different means and standard deviations)
251
Diagram of finding percentile of Trish Trish takes a test and gets 25 Mean of the class is 20 SD = 4 (25 - 20)/4 = 1.25 Z-score = 1.25

Let's say Trish takes a test and scores 25, and the class mean is 20. You calculate the z-score to be 1.25. You would then use a z-score table to see what percentile she is in (marked in red): go down to the value 1.2 and across to 0.05, which totals 1.25, and you can see that about 89.4% of other students performed worse.
252
Diagram of z score and percentile Josh takes a different test and gets 1150 Mean of the class is 1000 SD = 150 (1150 - 1000)/150 = 1.0 Z score = 1.0 Who performed better, Trish or Josh? Trish had a z-score of 1.25

We would use our table, look down the column to a z-score of 1 and across to the 0.00 column (in purple), and see that 84.1% of students performed worse than Josh, so Trish performed better than Josh.
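The Trish/Josh comparison, sketched with the standard library in place of a z-table:

```python
from statistics import NormalDist

trish = NormalDist().cdf((25 - 20) / 4)       # z = 1.25
josh = NormalDist().cdf((1150 - 1000) / 150)  # z = 1.00
print(round(trish * 100, 1))  # 89.4% of Trish's class scored lower
print(round(josh * 100, 1))   # 84.1% of Josh's class scored lower
```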
253
Diagram of z scores and normal distribution - (3)
68% of scores are within 1 SD of the mean, 95% are within 2 SDs and 99.7% are within 3 SDs.
254
Whats standard error?
By taking into account the variability and size of our sample, we can estimate how far our sample mean is likely to be from the real population mean.
255
If we took infinite samples from the population, 95% of the time the population mean will lie within the
the 95% confidence interval range
256
What does narrow CI represent?
high statistical power
257
Wide CIs represent?
low statistical power
258
Power bring the probability of catching a real effect (as opposed to
missing a real effect – Type II error)
259
We can never say the null hypothesis is
FALSE (or TRUE).
260
The P value or calculated probability is the estimated probability of us
finding an effect when the null hypothesis (H0) is true.
261
p = probability of observing a test statistic at least as a big as the one we have if the
H0 is true
262
Hence, a significant p value (p <.05) tells us that there is a less than 5% chance of getting a test statistic that is
larger than the one we have found if there were no effect in the population (e.g. the null hypothesis were true)
263
Statistical signifiance does not equal importance - (2)
p = .049 and p = .050 are essentially the same thing, yet only the former is 'statistically significant'. Importance is dependent upon the experimental design/aims: e.g., a statistically significant weight increase of 0.1 kg between two adult experimental groups may be less important than the same increase between two groups of babies.
264
Children can learn a second language faster before the age of 7’. Is this statement: A. One-tailed B. Non-scientific C. Two-tailed D. Null hypothesis

A, as one-tailed is directional and two-tailed is non-directional
265
Which of the following is true about a 95% confidence interval of the mean: A. 95 out of 100 CIs wll contain population mean B. 95 out of 100 sample means will fall within the limits of the confidence interval. C. 95% of population means will fall within the limits of the confidence interval. D. There is a 0.05 probability that the population mean falls within the limits of the confidence interval.
A as If we’d collected 100 samples, calculated the mean and then calculated a confidence interval for that mean, then for 95 of these samples the confidence intervals we constructed would contain the true value of the mean in the population
266
What does a significant test statistic tell us? A. That the test statistic is larger than we would expect if there were no effect in the population. B. There is an important effect. C. The null hypothesis is false. D. All of the above.
A, and just because a test statistic is significant does not mean it is an important effect
267
Of what is p the probability? (Hint: NHST relies on fitting a ‘model’ to the data and then evaluating the probability of this ‘model’ given the assumption that no effect exists.) A.p is the probability of observing a test statistic at least as big as the one we have if there were no effect in the population (i.e., the null hypothesis were true). B. p is the probability that the results are due to chance, the probability that the null hypothesis (H0) is true. C. p is the probability that the results are not due to chance, the probability that the null hypothesis (H0) is false D. p is the probability that the results would be replicated if the experiment was conducted a second time.
A
268
A Type I error occurs when: (Hint: When we use test statistics to tell us about the true state of the world, we’re trying to see whether there is an effect in our population.) A. We conclude that there is an effect in the population when in fact there is not. B. We conclude that there is not an effect in the population when in fact there is. C. We conclude that the test statistic is significant when in fact it is not. D. The data we have typed into SPSS is different from the data collected.
A as If we use the conventional criterion then the probability of this error is .05 (or 5%) when there is no effect in the population
269
True or false? a. Power is the ability of a test to detect an effect given that an effect of a certain size exists in a population.
TRUE
270
True or False? We can use power to determine how large a sample is required to detect an effect of a certain size.
TRUE
271
True or False? c. Power is linked to the probability of making a Type II error.
TRUE
272
True or False? d. The power of a test is the probability that a given test is reliable and valid.
FALSE
273
What is the relationship between sample size and the standard error of the mean? (Hint: The law of large numbers applies here: the larger the sample is, the better it will reflect that particular population.) A. The standard error decreases as the sample size increases. B. The standard error decreases as the sample size decreases. C. The standard error is unaffected by the sample size. D. The standard error increases as the sample size increases.
A. The standard error (which is the standard deviation of the distribution of sample means), defined as σ_x̄ = σ/√N, decreases as the sample size (N) increases, and vice versa
274
What is the null hypothesis for the following question: Is there a relationship between heart rate and the number of cups of coffee drunk within the last 4 hours? A. There will be no relationship between heart rate and the number of cups of coffee drunk within the last 4 hours. B. People who drink more coffee will have significantly higher heart rates. C. People who drink more cups of coffee will have significantly lower heart rates. D. There will be a significant relationship between the number of cups of coffee drunk within the last 4 hours and heart rate
A The null hypothesis is the opposite of the alternative hypothesis and so usually states that an effect is absent
275
A Type II error occurs when : (Hint: This would occur when we obtain a small test statistic (perhaps because there is a lot of natural variation between our samples.) A. We conclude that there is not an effect in the population when in fact there is. B. We conclude that there is an effect in the population when in fact there is not. C. We conclude that the test statistic is significant when in fact it is not. D. The data we have typed into SPSS is different from the data collected.
A A Type II error would occur when we obtain a small test statistic (perhaps because there is a lot of natural variation between our samples)
276
In general, as the sample size (N) increases: A. The confidence interval gets narrower. B. The confidence interval gets wider. C. The confidence interval is unaffected. D. The confidence interval becomes less accurate
A
277
Which of the following best describes the relationship between sample size and significance testing? (Hint: Remember that test statistics are basically a signal-to-noise ratio, so given that large samples have less ‘noise’ they make it easier to find the ‘signal’.) A. In large samples even small effects can be deemed ‘significant’. B. In small samples only small effects will be deemed ‘significant’. C. Large effects tend to be significant only in small samples. D. Large effects tend to be significant only in large samples.
A
278
The assumption of homogeneity of variance is met when: A. The variances in different groups are approximately equal. B. The variances in different groups are significantly different. C. The variance across groups is proportional to the means of those groups. D. The variance is the same as the interquartile range.
A - To make sure our estimates of the parameters that define our model and significance tests are accurate we have to assume homoscedasticity (also known as homogeneity of variance)
279
Next, the lecturer was interested in seeing whether males and females reacted differently to the different teaching methods. Produce a clustered bar graph showing the mean scores of teaching method for males and females. (HINT: place TeachingMethod on the X axis, Exam Score on the Y axis, and Gender in the ‘Cluster on X’ box. Include 95% confidence intervals in the graph). Which of the following is the most accurate interpretation of the data? A.Females performed better than males both the reward and indifferent conditions. Regarding the confidence intervals, there was a large degree of overlap between males and females in all conditions of the teaching method. B.Males performed better than females in the reward condition, and females performed better than males in the indifferent condition. Regarding the confidence intervals, there was no overlap between males and females across any of the conditions of teaching method. C.Males performed better than females in all conditions. Regarding the confidence intervals, there was a small degree of overlap between males and females for the reward and indifferent conditions, and a large degree of overlap between males and females for the punish condition. D.Males performed better than females in the reward condition, and females performed better than males in the indifferent condition. Regarding the confidence intervals, there was a small degree of overlap between males and females for the reward and indifferent conditions, and a large degree of overlap between males and females for the punish condition.
D
280
Produce a line graph showing the change in mean anxiety scores over the three time points. NOTE: this is a repeated measures (or within subjects) design, ALL participants took part in the same condition. Which of the following is the correct interpretation of the data? A.Mean anxiety increased across the three time points. BMean anxiety scores were reduced across the three time points, and there was a slight acceleration in this reduction between the middle and end of the course. CMean anxiety scores were reduced across the three time points, though this reduction slowed down between the middle and end of the course. DMean anxiety scores did not change across the three time points.
B
281
A general approach in regression is that our outcomes can be predicted by a model and what remains
is the error
282
The i in the general model in regression shows
e.g., outcome 1 is equal to model plus error 1 and outcome 2 is equal to model plus error 2 and so on...
283
For correlation, the outcome is modelled by
scaling (multiplying by a constant) another variable
284
Equation of correlation model
285
If you have a continuous variable which meets the assumptions of a parametric test, then you can conduct a
pearson correlation or regression
286
Variance is a feature of outcome measurements we have obtained and we want to predict with a model in correlation/regression that...
captures the effect of the predictor variables we have manipulated or measured
287
Variance of a single variable represents the
average amount that the data vary from the mean
288
Variance is the standard deviation
squared (s squared)
289
Variance formula - (2)
For each participant, take xᵢ minus the mean of all participants' scores and square it; then sum these squared deviations (sigma) and divide by the total number of participants minus 1
290
Variance is SD squared meaning that it captures the
average of the squared differences of the outcome values from the mean of all outcomes (explaining what the formula of variance does)
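A sketch of the sample variance formula, checked against the standard library (the data values are made up for illustration):

```python
import math
from statistics import variance

data = [1, 2, 3, 4, 5]  # hypothetical scores
mean = sum(data) / len(data)
var = sum((x - mean) ** 2 for x in data) / (len(data) - 1)  # sample variance
sd = math.sqrt(var)  # SD is the square root of the variance
print(var)  # 2.5
assert var == variance(data)  # matches the stdlib implementation
```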
291
Covariance gathers information on whether
one variable covaries with another
292
In covariance if we are interested whether 2 variables are related then interested whether changes in one variable are met with changes in other therefore.. - (2)
when one variable deviates from its mean we would expect the other variable to deviate from its mean in a similar way. So, if one variable increases, the other, related variable should change correspondingly (increasing for a positive relationship, decreasing for a negative one).
293
If one variable covaries with another variable then it means these 2 variables are
related
294
To get SD from variance then you would
square root variance
295
What would you do in covariance formula in proper words? - (5)
1. Calculate the error between the mean and each subject's score for the first variable (x).
2. Calculate the error between the mean and their score for the second variable (y).
3. Multiply these error values.
4. Add these values to get the sum of product deviations.
5. The covariance is the average of the product deviations.
296
Example of calculaitng covariance and what does answer tell you?
The answer is positive: that tells us the x and y values tend to rise together.
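The five steps above can be sketched directly (the x/y data are invented for illustration):

```python
def covariance(xs, ys):
    """Average of the product deviations (dividing by n - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(covariance(x, y))  # 1.5 -> positive: x and y tend to rise together
```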
297
What does each element of covariance formula stand for? - (5)
* X = the value of the 'x' variable
* Y = the value of the 'y' variable
* X̄ = mean of 'x' (e.g., green)
* Ȳ = mean of 'y' (e.g., blue)
* n = the number of items in the data set
298
covariance will be large when values below

the mean for one variable are paired with values below the mean for the other (and likewise for values above the mean)
299
What does a positive covariance indicate?
as one variable deviates from the mean, the other variable deviates in the same direction.
300
What does negative covariance indicate?
a negative covariance indicates that as one variable deviates from the mean (e.g. increases), the other deviates from the mean in the opposite direction (e.g. decreases).
301
What is the problem of covariance as a measure of the relationship between 2 variables? - (5)
* It is dependent upon the units/scales of measurement used, so covariance is not a standardised measure
* e.g., if 2 variables are measured in miles and the covariance is 4.25, then if we convert the data to kilometres we have to calculate the covariance again and would see it increase to 11
* Dependence on the scale of measurement is a problem because we cannot compare covariances in an objective way - we cannot say whether one covariance is large or small relative to another data set unless both data sets are measured in the same units
* So we need to STANDARDISE it
302
What is the process of standardisation?
To overcome the problem of dependence on the measurement scale, we need to convert the covariance into a standard set of units
303
How to standardise the covariance?
dividing it by the product of the standard deviations of both variables.
304
Formula of standardising covariance
The same formula as covariance, but with the denominator also multiplied by the SD of x and the SD of y
305
Formula of Pearson's correlation coefficient, r
306
Example of calculating Pearson's correlation coefficient, r - (5)
The standard deviation for the number of adverts watched (sx) was 1.67, and the SD of the number of packets of crisps bought (sy) was 2.92. If we multiply these together we get 1.67 × 2.92 = 4.88. Now all we need to do is take the covariance, which we calculated a few pages ago as 4.25, and divide it by the multiplied standard deviations. This gives us r = 4.25/4.88 = .87.
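The same division in Python, reusing the figures from the example above (no new data):

```python
cov = 4.25           # covariance of adverts watched and packets bought
sx, sy = 1.67, 2.92  # the two standard deviations
r = cov / (sx * sy)  # standardise the covariance
print(round(r, 2))   # 0.87
```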
307
The standardised version of covariance is the
correlation coefficient, or Pearson's r
308
Pearson's R is ... version of covariance meaning independent of units of measurement
standardised
309
What does correlation describe? - (2)
Describes a relationship between variables If one variable increases, what happens to the other variable?
310
Pearson's correlation coefficient r was also called the
product-moment correlation
311
Linear relationship and normally disturbed data and interval/ratio and continous data is assumed in
Pearson's r correlation coefficient
312
Pearson Correlation Coefficient varies between
-1 and +1 (direction of relationship)
313
The larger the R Pearson's correlation coefficient value, the closer the values will
be to each other and to the mean
314
The smaller R Pearson's correlation coefficient values indicate
there is unexplained variance in the data and results in the data points being more spread out.
315
What does these two graphs show? - (2)
* example of high negative correlation. The data points are close together and are close to the mean. * On the other hand, the graph on the right shows a low positive correlation. The data points are more spread out and deviate more from the mean.
316
The Pearson Correlation Coefficient measures the strength of a relationship
between one variable and another hence its use in calculating effect size
317
A Pearson's correlation coefficient of +1 indicates
two variables are perfectly positively correlated, so as one variable increases, the other increases by a proportionate amount.
318
A Pearson's correlation coefficient of -1 indicates
a perfect negative relationship: if one variable increases, the other decreases by a proportionate amount.
319
Pearson's r +/- 0.1 means
small effect
320
Pearson's r +/- 0.3 means
medium effect
321
Pearson's r +/- 0.5 means
large effect
322
In Pearson's correlation, we can test the hypothesis that - (2)
correlation coefficient is different from zero (i.e., different from 'no relationship')
323
In Pearson's correlation coefficient, we can test the hypothesis that the correlation is different from 0 If we find our observed coefficient was very unlikely to happen if there was no effect in population then gain confidence that
relationship that we have observed is statistically meaningful.
324
. In the case of a correlation coefficient we can test the hypothesis that the correlation is different from zero (i.e. different from ‘no relationship’). There are 2 ways to test this hypothesis
1. Z scores 2. T-statistic
325
Confidence intervals tell us about the
likely correlation in the population
326
Can calculate confidence intervals of Pearson's correlation coefficient by transforming formula of CI
327
As sample size increases, so the value of r at which a significant result occurs
decreases; e.g., with n = 20 the correlation may not reach p < 0.05, but with 200 participants the same r gives p < 0.05
328
Pearson's r = 0 means - (2)
indicates no linear relationship at all so if one variable changes, the other stays the same.
329
Correlation coefficients give no indication of direction of... + example - (2)
causality e.g., although we conclude that the number of adverts watched relates to the number of toffees bought, we can't say that watching adverts caused us to buy toffees
330
We have to be caution of causality in terms of Pearson's correlation r as - (2)
* Third variable problem: causality between variables cannot be assumed in any correlation * Direction of causality: correlation coefficients tell us nothing about which variable causes the other to change.
331
If you get a weak correlation between 2 variables (a weak effect), then you need to take a lot of measurements for that relationship to be
significant
332
R correlation coefficient gives the ratio of
covariance to a measure of variance
333
Example of correlations getting stronger
334
R squared is known as the
coefficient of determination
335
# of cor R^2 can be used to explain the
proportion of the variance in a dependent variable (outcome) that is explained by an independent variable (predictor)
336
Example of R^2 coefficient of determination - (2) X = exam anxiety Y = exam performance If R^2 = 0.194
19.4% of the variability in exam performance can be explained by exam anxiety - 'the variance in y accounted for by x'
337
R^2 calculate the amount of shared
variance
338
Example of r and R^2
Multiply 0.1 * 0.1 for example
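For instance, squaring the exam-anxiety correlation from the earlier cards:

```python
r = -0.441           # correlation of exam anxiety with exam performance
r2 = r ** 2          # coefficient of determination
print(round(r2, 3))  # 0.194 -> anxiety explains 19.4% of the variance
```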
339
R^2 gives you the true strength of.. but without
the correlation but without an indication of its direction.
340
What are the three types of correlations? - (3)
1. Bivariate correlations 2. Partial correlations 3. Semi-partial or part correlations
341
What's a bivariate correlation?

the relationship between 2 variables
342
What is a partial correlation?
looks at the relationship between two variables while ‘controlling’ the effect of one or more additional variables.
343
The partial correlation partials out the
the effect of one or more variables on either X or Y
344
A partial correlation controls for third variable which is made from - (3)
* A correlation calculates each data points distance from line (residuals) * This is the error relative to the model (unexplained variance) * A third variable might predict some of that variation in residuals
345
The partial correlation compares the unique variation of one variable with the
unfiltered variation of the other
346
The partial correlation holds the
third variable constant (but we don't manipulate these)
347
Example of partial correlation- (2)
For example, when studying the effect of a diet, the level of exercise might also influence weight loss We want to know the unique effect of diet, so we need to partial out the effect of exercise
348
Example of Venn Diagram of Partial Correlation - (2)
* Partial correlation between IV1 and DV = D / (D + C)
* Unique variance accounted for by the predictor (IV1) in the DV, after accounting for variance shared with other variables.
349
Example of Partial Correlation - (2)
Partial correlation: Purple / Red + Purple If we were doing just a partial correlation, we would see how much exam anxiety is influencing both exam performance and revision time.
350
Example of partial correlation and semi-partial correlation - (2)
The partial correlation that we calculated took account not only of the effect of revision on exam performance, but also of the effect of revision on anxiety. If we were to calculate the semi-partial correlation for the same data, then this would control for only the effect of revision on exam performance (the effect of revision on exam anxiety is ignored).
351
In partial correlation, the third variable is typically not considered as the primary independent or dependent variable. Instead, it functions as a
control variable—a variable whose influence is statistically removed or controlled for when examining the relationship between the two primary variables (IV and DV).
352
The partial correlation is The amount of variance the variable explains
relative to the amount of variance in the outcome that is left to explain after the contribution of other predictors have been removed from both the predictor and outcome.
353
These partial correlations can be done when variables are dichotomous (including third variable) e.g., - (2)
we could look at the relationship between bladder relaxation (did the person wet themselves or not?) and the number of large tarantulas crawling up your leg controlling for fear of spiders (the first variable is dichotomous, but the second variable and ‘controlled for’ variables are continuous).
354
What does this partial correlation output show? Revision time = partial, controlling for its effect Exam performance = DV Exam anxiety = X - (5)
* First, notice that the partial correlation between exam performance and exam anxiety is −.247, which is considerably less than the correlation when the effect of revision time is not controlled for (r = −.441).
* Although this correlation is still statistically significant (its p-value is still below .05), the relationship is diminished.
* The value of R² for the partial correlation is .06, which means that exam anxiety can now account for only 6% of the variance in exam performance.
* When the effects of revision time were not controlled for, exam anxiety shared 19.4% of the variation in exam scores, so the inclusion of revision time has severely diminished the amount of variation in exam scores shared by anxiety.
* As such, a truer measure of the role of exam anxiety has been obtained.
355
Partial correlations are most useful for looking at the unique relationship between two variables when
other variables are ruled out
356
In a semi-partial correlation we control for the
effect that the third variable has on only one of the variables in the correlation
357
The semi partial (part) correlation partials out the - (2)
Partials out the effect of one or more variables on either X or Y. e.g. The amount revision explains exam performance after the contribution of anxiety has been removed from the one variable (usually the predictor- e.g. revision).
358
The semi-partial correlation compares the
unique variation of one variable with the unfiltered variation of the other.
359
Diagram of venn diagram of semi-partial correlation - (2)
* Semi-partial correlation between IV1 and DV = D / (D+C+F+G)
* The unique variance accounted for by the predictor (IV1) in the DV, after accounting for variance shared with other variables.
360
Diagram of revision and exam performance and revision time on semi-partial correlation - (2)
* purple/red + purple + white + orange
* When we use semi-partial correlation to look at this relationship, we partial out the variance accounted for by exam anxiety (the orange bit) and look for the variance explained by revision time (the purple bit).
361
Summary of partial correlation and semi-partial correlation - (2)
A partial correlation quantifies the relationship between two variables while accounting for the effects of a third variable on both variables in the original correlation. A semi-partial correlation quantifies the relationship between two variables while accounting for the effects of a third variable on only one of the variables in the original correlation.
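The two definitions above can be computed directly from the three pairwise correlations. A minimal pure-Python sketch (the correlation values below are made up for illustration, not taken from the exam-anxiety data):

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation between x and y, controlling for z on BOTH variables."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def semipartial_r(r_xy, r_xz, r_yz):
    """Semi-partial (part) correlation: z is partialled out of x only."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_xz ** 2)

# Hypothetical pairwise correlations (illustration only)
r_xy, r_xz, r_yz = 0.5, 0.3, 0.4
print(round(partial_r(r_xy, r_xz, r_yz), 3))      # 0.435
print(round(semipartial_r(r_xy, r_xz, r_yz), 3))  # 0.398
```

Note the semi-partial value is never larger in magnitude than the partial, because only one variable has the control variable's variance removed.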
362
Pearson’s product-moment correlation coefficient (described earlier) and Spearman’s rho (see section 6.5.3) are examples of
bivariate correlation coefficients.
363
Non-parametric tests of correlations are... (2)
* Spearman's rho
* Kendall's tau
364
In Spearman's rho the variables are not normally distributed and measures are on an
ordinal scale (e.g., grades)
365
Spearman's rho works by
first ranking the data (numbers converted into ranks), and then running Pearson's r on the ranked data
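The "rank, then run Pearson's r" recipe can be sketched in a few lines of plain Python (illustrative only; in practice SPSS or a statistics library would be used):

```python
def average_ranks(xs):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    sorted_vals = sorted(xs)
    ranks = []
    for v in xs:
        first = sorted_vals.index(v) + 1       # first rank this value occupies
        count = sorted_vals.count(v)           # number of ties for this value
        ranks.append(first + (count - 1) / 2)  # average rank across the ties
    return ranks

def pearson_r(x, y):
    """Standard Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman_rho(x, y):
    """Spearman's rho = Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A perfectly monotonic (but non-linear) relationship gives rho = 1
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))  # 1.0
```

The example shows why Spearman's rho suits non-normal or ordinal data: the squaring relationship is not linear, but the ranks line up perfectly.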
366
Spearman’s correlation coefficient, rs, is a non-parametric statistic and so can be used when the data have
violated parametric assumptions, such as non-normally distributed data
367
Spearman's correlation coefficient is sometimes called
Spearman's rho
368
For Spearman's rs we can get R squared, but it is interpreted slightly differently, as the
proportion of variance in the ranks that two variables share.
369
Kendall's tau is used rather than Spearman's coefficient when - (2)
when you have a small data set with a large number of tied ranks. This means that if you rank all of the scores and many scores have the same rank, then Kendall’s tau should be used
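For intuition, a minimal sketch of Kendall's tau that counts concordant and discordant pairs. This is the simple tau-a form with no tie correction (an assumption of this sketch; SPSS reports tau-b, which adjusts for tied ranks):

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.
    A pair is concordant if both variables order the two cases the same way."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs

print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # ~0.333: 2 concordant, 1 discordant, of 3 pairs
```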
370
Kendall's tau test - (2)
* For small datasets with many tied ranks
* A better estimate of the correlation in the population than Spearman's ρ
371
Kendall's tau is not numerically similar to r or rs (Spearman's), so tau squared does not tell us about the
proportion of variance shared by two variables (or the ranks of those two variables).
372
Kendall's tau is typically 66-75% of the size of both Spearman's rs and Pearson's r, so
tau is not comparable to r and rs
373
There is a benefit using Kendall's statistic than Spearman as it shows - (2)
Kendall’s statistic is actually a better estimate of the correlation in the population we can draw more accurate generalizations from Kendall’s statistic than from Spearman’s.
374
Whats the decision tree for Spearman's correlation? - (4)
* What type of measurement = continuous
* How many predictor variables = one
* What type of continuous variable = continuous
* Meets assumptions of parametric tests = no
375
The output of Kendall and Spearman can be interpreted the same way as
Pearson's correlation coefficient r output box
376
The biserial and point-biserial correlation coefficients used when
one of the two variables is dichotomous (e.g. whether or not a woman is pregnant)
377
What is the difference between biserial and point-biserial correlations?
depends on whether the dichotomous variable is discrete or continuous
378
The point–biserial correlation coefficient (rpb) is used when
one variable is a discrete dichotomy (e.g. pregnancy),
379
biserial correlation coefficient (rb) is used when - (2)
one variable is a continuous dichotomy (e.g. passing or failing an exam). e.g. An example is passing or failing a statistics test: some people will only just fail while others will fail by a large margin; likewise some people will scrape a pass while others will clearly excel.
380
Example of when point-biserial correlation is used - (3)
* Imagine we are interested in the relationship between the gender of a cat and how much time it spends away from home
* Time spent away is measured at the interval level --> meets assumptions of parametric data
* Gender is a discrete dichotomous variable coded with 0 for male and 1 for female
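Because the point-biserial correlation is simply Pearson's r with the dichotomy coded 0/1, it can be sketched directly (the data below are invented for illustration, not real cat data):

```python
def pearson_r(x, y):
    """Standard Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Hypothetical data: gender coded 0 = male, 1 = female; hours spent away from home
gender = [0, 0, 0, 1, 1, 1]
hours = [1, 2, 3, 4, 5, 6]
r_pb = pearson_r(gender, hours)  # point-biserial = Pearson with 0/1 coding
print(round(r_pb, 3))  # 0.878
```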
381
Can convert the point-biserial correlation coefficient into the
biserial correlation coefficient
382
Point-biserial and biserial correlations differ in size, as the
biserial correlation is bigger than the point-biserial
383
Example of a question using Pearson's r - (4)
The researcher was interested in whether the amount someone gets paid and the amount of holidays they take from work would be related to their productivity at work - Pay: Annual salary - Holiday: Number of holiday days taken - Productivity: Productivity rating out of 10
384
Example of Pearson's r scatterplot : relationship between pay and productivity
385
If we have r = 0.313 what effect size is it?
medium effect size
±.1 = small effect
±.3 = medium effect
±.5 = large effect
386
What does this scatterplot show?
o This indicates very little correlation between the 2 variables
387
What will a matrix scatterplot show?
the relationship between all possible combinations of your variables
388
What does this scatterplot matrix show? - (2)
- For Pay and Holiday, we can see the line is very flat, which indicates the correlation between the two variables is quite low
- For Pay and Productivity, the line is steeper, suggesting the correlation is fairly substantial between these two variables; the same holds for Holidays and Productivity
389
What is degrees of freedom for correlational analysis?
N-2
390
What does this Pearson's correlation r output show? - (4)
* - The relationship between pay and holidays is very low correlation is -0.04 * - Between pay and productivity, there is a medium size correlation of r = 0.313 * Between holidays and productivity there is medium going on large effect size of 0.435 * Relationship between pay and productivity and also holidays and productivity is sig but correlation with pay and holidays was not sig
391
Another example of a Pearson's correlation r question - (3)
A student was interested in the relationship between the time spent preparing an essay, the interestingness of the essay topic and the essay mark received. He got 45 of his friends and asked them to rate, using a scale from 1 to 7, how interesting they thought the essay topic was (1 - I'll kill myself of boredom, 4 - it's not too bad!, 7 - it's the most interesting thing in the world!) (interesting). He then timed how long they spent writing the essay (hours), and got their percentage score on the essay (essay).
392
Example of interval/ratio continuous data needed for Pearson's r for IV and DV - (2)
* Interval scale: the difference between 10°C and 20°C is the same as the difference between 80°F and 90°F, but 0 degrees does not mean an absence of temperature
* Ratio scale: height, as 0 cm means no height; likewise weight and time
393
Pearson's r, Spearman's rho and Kendall's tau require
one IV and one DV
394
What does this SPSS output show?
A. There was a non-significant positive correlation between interestingness of topic and the amount of time spent writing. There was a non-significant positive correlation between time spent writing an essay and essay mark. There was a significant positive correlation between interestingness of topic and essay mark, with a medium effect size.
B. There was a significant positive correlation between interestingness of topic and the amount of time spent writing, with a small effect size. There was a significant positive correlation between time spent writing an essay and essay mark, with a large effect size. There was a non-significant positive correlation between interestingness of topic and essay mark.
C. There was a significant negative correlation between interestingness of topic and the amount of time spent writing, with a medium effect size. There was a non-significant positive correlation between time spent writing an essay and essay mark. There was a non-significant positive correlation between interestingness of topic and essay mark.
D. There was a significant positive correlation between interestingness of topic and the amount of time spent writing, with a large effect size. There was a non-significant positive correlation between time spent writing an essay and essay mark. There was a non-significant positive correlation between interestingness of topic and essay mark.
D. There was a significant positive correlation between interestingness of topic and the amount of time spent writing, with a large effect size. There was a non-significant positive correlation between time spent writing an essay and essay mark There was a non-significant positive correlation between interestingness of topic and essay mark
395
r = 0.21 effect size is..
in between small and medium effect
396
Effect size is only meaningful if you evaluate it with regard to
your own research area
397
Biserial correlation is used when
one variable is dichotomous, but there is an underlying continuum (e.g. pass/fail on an exam)
398
Point-biserial correlation is used when
When one variable is dichotomous, and it is a true dichotomy (e.g. pregnancy)
399
Example of dichotomous relationship - (3)
* example of a true dichotomous relationship. * We can compare the differences in height between males and females. * Use dichotomous predictor of gender
400
What is the decision tree for multiple regression? - (4)
* Continuous
* Two or more predictors that are continuous
* Multiple regression
* Meets assumptions of parametric tests
401
Multiple regression is the same as simple linear regression except that for - (2)
every extra predictor you include, you have to add a coefficient; so each predictor variable has its own coefficient, and the outcome variable is predicted from a combination of all the variables multiplied by their respective coefficients, plus a residual term
402
Multiple regression equation
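The answer image is missing from this card; from the term list on the next card, the standard form of the equation is:
Yi = b0 + b1X1i + b2X2i + ... + bnXni + εi
(b0 being the intercept/constant).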
403
In multiple regression equation, list all the terms - (5)
* Y is the outcome variable,
* b1 is the coefficient of the first predictor (X1),
* b2 is the coefficient of the second predictor (X2),
* bn is the coefficient of the nth predictor (Xn),
* εi is the difference between the predicted and the observed value of Y for the ith participant.
404
Multiple regression uses the same principle as linear regression in a way that
we seek to find the linear combination of predictors that correlate maximally with the outcome variable.
405
Regression is a way of predicting things that you have not measured by predicting
an outcome variable from one or more predictor variables
406
Can't plot a 3D plot of MR as shown here
for more than 2 predictor (X) variables
407
If you have two predictors that overlap and correlate a lot, then it is a ... model
bad model, as it can't uniquely explain the outcome
408
In Hierarchical regression, we are seeing whether
one model explains significantly more variance than the other
409
In hierarchical regression predictors are selected based on
past work and the experimenter decides in which order to enter the predictors into the model
410
As a general rule for hierarchical regression, - (3)
* Known predictors (from other research) should be entered into the model first, in order of their importance in predicting the outcome.
* After known predictors have been entered, the experimenter can add any new predictors into the model.
* New predictors can be entered either all in one go, in a stepwise manner, or hierarchically (such that the new predictor suspected to be the most important is entered first).
411
Example of hierarchical regression in terms of album sales - (2)
The first model allows all the shared variance between Ad budget and Album sales to be accounted for. The second model then only has the option to explain more variance by the unique contribution from the added predictor Plays on the radio.
412
What is forced entry MR?
method in which all predictors are forced into the model simultaneously.
413
Like HR, forced entry MR relies on
good theoretical reasons for including the chosen predictors,
414
Different from HR, forced entry MR
makes no decision about the order in which variables are entered.
415
Some researchers believe that about forced entry MR that
this method is the only appropriate method for theory testing because stepwise techniques are influenced by random variation in the data and so rarely give replicable results if the model is retested.
416
Why select colinearity diagnostics in statistics box for multiple regression? - (2)
* This option is for obtaining collinearity statistics such as the VIF and tolerance
* Checking the assumption of no multicollinearity
417
Multicollinearity poses a problem only for multiple regression because
simple regression requires only one predictor.
418
Perfect collinearity exists in multiple regression when at least
e.g., two predictors are perfectly correlated, i.e. have a correlation coefficient of 1
419
If there is perfect collinearity in multiple regression between predictors it becomes impossible
to obtain unique estimates of the regression coefficients because there are an infinite number of combinations of coefficients that would work equally well.
420
Good news is perfect colinearity in multiple regression is rare in
real-life data
421
If two predictors are perfectly correlated in multiple regression then the values of b for each variable are
interchangeable
422
As colinearity increases in multiple regression, there are 3 problems that arise - (3)
* Untrustworthy bs
* Limits the size of R
* Importance of predictors
423
One way of identifying multicollinearity in multiple regression is to scan a
a correlation matrix of all of the predictor variables and see if any correlate very highly (by very highly I mean correlations of above .80 or .90)
424
The VIF indicates in multiple regression whether a
predictor has a strong linear relationship with the other predictor(s).
425
If VIF statistic above 10 or approaching 10 in multiple regression then what you would want to do is have a - (2)
look at your variables to see whether all of them need to go in the model; if there is a high correlation between two predictors (measuring the same thing), then decide whether it's important to include both variables or take one out and simplify the regression model
426
Related to the VIF in multiple regression is the tolerance statistic, which is its
reciprocal (1/VIF), i.e. the inverse of the VIF
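The VIF/tolerance relationship can be made concrete. Here r_squared_j is assumed to be the R² from regressing predictor j on all the other predictors (the value 0.75 is hypothetical):

```python
def vif(r_squared_j):
    """VIF for predictor j, where r_squared_j is the R^2 from
    regressing predictor j on all the other predictors."""
    return 1 / (1 - r_squared_j)

def tolerance(r_squared_j):
    """Tolerance is the reciprocal of the VIF: 1/VIF = 1 - R^2_j."""
    return 1 - r_squared_j

# Hypothetical: predictor j shares 75% of its variance with the other predictors
print(vif(0.75))        # 4.0
print(tolerance(0.75))  # 0.25
```

A predictor sharing 90% of its variance with the others would give VIF = 10 and tolerance = 0.1, exactly the rule-of-thumb alarm thresholds on the following cards.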
427
In Plots in SPSS, you put in multiple regression - (2)
* ZRESID on Y and ZPRED on X
* A plot of residuals against predicted values to assess homoscedasticity
428
What is ZPRED in MR? - (2)
(the standardized predicted values of the dependent variable based on the model). These values are standardized forms of the values predicted by the model.
429
What is ZRESID in MR? - (2)
(the standardized residuals, or errors). These values are the standardized differences between the observed data and the values that the model predicts.
430
SPSS in multiple linear regression gives descriptive outcomes, which are - (2)
* basic means and also a table of correlations between variables.
* This is a first opportunity to determine whether there is high correlation between predictors, otherwise known as multicollinearity
431
In model summary of SPSS, it captures how the model or models explain in MR
variance in terms of R squared, and more importantly how R squared changes between models and whether those changes are significant.
432
Diagram of model summary
433
What is the measure of R^2 in multiple regression
measure of how much of the variability in the outcome is accounted for by the predictors
434
The adjusted R^2 gives us an estimate of in multiple regression
fit in the general population
435
The Durbin-Watson statistic if specificed in multiple regresion tells us whether the - (2)
assumption of independent errors is tenable (values less than 1 or greater than 3 raise alarm bells); the closer the value is to 2, the better = assumption met
436
SPSS output for MR = ANOVA table which performs
F-tests for each model
437
SPSS output for MR contains ANOVA that tests whether the model is
significantly better at predicting the outcome than using the mean as a 'best guess'
438
The F-ratio represents the ratio of
improvement in prediction that results from fitting the model, relative to the inaccuracy that still exists in the model
439
We are told the sum of squares for model (SSM) - MR regression line in output which represents
improvement in prediction resulting from fitting a regression line to the data rather than using the mean as an estimate of the outcome
440
We are told residual sum of squares (Residual line) in this MR output which represents
total difference between the model and the observed data
441
DF for Sum of squares Model for MR regression line is equal to
number of predictors (e.g., 1 for first model, 3 for second)
442
DF for Sum of Squares Residual for MR is - (2)
Number of observations (N) minus the number of coefficients in the regression model (e.g., M1 has 2 coefficients - one for the predictor and one for the constant; M2 has 4 - one for each of the 3 predictors and one for the constant)
443
The average sum of squares in ANOVA table is calculated by
calculated for each term (SSM, SSR) by dividing the SS by the df.
444
How is the F ratio calculated in this ANOVA table?
F-ratio is calculated by dividing the average improvement in prediction by the model (MSM) by the average difference between the model and the observed data (MSR)
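The arithmetic just described can be sketched directly (the sums of squares and degrees of freedom below are invented for illustration, not taken from the album-sales data):

```python
def f_ratio(ss_model, df_model, ss_residual, df_residual):
    """F = MS_model / MS_residual, where each mean square MS = SS / df."""
    ms_model = ss_model / df_model        # average improvement due to the model
    ms_residual = ss_residual / df_residual  # average remaining inaccuracy
    return ms_model / ms_residual

# Hypothetical: 3 predictors (df_model = 3) and N = 14 observations,
# so df_residual = N - 4 coefficients = 10
print(f_ratio(30.0, 3, 20.0, 10))  # 5.0
```

An F well above 1, as here, means the improvement from fitting the model outweighs the inaccuracy left in it.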
445
If the improvement due to fitting the regression model is much greater than the inaccuracy within the model then value of F will be
greater than 1, and SPSS calculates the exact probability (p-value) of obtaining that value of F by chance
446
What happens if b values are positive in multiple regression?
there is a positive relationship between the predictor and the outcome,
447
What happens if the b value is negative in multiple regression?
represents a negative relationship between predictor and outcome variable
448
What do the b values in this table tell us what relationships between predictor and outcome variable in multiple regression? (3)
All indicate positive relationships:
* as advertising budget increases, record sales (outcome) increase
* as plays on the radio increase, so do record sales
* as attractiveness of the band increases, record sales increase
449
The b-values also tell us, in addition to direction of relationship (pos/neg) , to what degree each in multiple regression
predictor affects the outcome if the effects of all other predictors are held constant:
450
B-values tell us to what degree each predictor affects the outcome if the effects of all other predictors held constant in multiple regression e.g., advertising budget - (3)
(b = 0.085): This value indicates that as advertising budget (x) increases by one unit, record sales (outcome, y) increase by 0.085 units. This interpretation is true only if the effects of attractiveness of the band and airplay are held constant.
451
Standardised versions of b-values are much easier to interpret in multiple regression, as they are
not dependent on the units of measurement of the variables
452
The standardised beta values tell us that in multiple regression
the number of standard deviations that the outcome will change as a result of one standard deviation change in the predictor.
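The conversion from an unstandardized b to a standardized beta is b multiplied by the ratio of the predictor's SD to the outcome's SD. A minimal sketch with made-up numbers:

```python
def standardized_beta(b, sd_predictor, sd_outcome):
    """Convert an unstandardized b to a standardized beta: the number of
    SDs the outcome changes per one-SD change in the predictor."""
    return b * sd_predictor / sd_outcome

# Hypothetical: b = 2.0 outcome units per predictor unit,
# predictor SD = 3.0, outcome SD = 12.0
print(standardized_beta(2.0, 3.0, 12.0))  # 0.5
```

Because betas are all in SD units, the 0.5 here could be compared directly with the beta of any other predictor in the model.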
453
The standardized beta values are all measured in standard deviation units and so are directly comparable: therefore, they provide a in MR
a better insight into the 'importance' of a predictor in the model
454
If two predictor variables (e.g., advertising budget and airplay) have virtually identical standardised beta values (0.512, and 0.511) it shows that in MR
both variables have a comparable degree of importance in the model
455
If we collected 100 samples and in MR calculated CI for b, we are saying that 95% of these CIs of samples would contain the
true (pop) value of b
456
A good regression model will have a narrow and small CI interval indicating in MR
the value of b in this sample is close to the true value of b in the population
457
A bad regression model have CI that cross zero indicating that in MR
in some samples the predictor has a negative relationship to the outcome whereas in others it has a positive relationship
458
In image below, which are the two best predictors based on CIs and one that isn't as (2) in MR
* The two best predictors (advertising and airplay) have very tight confidence intervals, indicating that the estimates for the current model are likely to be representative of the true population values
* The interval for attractiveness is wider (but still does not cross zero), indicating that the parameter for this variable is less representative, but nevertheless significant.
459
If you request part and partial correlations in the descriptives box, there will be another coefficients table in MR, which looks like this:
460
The zero-order correlations are the simple in MR
Pearson's correlation coefficients
461
The partial correlations represent the in MR
represent the relationships between each predictor and the outcome variable, controlling for the effects of the other two predictors.
462
The part correlations in MR - (2)
* represent the relationship between each predictor and the outcome, controlling for the effect that the other two variables have on the outcome
* representing the unique relationship each predictor has with the outcome
463
Partial correlations in this example are calculated in MR by - (2)
unique variance in the outcome explained by the predictor (ignoring all other predictors), divided by the variance in the outcome not explained by all the other predictors: A/(A+E)
464
Part correlations are calculated by - (2) in MR
unique variance in the outcome explained by the predictor, divided by the total variance in the outcome: A/(A+B+C+E)
465
If the average VIF is substantially greater than 10 then the MR regression
may be biased
466
MR Tolerance below 0.1 indicates a
serious problem.
467
Tolerance below 0.2 indicates a in MR
a potential problem
468
How to interpret this image in terms of colinearity - VIF and tolerance in MR
For our current model the VIF values are all well below 10 and the tolerance statistics all well above 0.2; therefore, we can safely conclude that there is no collinearity within our data.
469
We can produce casewise diagnostics to see a in MR to see (2)
summary of residuals statistics to be examined of extreme cases To see whether individual scores (cases) influence the modelling of data too much
470
SPSS casewise diagnostics shows cases that have a standardised residuals that are in MR (2)
less than -2 or greater than 2 (we expect about 5% of our cases to do that, and 95% to have standardised residuals within about ±2)
471
If we have a sample of 200 then expect about .. to have standardised residuals outside limits in MR
10 cases (5% of 200)
472
What does this casewise diagnostic show? - (2) MR
* 99% of cases should lie within ±2.5, so we expect 1% of cases to lie outside these limits
* From the cases listed, it is clear two cases (1%) lie outside the limits (cases 164 [which has a residual of 3 and should be investigated further] and 179) - 1% conforms to an accurate model
473
If there are many more cases than expected (more than 5% of the sample size) in the casewise diagnostics, then in MR we have likely
broken the assumptions of the regression
474
If cases are a large number of standard deviations from the mean, we may want to in casewise diagnostics in MR
investigate and potentially remove them because they are ‘outliers’
475
Assumptions we need to check for MR - (8)
* Continuous outcome variable and continuous or dichotomous predictor variables
* Independence = all values of the outcome variable should come from a different participant
* Non-zero variance, as predictors should have some variation in value, e.g. variance ≠ 0
* No outliers
* No perfect or high collinearity
* Histogram to check for normality of errors
* Scatterplot of ZRESID against ZPRED to check for linearity and homoscedasticity = looking for random scatter
* Independent errors (Durbin-Watson)
476
Diagram of assumption of homoscedasticity and linearity of ZRESID againsr ZPRED in MR
477
Obvious outliers on a partial plot represent cases that might have in MR
undue influence on a predictor’s b coefficient
478
What does this partial plot show? - (2) in MR
the partial plot shows the strong positive relationship to album sales. There are no obvious outliers and the cloud of dots is evenly spaced out around the line, indicating homoscedasticity.
479
What does this plot show in MR(2)
the plot again shows a positive relationship to album sales, but the dots show funnelling, There are no obvious outliers on this plot, but the funnel-shaped cloud indicates a violation of the assumption of homoscedasticity.
480
P plot and histogram of normally distributed in MR
481
P plot for skewed distirbution histogram for MR
482
What if the assumptions for regression are violated in MR?
you cannot generalize your findings beyond your sample
483
If residuals show problems with heteroscedasticity or non-normality then try to in MR
transforming the raw data – but this won’t necessarily affect the residuals!
484
If you have a violation of the linearity assumption then you could see whether in MR you can do
logistic regression instead
485
If R^2 is 0.374 (outcome var in productivity and 3 predictors) then it shows that in MR
37.4% of the variance in productivity scores was accounted for by 3 predictor variables
486
- In ANOVA table, tells whether model is sig improved from baseline model which is in MR
if we assumed no relation between predictor variables and outcome variable (flat regression line, no association between these variables)
487
This table tells us in terms of standardised beta values that (outcome is productivity in MR)
Holidays had a standardized beta coefficient of 0.031, whereas cake had a much higher standardized beta coefficient of 0.499, which tells us that the amount of cake given out is a much better predictor of productivity than the amount of holidays taken. For pay we have a beta coefficient of 0.323, which tells us that pay was also a pretty good predictor in the model of productivity, but slightly less so than cake.
488
What does this table tells us in terms of signifiance? - (3) in MR
- The p-value for holidays is 0.891, which is not significant
- The p-value for cake is 0.032, which is significant
- The p-value for pay is 0.012, which is significant
489
In ANOVA it is comparing M2 with all its predictor variables with in MR
the baseline, not M1
490
To see if M2 is an improvement of M1 in HR we need to look at ... in model summary in MR
change statistics
491
What does this change statistic show in terms of M2 and M1 in MR
M2 explains an extra 7.5% of the variance, which is significant
492
In MR, the smaller the value of sig (and the larger the value of t), the greater the
contribution of that predictor.
493
For this output, interpret whether the predictors are significant predictors of record sales, and what the magnitude of the t-statistics says about their impact on record sales in MR - (2)
For this model, the advertising budget (t(196) = 12.26, p < .001), the amount of radio play prior to release (t(196) = 12.12, p < .001) and attractiveness of the band (t(196) =4.55, p < .001) are all significant predictors of record sales. From the magnitude of the t-statistics we can see that the advertising budget and radio play had a similar impact, whereas the attractiveness of the band had less impact.
494
What is example of contintous variable?
we are talking about a variable with an infinite number of real values within a given interval, so something like height or age
495
What is an example of dichotomous variable?
variable that can only hold two distinct values like male and female
496
If outliers are present in the data then they impact the
line of best fit in MR
497
You would expect about 1% of cases to lie well outside the line of best fit, so in a large sample if you have
one or two outliers then could be okay
498
Rule of thumb to check for outliers is to check if there are any data points that in MR
are over 3 SD from the mean
499
All residuals should lie within ..... SDs for no outliers /normal amount of outliers in MR
-3 and 3 SD
500
Which variables (if any) are highly correlated in MR?
Weight, Activity, and the interaction between them are statistically significant
501
What do homoscedasticity and heteroscedasticity mean in MR? - (2)
* Homoscedasticity: similar variance of residuals (errors) across the variable continuum, e.g. equally accurate.
* Heteroscedasticity: variance of residuals (errors) differs across the variable continuum, e.g. not equally accurate
502
P plot plots a normal distribution against
your distribution
503
Diagram of normal, skewed to left (pos) and skewed to right (neg) of p-plots in MR
504
Durbin-Watson test values of 0,2,4 show that... in MR- (3)
* 0 = errors between pairs of observations are positively correlated
* 2 = independent errors
* 4 = errors between pairs of observations are negatively correlated
505
A Durbin-Watson statistic between ... and ... is considered to indicate that the data is not cause for concern = independent errors in MR
1.5 and 2.5
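The statistic itself is the sum of squared differences between successive residuals divided by the sum of squared residuals. A sketch with invented residuals:

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals.
    ~2 = independent errors; towards 0 = positive autocorrelation;
    towards 4 = negative autocorrelation."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residuals that flip sign every observation
# (negatively autocorrelated errors)
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # 3.0, well outside 1.5-2.5
```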
506
If R2 and adjusted R2 are similar, it means that your regression model
‘generalizes’ to the entire population.
507
If R2 and adjusted R2 are similar, it means that your regression model ‘generalizes’ to the entire population. Particularly for MR
for small N and where results are to be generalized use the adjusted R2
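The shrinkage can be sketched with the usual adjusted-R² formula (Wherry's formula, assumed here; the R², n and k values are hypothetical):

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2: shrinks R^2 towards the expected population fit,
    penalizing small samples (n) and many predictors (k)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Hypothetical: R^2 = 0.5 from n = 30 cases and k = 3 predictors
print(round(adjusted_r_squared(0.5, 30, 3), 3))  # 0.442
```

With a large n the adjusted value barely differs from R², which is why similar R² and adjusted R² suggest the model generalizes.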
508
3 types of multiple regression - (3)
1. Standard: To assess the impact of all predictor variables simultaneously
2. Hierarchical: To test predictor variables in a specific order based on hypotheses derived from theory
3. Stepwise: If the goal is accurate statistical prediction from a large number of predictor variables - computer driven
509
Diagram of excluded variables table in SPSS - (3) in MR
* Tells us that the OCD variable Interpretation of Intrusions would not have a significant impact on the model's ability to predict social anxiety
* The beta value of Interpretation of Intrusions is very small, indicating a small influence on the outcome variable
* Beta is the degree of change in the outcome variable for every 1 unit of change in the predictor variable.
510
What is multicollinearity in MR
When predictor variables correlate very highly with each other
511
When checking assumption fo regression, what does this graph tell you in MR
Normality of residuals
512
Which of the following statements about the t-statistic in regression is not true?
- The t-statistic is equal to the regression coefficient divided by its standard deviation
- The t-statistic tests whether the regression coefficient, b, is significantly different from 0
- The t-statistic provides some idea of how well a predictor predicts the outcome variable
- The t-statistic can be used to see whether a predictor variable makes a statistically significant contribution to the regression model
The t-statistic is equal to the regression coefficient divided by its standard deviation
513
A consumer researcher was interested in what factors influence people's fear responses to horror films. She measured gender and how much a person is prone to believe in things that are not real (fantasy proneness). Fear responses were measured too. In this table, what does the value 847.685 represent in MR
The residual error in the prediction of fear scores when both gender and fantasy proneness are included as predictors in the model.
514
A psychologist was interested in whether the amount of news people watch predicts how depressed they are. In this table, what does the value 3.030 represent in MR
The improvement in the prediction of depression by fitting the model
515
A consumer researcher was interested in what factors influence people's fear responses to horror films. She measured gender (0 = female, 1 = male) and how much a person is prone to believe in things that are not real (fantasy proneness) on a scale from 0 to 4 (0 = not at all fantasy prone, 4 = very fantasy prone). Fear responses were measured on a scale from 0 (not at all scared) to 15 (the most scared I have ever felt). Based on the information from model 2 in the table, what is the likely population value of the parameter describing the relationship between gender and fear in MR
Somewhere between −3.369 and −0.517
516
A consumer researcher was interested in what factors influence people's fear responses to horror films. She measured gender (0 = female, 1 = male) and how much a person is prone to believe in things that are not real (fantasy proneness) on a scale from 0 to 4 (0 = not at all fantasy prone, 4 = very fantasy prone). Fear responses were measured on a scale from 0 (not at all scared) to 15 (the most scared I have ever felt). How much variance (as a percentage) in fear is shared by gender and fantasy proneness in the population in MR
13.5%
517
Recent research has shown that lecturers are among the most stressed workers. A researcher wanted to know exactly what it was about being a lecturer that created this stress and subsequent burnout. She recruited 75 lecturers and administered several questionnaires that measured: Burnout (high score = burnt out), Perceived Control (high score = low perceived control), Coping Ability (high score = low ability to cope with stress), Stress from Teaching (high score = teaching creates a lot of stress for the person), Stress from Research (high score = research creates a lot of stress for the person), and Stress from Providing Pastoral Care (high score = providing pastoral care creates a lot of stress for the person). The outcome of interest was burnout, and Cooper’s (1988) model of stress indicates that perceived control and coping style are important predictors of this variable. The remaining predictors were measured to see the unique contribution of different aspects of a lecturer’s work to their burnout. Which of the predictor variables does not predict burnout in MR
Stress from research
518
Using the information from model 3, how would you interpret the beta value for ‘stress from teaching’ in MR
As stress from teaching increases by one unit, burnout decreases by 0.36 of a unit.
519
How much variance in burnout does the final model explain for the sample in MR
80.3%
520
A psychologist was interested in predicting how depressed people are from the amount of news they watch. Based on the output, do you think the psychologist will end up with a model that can be generalized beyond the sample?
No, because the errors show heteroscedasticity.
521
Diagram of no outliers for one assumption of MR
Note that you expect 1% of cases to lie outside this area so in a large sample, if you have one or two, that could be ok
522
Example of multiple regression - (3)
A record company boss was interested in predicting album sales from advertising.
Data: 200 different album releases
Outcome variable: sales (CDs and downloads) in the week after release
Predictor variables: the amount (in £s) spent promoting the album before release; number of plays on the radio
523
R is the correlation between
observed values of the outcome, and the values predicted by the model.
524
Output diagram what does output show in MR? - (2)
Difference between no predictors and model 1 (a); difference between model 1 (a) and model 2 (b). Our model 2 is significantly better at predicting the value of the outcome variable than the null model and model 1 (F(2, 197) = 167.2, p < .001) and explains 66% of the variance in our data (R² = .66).
525
What does this output show in terms of regression model in MR? - (3)
y = 0.09x1 + 3.59x2 + 41.12
For every £1,000 increase in advertising budget there is an increase of 87 record sales (B = 0.09, t = 11.99, p < .001).
For every additional play on Radio 1 per week there is an increase of 3,589 record sales (B = 3.59, t = 12.51, p < .001).
526
Report R^2, F statistic and p-value to 2DP for overall model - (3)
* R² = 0.09
* F statistic = 22.54
* p value: p < .001
527
Report beta and b values for video games, resitrctions and parental aggression to 2DP and p-value in MR
528
Which of the following statements about the assumptions of homoscedasticity and linearity is correct?
A. There is non-linearity in the data
B. There is heteroscedasticity in the data
C. There is both heteroscedasticity and non-linearity in the data
D. There are no problems with either heteroscedasticity or non-linearity
D - the data points show a random pattern
529
Determine the proportion of variance in salary that the number of years spent modelling uniquely explains once the models' age was taken into account (hierarchical regression):
A. 2.0%
B. 17.8%
C. 39.7%
D. 42.2%
A - the R² change in step 2 was .020
530
Test for multicollinearity (select tolerance and VIF statistics). Based on this information, what can you conclude about the suitability of your regression model?
A. The VIF statistic is above 10 and the tolerance statistic is below 0.2, indicating that there is no multicollinearity.
B. The VIF statistic is above 10 and the tolerance statistic is below 0.2, indicating that there is a potential problem with multicollinearity.
C. The VIF statistic is below 10 and the tolerance statistic is above 0.2, indicating that there is no multicollinearity.
D. The VIF statistic is below 10 and the tolerance statistic is above 0.2, indicating that there is a potential problem with multicollinearity.
B
531
Example of question using hierarchical regression - (2)
A fashion student was interested in factors that predicted the salaries of catwalk models. He collected data from 231 models. For each model he asked how much they earned per day (salary), their age (age), and how many years they had worked as a model (years_modelling). The student wanted to know if the number of years spent modelling predicted the models' salary after the models' age was taken into account.
532
The following graph shows:
A. Regression assumptions met
B. Non-linearity - could indicate a curve
C. Heteroscedasticity + non-linearity
D. Heteroscedasticity
A
533
A consumer researcher was interested in what factors influence people's fear responses to horror films. She measured gender (0 = female, 1 = male) and how much a person is prone to believe in things that are not real (fantasy proneness) on a scale from 0 to 4 (0 = not at all fantasy prone, 4 = very fantasy prone). Fear responses were measured on a scale from 0 (not at all scared) to 15 (the most scared I have ever felt). What is the likely population value of the parameter describing the relationship between gender and fear?
Somewhere between 3.369 and 0.517
534
What are the 3 types of t-tests? - (3)
1. One-samples t-test 2. Paired t-test 3. Independent t-test
535
Whats a one-sample t-test?
Compares the mean of the sample data to a known value
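A minimal stdlib sketch of this comparison; the IQ scores below are hypothetical, tested against the known population value of 100 (as in the IQ example later in this deck):

```python
# One-sample t: compare a sample mean to a known value (stdlib only).
# The IQ scores below are hypothetical, tested against mu = 100.
import math
import statistics

scores = [105, 112, 98, 110, 103, 108, 115, 101]
mu0 = 100  # known population value

mean = statistics.mean(scores)
se = statistics.stdev(scores) / math.sqrt(len(scores))  # SE of the mean
t = (mean - mu0) / se

print(round(t, 2))
```

The t value is then compared against the critical value of a t-distribution with N − 1 degrees of freedom.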
536
What are the assumptions of a one-sample t-test? - (4)
* DV = continuous (interval or ratio)
* Independent scores (no relation between scores on the test variable)
* Normal distribution via frequency histogram (normal shape), Q-Q plot (straight line) and a non-significant Shapiro-Wilk test
* Homogeneity of variances
537
Example of one-sample t-test RQ - (2)
Is the average IQ of Psychology students higher than that of the general population (100)? A particular factory's machines are supposed to fill bottles with 150 millilitres of product. A plant manager wants to test a random sample of bottles to ensure that the machines are not under- or over-filling the bottles.
538
What are the assumptions of the independent samples t-test (listing all of them)? - (7)
1. Independence - no relationship between the groups
2. Normal distribution via frequency histogram (normal shape), Q-Q plot (straight line) and a non-significant Shapiro-Wilk test
3. Equal variances
4. Homogeneity of variances (i.e., variances approximately equal across groups) via a non-significant Levene's test
5. DV = interval or continuous
6. IV = categorical
7. No significant outliers
539
What is an RQ example of an independent samples t-test?
Do dog owners in the country spend more time walking their dogs than dog owners in the city?
540
What are the assumptions of the paired t-test (listing all)? - (3)
DV is continuous
Related samples: the subjects in each sample, or group, are the same. This means that the subjects in the first group are also in the second group
Normal distribution via frequency histogram (normal shape), Q-Q plot (straight line) and a non-significant Shapiro-Wilk test
541
What is an example of RQ of paired t-test?
Do cats learn more tricks when given food or praise as positive feedback?
542
What is the decision framework for choosing a paired-sample (dependent) t-test? - (5)
1. What sort of measurement = continuous
2. How many predictor variables = one
3. What type of predictor variable = categorical
4. How many levels of categorical predictor = two
5. Same or different participants for each predictor level = same
543
What is the decision framework for choosing independent-t-test? (5)
1. What sort of measurement = continuous
2. How many predictor variables = one
3. What type of predictor variable = categorical
4. How many levels of categorical predictor = two
5. Same or different participants for each predictor level = different
544
If we are comparing differences between means of two groups in independent/paired t-test then all we are doing is
predicting an outcome based on membership of two groups
545
Independent and paired t-tests can fit into the idea of a
linear model
546
The t-distribution is defined by its
degrees of freedom - related to the sample size.
547
The t distribution has heavier tails for - (2)
lower degrees of freedom (small-N studies), reflecting increased uncertainty and a higher likelihood of observing extreme values than in large-N studies, which have lighter tails as the t distribution approaches the normal distribution
548
Independent and Paired T-tests have one predictor (X) variable with 2 levels and only .... outcome variable (Y)
one
549
When is an independent-means t-test used?
When there are 2 experimental conditions and different participants are assigned to each condition
550
What is independent-means t-test sometimes called as well?
independent-samples t-test
551
When is a dependent-means t-test used?
Used when there are 2 experimental conditions and same participants took part in both conditions of the experiment
552
What is dependent-means t-test sometimes referred to?
Matched pairs or paired samples t-test
553
For independent and paired t-tests we compare the difference between the sample means that we collected to the difference between sample means that we would expect if
there was no effect (i.e., null hypothesis was true)
554
Formula for calculating the t-test statistic (its form depends on whether the same or different participants are used in each experimental condition) in independent/paired t-tests
555
The formula for calculating the t-statistic shows that the t-test statistic is obtained by dividing the model/effect by the (in independent/paired t-tests)
error in the model
556
Expected difference in calculating t-test statistic in most cases is
0 - under the null hypothesis we expect the difference between the sample group means to be 0
557
If observed difference between sample means get larger in t-tests then more confident we become that
the null hypothesis should be rejected and that the two sample means differ because of the experimental manipulation
558
Both independent t-test and paired t-test are ... tests based on normal distribution
parametric tests
559
Since independent and paired t-tests are parametric tests they assume that the - (2)
* The sampling distribution is normally distributed - in a paired design this means the sampling distribution of the differences between scores is normal, not the scores themselves!
* Data are measured at least at the interval level
560
Since independent t-tests are used to test different groups of people they also assume - (2)

* Variances in the populations are roughly equal (homogeneity of variance) = Levene's test
* Scores are independent since they come from different people
561
Diagram of equation of calculating t-statistic from paired t-test and explain - (2)
* Compares the mean difference between our samples (D̄) to the difference we would expect to find between population means (μD), divided by the standard error of differences (sD / √N)
* If H0 is true, then we expect no difference between the population means, hence μD = 0
562
A small standard error of differences tells us that in paired-t-test
pairs of samples from a population have similar means to population
563
A large standard error of differences tells us that in paired t-test - (2)
that sample means can deviate quite a lot from the population mean and the sampling distribution of differences is more spread out
564
The average difference between a person's score in condition 1 and condition 2 (D̄) in a paired t-test is an indicator of
systematic variation in the data (represents experimental effect)
565
If the average difference (D̄) between our samples is large and the standard error of differences is small in a paired t-test then we can be confident that
the difference we observed in our sample is not a chance result and caused by experimental manipulation
566
How do we normally calculate the standard error?
SD divided by square root of sample size
567
How to calculate the standard error of differences in a paired t-test? (σD̄)
Standard deviation of differences divided by square root of sample size
568
the t-statistic in paired t-test is
the ratio of systematic variation in the experiment (the average difference, D̄) to unsystematic variation (the standard error of differences)
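This ratio can be sketched in a few lines of stdlib Python; the two conditions below are hypothetical scores for 8 participants, not data from this deck:

```python
# Paired t-statistic from difference scores (stdlib only).
# The two conditions below are hypothetical scores for 8 participants.
import math
import statistics

cond1 = [20, 25, 22, 30, 28, 24, 26, 23]
cond2 = [24, 28, 23, 35, 31, 29, 28, 26]

diffs = [b - a for a, b in zip(cond1, cond2)]
d_bar = statistics.mean(diffs)                           # systematic variation
se_d = statistics.stdev(diffs) / math.sqrt(len(diffs))   # unsystematic variation
t = d_bar / se_d                                         # expected difference under H0 is 0

print(round(t, 2))
```

A large D̄ relative to a small standard error of differences produces a large t, exactly as the card describes.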
569
When would we expect t statistic greater than 1 in paired-t-test equation?
If the experimental manipulation creates any kind of effect,
570
When would we expect t statistic less than 1 in paired t-test equation?
If the experimental manipulation is unsuccessful then we might expect the variation caused by individual differences to be much greater than that caused by the experiment
571
In paired, and generally independent, t-tests we can compare the obtained value of t against the maximum value we would expect to get by chance alone in a t-distribution with the same DFs, and if the value we obtain exceeds the

critical value, we can be confident that this reflects an effect of our IV
572
What does this paired samples correlation show?
People doing well in the first exam were likely to do well in the second exam regardless of the condition they were in, and the scores were significantly correlated (r = 0.664)
573
What does this SPSS output show? = paired t- test
t(19) = 2.72, p = 0.012
574
What does negative t-value mean? paired t-test.
First condition had smaller mean than second condition
575
What does the 95% confidence interval of the difference mean in SPSS output of a paired t-test? - (3)

* For 95% of samples (e.g., if we had 100 samples then in 95 of them) the constructed CIs contain the true (population) value of the mean difference
* CIs tell us the boundaries within which the true mean difference is likely to lie
* The true value of the mean difference is unlikely to be 0 if the CI does not contain 0
576
How to calculate effect size for independent and paired t-tests?
Using cohen's D
577
Diagram of calculating Cohen's D Statistic for sleep vs no sleep for paired
Subtract the smaller mean from the larger mean and divide by the smallest SD (control group)
578
What does Cohen's d of 0.20 represent
the difference between the groups is 1/5 of a standard deviation
579
Diagram of writing up paired t-test result
580
To calculate effect size for independent and paired t-tests, beside Cohen's D, we can also
calculate effect size r (above 0.50 is large effect) by converting t-value to r-value
581
With independent t-test there are two different equations that can be used depending on whether the samples
contain an equal number of people
582
With independent t-test since different participants participate in different condition, the pairs of scores will differ not just of experimental manipulation but also because of
other sources of variance (such as individual differences between participants' motivation, IQ etc..)
583
With dependent t-test we look at differences between pairs of scores because
scores came from same participant and so individual differences were eliminated
584
Equation of independent t-test of equal N sizes for each condition
585
Equation of independent t-test of equal N sizes becomes like the final form since - (3)
* We are looking at the difference between the overall means of the 2 samples and comparing it with the difference we would expect to get between the means of the 2 populations from which the samples come
* If H0 is true, the samples were drawn from the same population
* Therefore under H0, μ1 = μ2, so μ1 − μ2 = 0
586
Equation of independent t-test in numbers for equal N sizes
587
We use the variance sum law to obtain the estimate of the standard error for each ... in the independent t-test equation for equal N sizes
sample group
588
What does the variance sum law state?
variance of the sampling distribution is equal to the sum of the variances of the two populations from which the samples were taken
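A sketch of how the variance sum law feeds into the independent t for equal group sizes (SE of the difference = √(var1/n1 + var2/n2)); the two groups of exam scores below are hypothetical:

```python
# Independent t for two equal-sized groups via the variance sum law:
# SE of the difference = sqrt(var1/n1 + var2/n2). Hypothetical exam scores.
import math
import statistics

sleep = [70, 66, 72, 61, 68, 65]
no_sleep = [58, 62, 55, 60, 57, 62]

n = len(sleep)
se_diff = math.sqrt(statistics.variance(sleep) / n +
                    statistics.variance(no_sleep) / n)
t = (statistics.mean(sleep) - statistics.mean(no_sleep)) / se_diff

print(round(t, 2))
```

Note that `statistics.variance` uses the sample (n − 1) denominator, which is what the t formula requires.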
589
This independent t-test standard error formula combines the
standard error for two samples
590
In independent t-test when we want to compare two groups that contain different number of participants then equation ... is not appropriate
591
For comparing two groups with unequal number of participants in independent t-test then we use the
pooled variance estimate t-test
592
The pooled variance estimate t-test is used, which takes account of the

difference in sample sizes by weighting the variance of each sample
593
Formula of pooled variance estimate t-test - (2)
Each sample's variance is multiplied by its DF and added together, then divided by the sum of the weights (the sum of the two DFs). Larger samples are weighted more heavily than small ones, as they are closer to the population.
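The weighting above can be sketched as a small function; the variances and group sizes passed in are illustrative:

```python
# Pooled variance for unequal group sizes: each sample variance is weighted
# by its degrees of freedom (n - 1). The numbers below are illustrative.

def pooled_variance(var1, n1, var2, n2):
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# The larger group's variance (4.0, n = 21) pulls the pooled value towards it:
print(round(pooled_variance(4.0, 21, 9.0, 11), 3))
```

With a simple (unweighted) average the result would be 6.5; the df-weighting pulls it towards the larger sample's variance.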
594
In formula of pooled variance estimate t-test it weights the variance of each sample by the
number of degrees of freedom (N-1)
595
As with dependent t-test we compare obtained value of t in independent sample against the
maximum value we would expect to get by chance alone in t distribution with same DFs
596
What does this output show? - in independent t-test - (2)
The sleep condition scored an average exam score of 66.20 and the no-sleep condition an average of 58.73. Effect size (Cohen's d) = mean of sleep minus mean of no sleep divided by the standard deviation of sleep (control group) = (66.20 − 58.73) / 7.12
597
In an independent samples t-test we check Levene's test for equality of variances, which determines whether

we have equal variances across the groups or whether the variances are unequal
598
In independent t-test, Levene's test we are looking for a non-significant p-value which shows that
no statistically significant difference in variances between the two groups - report results from equal variances assumed
599
In independent t-test if Levene's test was significant then it means that
the variances of the 2 groups are statistically significantly different - report data from the 'equal variances not assumed' row
600
What does this output show in independent t-test? - (2)
* Levene's test is not significant (p = 0.970) so there is no statistically significant difference in variance between the two groups
* t(28) = 2.87, p = 0.008
601
Diagram of reporting independent t-test
602
Paired vs independent t-tests - who has better power?
Paired t-tests
603
Since paired-t-tests use same participants across conditions the ... is reduced dramatically compared to independent t-test
unsystematic variance
604
The non-parametric counterpart of dependent t-test is called
Wilcoxon signed rank test
605
The non-parametric counterparts of the independent t-test are

the Wilcoxon rank-sum test and the Mann-Whitney test
606
What does this SPSS output of Levene's test for an independent t-test show?
homogeneity of variance as assessed by Levene's Test for Equality of Variances (F = 1.58, p = .219)
607
Cohen's d for diet was 4.25. Is this a:
Small effect
Medium effect
Large effect
Large effect
608
The probability of a value of t occurring yields the p value for the difference between the means occurring by
chance
609
Although there are 2 ways to compute effect size, use

Cohen's d
610
Another example of two samples independent t-test scenario RQ Sample DV Hypothesis Test Sig - (6)
Research question: which of the two diet formulas is better for puppies?
Sample: 15 puppies were randomly assigned to each of the two diets (A and B).
Dependent variable: average daily weight gain (ADG, g/day) between 12 and 28 weeks of age.
Hypotheses: H0: µA = µB; Ha: µA ≠ µB.
Statistical test: two-samples independent t-test
Significance level: .05
611
We can check if there is no outliers in independent t-test by looking at
boxplots - no outlier here
612
To check normality of distribution for both independent groups for two-samples independent t-test, we can use..
histogram, Q-Q plot and tests of normality
613
Checking normality for independent, is it - (3) Research question: Which of the two diet formulas is better for puppies? Dependent variable: Average daily weight gain (ADG, g/day)
We don't have significant values for either group in the test of normality, and the histograms and plots look normal, so we have normality of distribution for both independent groups. Inspection of Q-Q plots and the non-significant Shapiro-Wilk tests (p > .05) indicate that ADG is normally distributed for both groups.
614
For checking homogeneity of variances in independent/paired we use
Levene's test
615
Checking homogeneity of variance in this two-sample independent t-test, what does it show? Research question: Which of the two diet formulas is better for puppies? Dependent variable: Average daily weight gain (ADG, g/day)
There was homogeneity of variance, as assessed by Levene's Test for Equality of Variances (F = 1.58, p = .219)
616
What does results of two-sample independent results show? Research question: Which of the two diet formulas is better for puppies? Dependent variable: Average daily weight gain (ADG, g/day)
This study found that puppies in diet B had statistically significantly higher average daily weight gain (89.29 ± 9.93 g/day) between 12 and 28 weeks of age compared to puppies in diet A (60.20 ± 6.85 g/day), t(27)= -9.24, p < .001.
617
In Cohen's D theoretically 3 SDs can be used - (3) which make very little difference
1. Pooled SD (over conditions)
2. Averaged SD
3. Control group SD
618
To calculate Cohen D for independent/paired t-test we need to use
control group SD
619
How to calculate Cohen's D for independent two samples t test for this group? Research question: Which of the two diet formulas is better for puppies? Dependent variable: Average daily weight gain (ADG, g/day) - (2)
d = (89.29 - 60.20) / 6.85 d = 4.25
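The same calculation as a small function, using the values from this card:

```python
# Cohen's d for the puppy-diet example: mean difference divided by the
# control group's SD (values taken from this card).

def cohens_d(mean1, mean2, sd_control):
    return (mean1 - mean2) / sd_control

d = cohens_d(89.29, 60.20, 6.85)
print(round(d, 2))  # 4.25 - a very large effect
```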
620
Cohen's D guidelines for small, medium large - (3)
d = 0.2 is considered a 'small' effect size
d = 0.5 represents a 'medium' effect size
d = 0.8 a 'large' effect size
621
What does ANOVA stand for?
Analysis of Variance
622
What is the decision tree for choosing a one-way ANOVA? - (5)
Q: What sort of measurement? A: Continuous
Q: How many predictor variables? A: One
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? A: More than two
Q: Same or different participants for each predictor level? A: Different
623
When is an ANOVA used?

If you are comparing more than 2 groups of an IV
624
Example of ANOVA RQ
Which is the fastest animal in a maze experiment - cats, dogs or rats?
625
We can't do three separate t-tests for example what is the fastest animal in a maze experiment - cats, dogs or rats as - (2)
Doing separate t-tests inflates the Type I error (false positive - e.g., telling a man he is pregnant). The repetition of multiple tests adds multiple chances of error, which may result in a larger α error level than the pre-set α level - familywise error
626
What is familywise or experimentwise error rate?
The error rate across all statistical tests conducted on the same experimental data
627
Familywise error is related to
type 1 error
628
What is the alpha level probability?

The probability of making a wrong decision by accepting the alternative hypothesis = Type I error
629
If we conduct 3 separate t-tests to test the comparison of which is the fastest animal in experiment - cats, dogs or rats with alpha level of 0.05 - (4)
* 5% chance of a Type I error (falsely rejecting H0) on each test
* The probability of making no Type I error is 95% for a single test
* However, for multiple tests the probability of making no Type I error decreases: for 3 tests together, 0.95 × 0.95 × 0.95 = 0.857
* This means the probability of at least one Type I error increases: 1 − 0.857 = 0.143 (a 14.3% chance of making at least one Type I error)
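The same arithmetic generalised to m independent tests (the three-test case is the one worked through above):

```python
# Familywise error rate for m independent tests at alpha = .05:
# P(at least one Type I error) = 1 - (1 - alpha) ** m
alpha = 0.05
for m in (1, 3, 10):
    fwe = 1 - (1 - alpha) ** m
    print(m, round(fwe, 3))
```

For 3 tests this gives 0.143 (matching the card); by 10 tests the familywise error rate has climbed to about 0.401.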
630
Much like model for t-tests we can write a general linear model for
ANOVA - 3 levels of categorical variable with dummy variables
631
When we perform a t-test, we test the hypothesis that the two samples have the same
mean
632
ANOVA tells us whether three or more means are the same so tests H0 that
all group means are equal
633
An ANOVA produces an
F statistic or F ratio
634
The F ratio produced in ANOVA is similar to t-statistic in a way that it compares the
amount of systematic variance in data to the amount of unsystematic variance i.e., ratio of model to its error
635
ANOVA is an omnibus test which means it tests for and tells us - (2)
* overall experimental effect
* tells us whether the experimental manipulation was successful
636
An ANOVA is an omnibus test and its F-ratio does not provide specific information about which
groups were affected due to experimental manipulation
637
Just like t-test can be represented by linear regression equation, ANOVA can be represented by a
a multiple regression equation for three means; the model accounts for the 3 levels of the categorical variable with dummy variables
638
As compared to independent samples t-test that compares means of two groups, one-way ANOVA compares means of
3 or more independent groups
639
In one-way ANOVA we use ... ... to test assumption of equal variances across groups
Levene's test
640
What does this one-way ANOVA output show?
Levene's test is non-significant so equal variances are assumed
641
What does this SPSS output show in one-way ANOVA?
F(2,42) = 5.94, p = 0.005, eta-squared = 0.22
642
How is effect size (eta-squared) calculated in one-way ANOVA?
Between groups sum of squares divided by total sum of squares
643
What is the eta-squared/effect size for this SPSS output and what does this value mean? - (2)
830.207/3763.632 = 0.22 22% of the variance in exam scores is accounted for by the model
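The same division, using the values from this card:

```python
# Eta-squared from the ANOVA table: between-groups SS / total SS
# (the two SS values quoted in this card).
ss_between = 830.207
ss_total = 3763.632
eta_squared = ss_between / ss_total
print(round(eta_squared, 2))  # 0.22 - a large effect
```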
644
Interpreting eta-squared, what do values of 0.01, 0.06 and 0.14 in a one-way ANOVA mean? - (3)
1. 0.01 = small effect 2. 0.06 = medium effect 3. 0.14 = large effect
645
What happens if the Levene's test is significant in the one-way ANOVA?
then use statistics in Welch or Brown-Forsythe test
646
The Welch and Brown-Forsythe tests make adjustments to the DF which, in a one-way ANOVA where Levene's test is significant, affects the

statistics you get, and whether the p-value is significant or not
647
What does this post-hoc table of Bonferroni tests show in one-way ANOVA ? - (3)
* Full sleep vs partial sleep, p = 1.00, not sig * - Full sleep vs no sleep , p = 0.007 so sig * - Partial sleep vs no sleep = p = 0.032 so sig
648
Diagram of example of grand mean
The mean of all scores regardless of participants' condition
649
What are the total sum of squares (SST) in one-way ANOVA?
difference of the participant’s score from the grand mean squared and summed over all participants
650
What is model sum of squares (SSM) in one-way ANOVA?
difference of the model score from the grand mean squared and summed over all participants
651
What is residual sum of squares (SSR) in one-way ANOVA?
difference of the participant’s score from the model score squared and summed over all participants
652
The residuals sum of squares (SSR) tells us how much of the variation cannot be
explained by the model and amount of variation caused by extraneous factors
653
We divide each sum of squares by its
DF to calculate the corresponding mean squares
654
For SST, the DF we divide by in a one-way ANOVA is
N-1
655
For SSM, the DF we divide by in a one-way ANOVA is the

number of groups (parameters), k, minus 1
656
For SSM, if we have this design ... then its DF in a one-way ANOVA will be
3-1 = 2
657
For SSR, the DF we divide by to calculate it in a one-way ANOVA is the
total sample size, N, minus the number of groups, k
658
Formulas of dividing each sum of squares by its DF to calculate it in one way ANOVA- (3)
* MST = SST / (N − 1)
* MSR = SSR / (N − k)
* MSM = SSM / (k − 1)
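Putting the sums of squares and their df together; the SS values below are from the sleep-study ANOVA output quoted earlier in this deck, where F(2, 42) = 5.94:

```python
# Mean squares and the F-ratio from the sums of squares and their df.
# SS values are from the sleep-study output quoted earlier (k = 3 groups,
# N = 45, so df = 2 and 42, matching F(2, 42) = 5.94).
N, k = 45, 3
ss_m = 830.207               # model sum of squares
ss_r = 3763.632 - ss_m       # residual sum of squares
ms_m = ss_m / (k - 1)        # average systematic variation
ms_r = ss_r / (N - k)        # average unsystematic variation
F = ms_m / ms_r

print(round(F, 2))
```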
659
SSM tells us the total variation that the
exp manipulation explains
660
What does MSM represent?
average amount of variation explained by the model (e.g. the systematic variation),
661
What does MSR represent?
average amount of variation explained by extraneous variables (the unsystematic variation).
662
The F ratio in one-way ANOVA can be calculated by
Dividing MSM (systematic variation) by MSR (unsystematic variation): F = MSM / MSR
663
If F ratio in one-way ANOVA is less than 1 then it represents a
non-significant effect
664
Why F less than 1 in one-way ANOVA represents a non-significant effect?
An F-ratio less than 1 means that MSR is greater than MSM = more unsystematic than systematic variation
665
If F is greater than 1 in a one-way ANOVA then it shows ... but doesn't tell us - (2)

Indicates that the experimental manipulation had some effect above and beyond the effect of individual differences in performance. Does not tell us whether the F-ratio is large enough to not be a chance result
666
When F statistic is large in one-way ANOVA then it tells us that the
MSM is greater than MSR
667
To discover if F statistic is large enough not to be a chance result in one-way ANOVA then
compare the obtained value of F against the maximum value we would expect to get by chance if the group means were equal in an F-distribution with the same degrees of freedom
668
High values of F in a one-way ANOVA are rare

by chance. Low degrees of freedom result in long tails of the distribution, so, much like other statistics, large values of F are more likely to crop up by chance in studies with low numbers of participants.
669
The F-ratio tells us in a one-way ANOVA whether the model fitted to the data accounts for more variation than extraneous factors, but does not tell us where
differences between groups lie
670
If F-ratio in one-way ANOVA is large enough to be statistically significant then we know
that one or more of the differences between means is statistically significant (e.g. either b2 or b1 is statistically significant)
671
It is necessary after conducting an one-way ANOVA to carry out further analysis to find out
which groups differ
672
The power of F statistic is relatively unaffected by
non-normality
673
when group sizes are not equal the accuracy of F is
affected by skew, and non-normality also affects the power of F in quite unpredictable ways
674
When group sizes are equal, the F statistic can be quite robust to
violations of normality
675
What tests do you do after performing a one-way ANOVA and finding significant F test? - (2)
* Planned contrasts * Post-hoc tests
676
What do post-hoc tests do? - (2)
* compare all pairwise differences in mean * Used if no specific hypotheses concerning differences has been made
677
What is the issue with post-hoc tests?
* because every pairwise combination is considered the type 1 error rate increases, so normally the type 1 error rate is reduced by modifying the critical value of p
678
Post-hoc tests are like one- or two-tailed hypotheses?
two-tailed
679
Planned contrasts are like one- or two-tailed hypotheses?
One-tailed hypothesis
680
What is the most common modification of the critical value for p in post-hoc in one-way ANOVA?
Bonferroni correction, which divides the standard critical value of p=0.05 by the number of pairwise comparisons performed
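The corrected critical value can be computed directly; with k = 3 groups every pairwise combination gives 3 comparisons:

```python
# Bonferroni correction: divide the critical p-value by the number of
# pairwise comparisons. With k = 3 groups there are 3 pairwise comparisons.
from math import comb

k = 3
n_comparisons = comb(k, 2)          # k-choose-2 pairwise comparisons
alpha_corrected = 0.05 / n_comparisons

print(n_comparisons, round(alpha_corrected, 4))
```

Each pairwise comparison is then tested against the stricter criterion (about .0167 here) instead of .05, keeping the familywise error at or below .05.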
681
Planned contrasts are used to investigate a specific
hypothesis
682
Planned contrasts do not test for every
pairwise difference so are not penalized as heavily as post hoc tests that do test for every difference
683
With planned contrasts you derive the hypotheses before the
data is collected
684
In planned contrasts, when one condition is used alone in a contrast it is
never used again
685
In planned contrasts with one-way ANOVA, the number of independent contrasts you can make is
k (number of groups) minus 1
686
How does planned contrasts work in SPSS?
Coefficients add to 0 for each contrast (e.g., -2 + 1 + 1) and once a group has been used alone in a contrast, the next contrasts set its coefficient to 0 (e.g., -2 to 0)
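The weight rules above can be checked mechanically with NumPy (a minimal sketch; the example weights match the placebo vs. two-dose design used elsewhere in this deck):

```python
# Hedged sketch: validating planned-contrast weights with NumPy.
# Contrast 1 compares the placebo (weight -2) against two experimental
# groups (+1, +1); contrast 2 then sets the placebo to 0 and compares
# low dose against high dose.

import numpy as np

contrast1 = np.array([-2, 1, 1])   # placebo vs. both dose groups
contrast2 = np.array([0, -1, 1])   # low dose vs. high dose

# Each set of weights must sum to zero ...
assert contrast1.sum() == 0 and contrast2.sum() == 0
# ... and independent (orthogonal) contrasts have a zero dot product.
assert np.dot(contrast1, contrast2) == 0
print("weights valid")
```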
687
Polynomial contrasts in one-way ANOVA can also look at trends more complex than linear, such as?
quadratic, cubic and quartic
688
The Bonferroni post-hoc ensures that the type 1 error is below in one-way ANOVA?
0.05
689
While the Bonferroni correction reduces Type 1 error (being conservative in the Type 1 error rate for each comparison), in one-way ANOVA it also
lacks statistical power (the probability of a Type II error [false negative] will be high), so it increases the chance of missing a genuine difference in the data
690
What post hoc tests to use if you have equal sample sizes and are confident that your group variances are similar in one-way ANOVA?
Use REGWQ or Tukey, as they have good power and tight control over the Type 1 error rate
691
What post hoc tests to use if your sample sizes are slightly different in one-way ANOVA?
Gabriel's procedure, because it has greater power
692
What post-hoc tests to use if your sample sizes are very different in one-way ANOVA?
if sample sizes are very different use Hochberg’s GT2
693
What post-hoc test to run if Levene's test of homogeneity of variance is significant in one-way ANOVA?
Games-Howell
694
What post-hoc test to use if you want guaranteed control over the Type 1 error rate in one-way ANOVA?
Bonferroni
695
What does this ANOVA error line graph show? - (2)
* Linear trend: as dose of Viagra increases so does mean level of libido * Error bars overlap, indicating no between-group differences
696
What does the within-groups row give details of in the ANOVA table?
SSR (unsystematic variation)
697
The between groups label in ANOVA table tells us
SSM (systematic variation)
698
What does this ANOVA table demonstrate? - (2)
* Linear trend is significant (p = 0.008) * Quadratic trend is not significant (p = 0.612)
699
When we do planned contrasts we arrange the weights such that we compare any group with a positive weight
with a negative weight
700
What does this output show if we conduct two planned comparisons of: one to test whether the control group was different to the two groups which received Viagra, and one to see whether the two doses of Viagra made a difference to libido - (2)
the table of weights shows that contrast 1 compares the placebo group against the two experimental groups, contrast 2 compares the low-dose group to the high-dose group
701
What does this table show if levene's test is non significant =equal variances assumed To test hypothesis that experimental groups would increase libido above the levels seen in the placebo group (one-tailed) To test another hypothesis that a high dose of Viagra would increase libido significantly more than a low dose one-way ANOVA - (3)
The significance value given in the table is two-tailed, and since the hypotheses are one-tailed we divide by 2. For contrast 1, we can say that taking Viagra significantly increased libido compared to the control group (p = .029/2 = .0145). The significance of contrast 2 tells us that a high dose of Viagra increased libido significantly more than a low dose (p(one-tailed) = .065/2 = .0325)
702
If making a few pairwise comparisons with an equal number of participants in each condition then use ...; if making a lot then use ... in one-way ANOVA - (2)
Bonferroni; Tukey
703
Assumptions of ANOVA - (5)
* Independence of data * DV is continuous; IV categorical (3 groups) * No significant outliers * DV approximately normally distributed for each category of the IV * Homogeneity of variance = Levene's test not significant
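The assumption checks and the omnibus test can be run with SciPy; this is a minimal sketch on invented data for three groups (group names and values are illustrative, not from the deck's datasets):

```python
# Hedged sketch: Levene's test for homogeneity of variance followed by a
# one-way ANOVA, using SciPy on made-up data for three groups.

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
placebo = rng.normal(3.0, 1.0, 15)    # invented libido scores
low_dose = rng.normal(4.0, 1.0, 15)
high_dose = rng.normal(5.0, 1.0, 15)

# Levene's test: a non-significant p (> .05) supports equal variances.
lev_stat, lev_p = stats.levene(placebo, low_dose, high_dose)

# One-way ANOVA: F is the ratio of systematic (between-group) to
# unsystematic (within-group) variance.
f_stat, f_p = stats.f_oneway(placebo, low_dose, high_dose)
print(f"Levene p = {lev_p:.3f}; F = {f_stat:.2f}, p = {f_p:.4f}")
```

With group means this far apart relative to their spread, F comes out large and significant, mirroring the deck's point that a large F means the model explains more variation than extraneous factors.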
704
ANOVA compares many means without increasing the chance of
type 1 error
705
In one-way ANOVA, we partition the total variance into
variance explained by the IV (between groups) and error variance (within groups)
706
An independent t-test is used to test for: A Differences between means of groups containing different participants when the sampling distribution is normal, the groups have equal variances and data are at least interval. B Differences between means of groups containing different participants when the data are not normally distributed or have unequal variances. C Differences between means of groups containing the same participants when the data are normally distributed, have equal variances and data are at least interval. D Differences between means of groups containing the same participants when the sampling distribution is not normally distributed and the data do not have unequal variances.
A Differences between means of groups containing different participants when the sampling distribution is normal, the groups have equal variances and data are at least interval
707
If you use a paired samples t-test A The same participants take part in both experimental conditions. B There ought to be less unsystematic variance compared to the independent t-test. C Other things being equal, you do not need as many participants as you would for an independent samples design. D All of these are correct.
D All of these are correct
708
Which of the following statements about the t distribution is correct? A It is skewed B In small samples it is narrower than the normal distribution C As the degrees of freedom increase, the distribution becomes closer to normal D It follows an exponential curve
C As the DF increase, the distribution becomes closer to normal
709
Which of the following sentences is an accurate description of the standard error? A It is the same as the standard deviation B It is the observed difference between sample means minus the expected difference between population means (if the null hypothesis is true) C It is the standard deviation of the sampling distribution of a statistic D It is the standard deviation squared
C It is the standard deviation of the sampling distribution of a statistic
710
A psychologist was interested in whether there was a gender difference in the use of email. She hypothesized that because women are generally better communicators than men, they would spend longer using email than their male counterparts. To test this hypothesis, the researcher sat by the computers in her research methods laboratory and when someone started using email, she noted whether they were male or female and then timed how long they spent using email (in minutes). Based on the output, what should she report? (NOTE: Check for the assumption of equality of variances.) A Females spent significantly longer using email than males, t(14) = –1.90, p = .079 B Females and males did not significantly differ in the time spent using email, t(7.18) = –1.90, p = .099 C Females and males did not significantly differ in the time spent using email, t(7.18) = –1.90, p < .003 D Females and males did not significantly differ in the time spent using email, t(14) = –1.90, p = .079
B Females and males did not significantly differ in the time spent using email, t(7.18) = –1.90, p = .099
711
Other things being equal, compared to the paired-samples (or dependent) t-test, the independent t-test: A Has more power to find an effect. B Has the same amount of power, the data are just collected differently. C Has less power to find an effect. D Is less robust.
C Has less power to find an effect.
712
Differences between group means can be characterized as a regression (linear) model if: A The outcome variable is categorical. B The groups have equal sample size. C The experimental groups are represented by a binary variable (i.e. code 1 and 0). D The difference between group means cannot be characterized as a linear model; they must be analyzed as an independent t-test.
The experimental groups are represented by a binary variable (i.e. code 1 and 0)
713
An experiment was done to look at whether different relaxation techniques could predict sleep quality better than nothing. A sample of 400 participants were randomly allocated to one of four groups: massage, hot bath, reading or nothing. For one month each participant received one of these relaxation techniques for 30 minutes before going to bed each night. A special device was attached to the participant’s wrist that recorded their quality of sleep, providing them with a score out of 100. The outcome was the average quality of sleep score over the course of the month. Which test could we use to analyse these data? A Regression only B ANOVA only C Regression or ANOVA D Chi-square
C. (Multiple) regression or (independent) ANOVA, as regression and ANOVA are the same model. The question did not mention a hypothesis of prediction, or it would be regression. Chi-square is only used when you have one categorical predictor and the outcome is categorical.
714
A researcher testing the effects of two treatments for anxiety computed a 95% confidence interval for the difference between the mean of treatment 1 and the mean of treatment 2. If this confidence interval includes the value of zero, then she cannot conclude that there is a significant difference in the treatment means: true or false. TRUE OR FALSE
TRUE
715
The student welfare office was interested in trying to enhance students’ exam performance by investigating the effects of various interventions. They took five groups of students before their statistics exams and gave them one of five interventions: (1) a control group just sat in a room contemplating the task ahead; (2) the second group had a yoga class to relax them; (3) the third group were told they would get monetary rewards contingent upon the grade they received in the exam; (4) the fourth group were given beta-blockers to calm their nerves; and (5) the fifth group were encouraged to sit around winding each other up about how much revision they had/hadn’t done (a bit like what usually happens). The final percentage obtained in the exam was the dependent variable. Using the critical values for F, how would you report the result in the table below? A Type of intervention did not have a significant effect on levels of exam performance, F(4, 29) = 12.43, p > .05. B Type of intervention had a significant effect on levels of exam performance, F(4, 29) = 12.43, p < .01. C Type of intervention did not have a significant effect on levels of exam performance, F(4, 33) = 12.43, p > .01. D Type of intervention had a significant effect on levels of exam performance, F(4, 33) = 12.43, p < .01.
B Type of intervention had a significant effect on levels of exam performance, F(4, 29) = 12.43, p < .01.
716
Imagine you compare the effectiveness of four different types of stimulant to keep you awake while revising statistics using a one-way ANOVA. The null hypothesis would be that all four treatments have the same effect on the mean time kept awake. How would you interpret the alternative hypothesis? A. All four stimulants have different effects on the mean time spent awake B. All stimulants will increase mean time spent awake compared to taking nothing C. At least two of the stimulants will have different effects on the mean time spent awake D. None of the above
C. At least two of the stimulants will have different effects on the mean time spent awake
717
When the between-groups variance is a lot larger than the within-groups variance, the F-value is ____ and the likelihood of such a result occurring because of sampling error is _____ A small; high B small; low C. large; high D. large; low
D. large; low
718
Subsequent to obtaining a significant result from an exploratory one-way independent ANOVA, a researcher decided to conduct three post hoc t-tests to investigate where the differences between groups lie. Which of the following statements is correct? A. The researcher should accept as statistically significant tests with a probability value of less than 0.016 to avoid making a Type I error B. The researcher should have conducted orthogonal contrasts instead of t-tests to avoid making a Type I error C. This is the wrong method to use. The researcher did not make any predictions about which groups will differ before running the experiment, therefore contrasts and post hoc tests cannot be used D. None of these options are correct
The researcher should accept as statistically significant tests with a probability value of less than 0.016 to avoid making a Type I error
719
A psychologist was looking at the effects of an intervention on depression levels. Three groups were used: waiting list control, treatment and post-treatment (a group who had had the treatment 6 months before). The SPSS output is below. Based on this output, what should the researcher report? A. The treatment groups had a significant effect on depression levels,F(2, 45) = 5.11. B. The treatment groups did not have a significant effect on the change in depression levels,F(2, 35.10) = 5.11. C. The treatment groups did not have a significant effect on depression levels,F(2, 26.44) = 4.35. D. The treatment groups had a significant effect on the depression levels,F(2, 26.44) = 4.35.
D. The treatment groups had a significant effect on the depression levels,F(2, 26.44) = 4.35.
720
Imagine we conduct a one-way independent ANOVA with four levels on our independent variable and obtain a significant result. Given that we had equal sample sizes, we did not make any predictions about which groups would differ before the experiment and we want guaranteed control over the Type I error rate, which would be the best test to investigate which groups differ? A. Orthogonal contrasts B. Helmert C. Bonferroni D. Hochberg’s GT2
C. Bonferroni
721
The student welfare office was interested in trying to enhance students’ exam performance by investigating the effects of various interventions. They took five groups of students before their statistics exams and gave them one of five interventions: (1) a control group just sat in a room contemplating the task ahead (Control); (2) the second group had a yoga class to relax them (Yoga); (3) the third group were told they would get monetary rewards contingent upon the grade they received in the exam (Bribes); (4) the fourth group were given beta-blockers to calm their nerves (Beta-Blockers); and (5) the fifth group were encouraged to sit around winding each other up about how much revision they had/hadn’t done (You’re all going to fail). The student welfare office made four predictions: (1) all interventions should be different from the control; (2) yoga, bribery and beta-blockers should lead to higher exam scores than panic; (3) yoga and bribery should have different effects than the beta-blocker drugs; and (4) yoga and bribery should also differ. Which of the following planned contrasts (with the appropriate group codings) are correct to test these hypotheses? ANSWER 1 ANSWER 2 ANSWER 3 ANSWER 4
ANSWER 1 - sum of all weights should be 0
722
Deciding what post hoc tests to run
723
Example of RQ for one way ANOVA - (3)
Is there a statistically significant difference in Frisbee throwing distance with respect to education status IV = Education with 3 levels = high school, graduate, postgrad DV = Frisbee throwing distance
724
What does this one-way ANOVA output show? Research question: Is there a statistically significant difference in Frisbee throwing distance with respect to education status? Variables: IV - Education, which has three levels: High School, Graduate and PostGrad; DV - Frisbee Throwing Distance
There was homogeneity of variance as assessed by Levene's Test for Equality of Variances (F (2,47) = 1.94, p = .155)
725
What does the results of one-way ANOVA show? Research question: Is there a statistically significant difference in Frisbee throwing distance with respect to education status? Variables: IV - Education, which has three levels: High School, Graduate and PostGrad; DV - Frisbee Throwing Distance
There was a statistically significant difference between groups as demonstrated by one-way ANOVA (F(2, 47) = 3.50, p = .038).
726
What does the results of one-way ANOVA show? --> post hoc Research question: Is there a statistically significant difference in Frisbee throwing distance with respect to education status? Variables: IV - Education, which has three levels: High School, Graduate and PostGrad; DV - Frisbee Throwing Distance
A Tukey post hoc test shows that the PostGrad group was able to throw the frisbee statistically significantly further than the High School group (p = .034). There was no statistically significant difference between the Graduate and High School groups (p = . 691) nor between the Graduate and PostGrad groups (p = .099).
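The post hoc logic behind results like this can be sketched in Python. The deck's output used Tukey (recent SciPy offers `scipy.stats.tukey_hsd`); for portability this sketch uses Bonferroni-corrected independent t-tests on invented data, with group names borrowed from the research question above:

```python
# Hedged sketch: post hoc pairwise comparisons after a significant F,
# using Bonferroni-corrected independent t-tests on invented data.

from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
groups = {
    "HighSchool": rng.normal(20, 5, 17),  # invented throwing distances
    "Graduate": rng.normal(22, 5, 17),
    "PostGrad": rng.normal(26, 5, 16),
}

pairs = list(combinations(groups, 2))
alpha_adj = 0.05 / len(pairs)  # Bonferroni: 0.05 / 3 pairwise tests

for a, b in pairs:
    t, p = stats.ttest_ind(groups[a], groups[b])
    verdict = "significant" if p < alpha_adj else "not significant"
    print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f} ({verdict})")
```

Each pairwise p is judged against the corrected threshold rather than 0.05, which is exactly the Type 1 error control the earlier cards describe.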
727
What is IV and DV of one -way ANOVA?
IV = 1 predictor, categorical with more than 2 levels; DV = 1 continuous
728
one-way ANOVA is also called
a between-subjects ANOVA
729
The regression equation for ANOVA can be extended to include one or more continuous variables that predict the outcome (or dependent variable). These continuous variables are not
part of the main experimental manipulation but have an influence on the dependent variable, are known as covariates and they can be included in an ANOVA analysis.
730
What does ANCOVA involve?
When we measure covariates and include them in an analysis of variance
731
Continuous variables, that are not part of the main experimental manipulation (don't want to study them) but have an influence on the dependent variable, are known as
covariates
732
From what we know from hierarchical regression model, if we enter covariate into regression model first then dummy variables representing exp manipulation after... - (2)
then we can see what effect an IV has after the effect of the covariate; we partial out the effect of the covariate
733
What are the two reasons for including covariates in ANOVA? - (2)
* To reduce within-group error variance = if we can explain some of the unexplained variance (SSR) in terms of other variables (covariates), we reduce SSR and more accurately assess the effect of SSM * Elimination of confounds = remove the bias of unmeasured variables that confound the results and influence the DV
734
ANCOVA has the same assumptions as ANOVA, e.g., normality and homogeneity of variance (Levene's test), except it has two more important assumptions, which are... - (2)
* Independence of the covariate and treatment effect * Homogeneity of regression slopes
735
For ANCOVA to reduce within-group variance by allowing the covariate to explain some of the error variance the covariate must be
independent from the experimental/treatment effect - (IVs - categorical predictors) ( ANCOVA assumption)
736
You should not use ANCOVA when the effect of the covariate overlaps with the experimental effect, as it means the
experimental effect is confounded with the effect of covariate = interpretation of ANCOVA is compromised
737
In ANCOVA, the effect of the covariate should be independent of the
experimental effect
738
When an ANCOVA is conducted we look at the overall relationship between the DV and the covariate, meaning we fit a regression line to
the entire dataset, ignoring which group participants belong to
739
When is homogeneity of regression slopes not satisfied in ANCOVA?
When the relationship between the outcome (dependent variable) and the covariate differs across the groups; the overall regression model is then inaccurate (it does not represent all of the groups)
740
What is the best way to test the homogeneity of regression slopes assumption in ANCOVA?
imagine plotting a scatterplot for each experimental condition with the covariate on one axis and the outcome on the other and calculate its regression line
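The "one regression line per group" idea above can be sketched numerically: fit a within-group slope of outcome on covariate for each condition and compare them (all data here are invented; roughly equal slopes support the assumption):

```python
# Hedged sketch: checking homogeneity of regression slopes by fitting a
# separate outcome-on-covariate regression within each group (made-up data).

import numpy as np

rng = np.random.default_rng(7)

def group_slope(covariate, outcome):
    """Slope of the within-group regression of outcome on covariate."""
    slope, _intercept = np.polyfit(covariate, outcome, deg=1)
    return slope

slopes = {}
# Simulate three conditions that share the same true slope (0.5).
for name in ["control", "15min", "30min"]:
    x = rng.uniform(0, 10, 30)            # covariate values
    y = 0.5 * x + rng.normal(0, 1, 30)    # outcome with noise
    slopes[name] = group_slope(x, y)

print(slopes)  # roughly equal slopes -> assumption plausible
```

If one group's slope pointed in a different direction (as in the 30-minute therapy example in the next cards), the assumption would be in doubt.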
741
Diagram of regression slopes satisfying homogeneity of regression slopes in ANCOVA
- exhibits the same slopes for control and 15 minute group
742
Diagram of regression slopes not satisfying homogeneity of regression slopes in ANCOVA
* The 30-minute therapy group exhibits a different slope compared to the others
743
What design, variables and test would you use to test this research scenario? - (5)
* ANCOVA * Independent samples-design * One IV , two conditions, interval regime and steady state * One covariate (age in years) * One DV (Race time)
744
What does this ANCOVA output show? - IV = Regime --> steady or interval - Covariate = Age - DV = Racetime- (2)
* Age F(1,27) = 5.36, p = 0.028, partial eta-squared = 0.17 (large and sig main effect) * Regime F(1,27) = 4.28, p = 0.048, partial eta-squared = 0.14 (large and sig main effect)
745
What DF do you report from this ANCOVA table for age for example...
DF for age and DF for error
746
Guidelines for interpreting partial eta-squared - (3)
η2 = 0.01 indicates a small effect. η2 = 0.06 indicates a medium effect. η2 = 0.14 indicates a large effect
747
What does this SPSS output for ANCOVA show? - (3)
* Interval has a marginal mean of race times of 56.57 * Steady state has a marginal mean of race times 62.97 * Estimated marginal means partialled out the effects of age and view mean scores of race times in interval and steady state if mean age scores (30.07) across two groups was held constant
748
What does this output show in terms of homogeneity of regression slopes? Age is the covariate, regime is the IV and the DV is race times - (2)
* The interaction effect of regime * age has a p-value of 0.980 * Since the p-value is not significant, the assumption of homogeneity of regression slopes has been met
749
What happens if the interaction effect of the IV and covariate is significant when testing homogeneity of regression slopes?
The relationship between the covariate and DV differs significantly between your groups and the assumption is not satisfied
750
For testing assumption of independence of covariate and experimental effect (IV) in SPSS, we need to add
the covariate (e.g., age) in the DV box instead of the covariate box, with the IV (e.g., regime) as the factor
751
What does this SPSS output show in terms of independence of covariate and exp effect (IV)? age is covariate (treated as DV) , regime is IV - (2)
* The p-value is not significant (p = 0.528), so there is no significant difference in age across the training regimes * and so the covariate and independent variable are assumed to be independent
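The independence check above amounts to a one-way ANOVA with the covariate as the outcome; a minimal SciPy sketch on invented ages (variable names ours):

```python
# Hedged sketch: testing independence of the covariate and the IV by
# running a one-way ANOVA with the covariate (age) as the outcome and
# the grouping variable (regime) as the factor, on invented data.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Ages drawn from the same distribution for both regimes, so the
# groups should not differ significantly on the covariate.
age_interval = rng.normal(30, 5, 15)
age_steady = rng.normal(30, 5, 15)

f_stat, p = stats.f_oneway(age_interval, age_steady)
# A non-significant p supports treating covariate and IV as independent.
print(f"F = {f_stat:.2f}, p = {p:.3f}")
```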
752
What does positive and negative b-value for covariates in ANCOVA parameter estimate box indicate? - (2)
If the b-value for the covariate is positive, then the covariate and the outcome variable have a positive relationship. If the b-value is negative, it means the opposite: the covariate and the outcome variable have a negative relationship
753
What does this table of parameter estimates show for ANCOVA where.. DV = PP'sLibido, IV = Dose of Viagara, Covariate is Partner'sLibido - (3)
* b for the covariate is 0.416 * Other things being equal, if a partner's libido increases by one unit, then the person's libido should increase by 0.416 units * Since b is positive, partner's libido has a positive relationship with the participant's libido
754
How is DF calculated for these t-tests in ANCOVA table? - (2)
N - p - 1, where N is the total sample size and p is the number of predictors (2 dummy variables and the covariate)
755
What post-hoc tests can you do with ANCOVA? - (3)
* Tukey LSD with no adjustments (not recommended) * Bonferroni correction (recommended) * Sidak correction
756
The sidak correction is similar to what correction?
Bonferroni correction
757
Sidak correction is less conservative than
Bonferroni correction
758
The Sidak correction should be selected if you are concerned about
loss of power associated with Bonferroni corrected values.
759
What does these planned contrast results show in ANCOVA? DV = Pp's Libido, IV = Dose of Viagara, Covariate is Partner's Libido - IV Dose: Level 3 = high dose, level 2 = low dose, level 1 = placebo (3)
* Contrast 1, comparing level 2 (low dose) against level 1 (placebo), is significant (p = 0.045) * Contrast 2, comparing level 3 (high dose) with level 1 (placebo), is significant (p = 0.010)
760
What does this Sidak correction post-hoc comparison in ANCOVA output show? DV = Libido, IV = Dose of Viagara, Covariate is Libido - IV Dose: Level 3 = high dose, level 2 = low dose, level 1 = placebo - - (3)
* The significant difference between the high-dose and placebo groups remains (p = .030) * high-dose and low-dose groups do not significantly differ (p = .93) * Low dose and placebo groups do not significantly differ (p value = 0.130)
761
What do these scatterplot of regression lines show in terms of homogenity of regression slopes? DV = Libido, IV = Dose of Viagara, Covariate is Libido - IV Dose: Level 3 = high dose, level 2 = low dose, level 1 = placebo (3)
For placebo and low dose there appears to be a positive relationship between the participant's libido and that of their partner. However, in the high-dose condition there appears to be no relationship at all (if anything, a negative one). This casts doubt on whether homogeneity of regression slopes is satisfied, as the slopes are not all the same (going in the same direction)
762
What effect sizes can we use for ANCOVA/ANOVA? - (4)
* eta-squared * partial eta-squared (ANCOVA) * omega squared = used when there is an equal number of participants in each group * r
763
How is eta-squared calculated?
Dividing the effect of interest (SSM) by the total variance in the data (SST)
764
How is partial eta-squared calculated for ANCOVA??
SS_Effect / (SS_Effect + SS_Residual)
765
What is the difference between partial and eta-squared?
This differs from eta squared in that it looks not at the proportion of total variance that a variable explains, but at the proportion of variance that a variable explains that is not explained by other variables in the analysis
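The distinction between the two effect sizes can be made concrete with a small Python sketch (the SS values are invented for illustration; the function names are ours):

```python
# Hedged sketch: eta-squared vs. partial eta-squared from sums of
# squares (SS values invented for illustration).

def eta_squared(ss_effect: float, ss_total: float) -> float:
    """Proportion of the TOTAL variance that the effect explains."""
    return ss_effect / ss_total

def partial_eta_squared(ss_effect: float, ss_residual: float) -> float:
    """Proportion of the variance NOT explained by other variables
    that the effect explains."""
    return ss_effect / (ss_effect + ss_residual)

# Example: SS_effect = 20, SS_covariate = 30, SS_residual = 50,
# so SS_total = 100. The covariate's 30 shrinks the denominator of
# the partial measure but not its numerator.
print(eta_squared(20, 100))          # 0.2
print(partial_eta_squared(20, 50))   # 20/70, roughly 0.286
```

The partial measure is larger here because the variance explained by the covariate is excluded from its denominator, which is exactly the contrast the card above draws.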
766
What test is used to investigate this question and how is it conducted? - (2) We want to know whether or not studying technique (3 levels) has an impact on exam scores, but we want to account for the grade that the student already has in the class. 
* ANCOVA * ANCOVA is conducted to determine if there is a statistically significant difference between different studying techniques (IV) on exam score (DV) after controlling for current grade (covariate)
767
In ANCOVA, we partition the total variance into
variance attributable to the IV, the covariate, and error
768
In ANCOVA, examine influence of categorical IVs on DV while removing the effect of
covariate factor(s)
769
In ANCOVA, the covariate correlates with the ... but not the ..
correlates with outcome DV but not with IV
770
What is an example of covariate?
baseline pre-test scores can be used as a covariate to control for initial group differences in test performance
771
In ANCOVA, the IVS, Covariates and DVs are.. - (2)
* IVs are categorical * Covariates are metric (quantitative) and independent of the IV * DV is metric
772
In ANCOVA, you have - (2)
1 DV: continuous; 2 predictor variables with 2 levels or more that are categorical and continuous
773
What is an example of continuous? - (3)
An infinite number of possible values the variable can take on e.g., interval = equal intervals on the variable represent equal differences in what is measured, e.g. the difference between 600 ms and 800 ms equals the difference between 1300 ms and 1500 ms e.g., ratio = same as interval but with a clear definition of 0, like height or weight
774
What is example of categorical variable? - (3)
A variable that cannot take on all values within the limits of the variable - entities are divided into distinct categories e.g., nominal = 2 or more categories, like whether someone is vegan or vegetarian e.g., ordinal = categories have an order, like people who got fail, pass, merit or distinction
775
What does independence of covariate mean in ANCOVA?
Independence of the covariate and treatment effect means that the categorical predictors and the covariate should not be dependent on each other
776
What does homogeneity of regression slopes mean in ANCOVA?
Homogeneity of regression slopes means that the covariate has a similar relationship with the outcome measure, irrespective of the level of the categorical variable - in this case the group
777
For homogeneity of regression slopes in ANCOVA, there are
alternative, somewhat more advanced, methods to account for such differences (which are not, in general, uninteresting), but for the ANCOVA analysis they do present an issue
778
In ANCOVA, between subject effects we quote DF such as for dose as...
Quote df for the effect and error, e.g. 2,26
779
In ANCOVA, the adjusted means table in SPSS shows.. - (2) Outcome/DV = happiness measure ranging from 0 to 10 (as happy as I can imagine) = continuous = interval The fixed factor (IV) is dose of therapy: people have 15 minutes of puppy therapy or 30 minutes The covariate is how much they love puppies = continuous = interval
The group means can be recalculated once the effect of the covariate is ‘discounted’ = the impact of the covariate is taken into account and the mean adjusted at each level of the predictor variable. These values can differ markedly from the original group means and help with interpretation.
780
ANCOVA is extension of ANOVA as - (2)
1. Control for covariates (continuous variables you may not necessarily want to study) 2. Study combinations of categorical and continuous variables – the covariate becomes the variable of interest rather than the one you control
781
What ANCOVA was conducted? We want to know whether or not studying technique has an impact on exam scores, but we want to account for the grade that the student already has in the class. 
A one-way ANCOVA was conducted to determine whether there is a statistically significant difference between different study techniques on students' exam scores after controlling for their current grades.
782
Assumptions of ANCOVA - (8)
* Independent variables should be categorical variables. * The dependent variable and covariate should be continuous variables (measured on an interval or ratio scale). * Observations should be independent - don't put people into more than one group. * Normality: the dependent variable should be roughly normal for each category of the independent variables. * Data (and regression slopes) should show homogeneity of variance. * The covariate and dependent variable (at each level of the independent variable) should be linearly related. * Your data should be homoscedastic. * The covariate and the independent variable shouldn't interact; in other words, there should be homogeneity of regression slopes.
783
In one-way ANOVA we partition the total variance into
variance explained by the IV (between groups) and error variance (within groups)
784
A psychologist was interested in the effects of different fear information on children’s beliefs about an animal. Three groups of children were shown a picture of an animal that they had never seen before (a quoll). Then one group was told a negative story (in which the quoll is described as a vicious, disease-ridden bundle of nastiness that eats children’s brains), one group a positive story (in which the quoll is described as a harmless, docile creature who likes nothing more than to be stroked), and a final group weren’t told a story at all. After the story children rated how scared they would be if they met a quoll, on a scale ranging from 1 (not at all scared) to 5 (very scared indeed). To account for the natural anxiousness of each child, a questionnaire measure of trait anxiety was given to the children and used in the analysis what analysis has been used - Independent analysis of variance Repeated-measures analysis of variance Mixed analysis of variance Analysis of covariance
Analysis of covariance (ANCOVA)
785
A psychologist was interested in the effects of different fear information on children’s beliefs about an animal. Three groups of children were shown a picture of an animal that they had never seen before (a quoll). Then one group was told a negative story (in which the quoll is described as a vicious, disease-ridden bundle of nastiness that eats children’s brains), one group a positive story (in which the quoll is described as a harmless, docile creature who likes nothing more than to be stroked), and a final group weren’t told a story at all. After the story children rated how scared they would be if they met a quoll, on a scale ranging from 1 (not at all scared) to 5 (very scared indeed). To account for the natural anxiousness of each child, a questionnaire measure of trait anxiety was given to the children and used in the analysis what is covariate?
Trait anxiety (the child's natural fear level)
786
Which of the designs below would be best suited for ANCOVA? A. Participants were randomly allocated to one of two stress management therapy groups, or a waiting list control group. Their levels of stress were measured and compared after 3 months of weekly therapy sessions. B. Participants were allocated to one of two stress management therapy groups, or a waiting list control group based on their baseline levels of stress. The researcher was interested in investigating whether stress after the therapy was successful partialling out their baseline anxiety. C. Participants were randomly allocated to one of two stress management therapy groups, or a waiting list control group. The researcher was interested in the relationship between the therapist’s ratings of improvement and stress levels over a 3-month treatment period. D.Participants were randomly allocated to one of two stress management therapy groups, or a waiting list control group. Their baseline levels of stress were measured before treatment, and again after 3 months of weekly therapy sessions. (2)
D, since baseline levels of stress are used as the covariate - a control for pre-existing differences when looking at the impact the treatment has had at the 3-month assessment. Not B, since groups were allocated based on baseline levels of stress (the covariate and IV would be correlated, which is problematic); A and C describe a one-way independent ANOVA
787
A psychologist was interested in finding a cure for hangovers. She took 50 people out on the town one night and got them drunk. The next morning, she allocated them to either a control condition (drink water only) or an experimental hangover cure condition (a beetroot, raw egg and chilli smoothie). This is the variable ‘Group’. Two hours later she then measured how well they felt on a scale from 0 ('I feel fine’) to 10 ('I am about to die')(Variable = Hangover). She also realized she ought to ask them how drunk they were the night before and control for this in the analysis, so she measured this on another scale of 0 ('sober') to 10 (‘very drunk’) (Variable = Drunk). The psychologist hypothesised that the smoothie drink would lead to participants feeling better, after having accounted for the previous night’s drunkenness. What test?
ANCOVA
788
A psychologist was interested in finding a cure for hangovers. She took 50 people out on the town one night and got them drunk. The next morning, she allocated them to either a control condition (drink water only) or an experimental hangover cure condition (a beetroot, raw egg and chilli smoothie). This is the variable ‘Group’. Two hours later she then measured how well they felt on a scale from 0 ('I feel fine’) to 10 ('I am about to die')(Variable = Hangover). She also realized she ought to ask them how drunk they were the night before and control for this in the analysis, so she measured this on another scale of 0 ('sober') to 10 (‘very drunk’) (Variable = Drunk). The psychologist hypothesised that the smoothie drink would lead to participants feeling better, after having accounted for the previous night’s drunkenness. Identify IV (fixed), DV and covariate - (3)
- IV: Group - DV: Hangover - Covariate: Drunk
789
What is the decision tree of choosing a two-way independent ANOVA? - (5)
Q: What sort of measurement? A: Continuous
Q: How many predictor variables? A: Two or more
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? A: Not relevant
Q: Same or different participants for each predictor level? A: Different
790
Partial eta-squared should be reported for
ANOVA and ANCOVA
791
What is a drawback of eta-squared?

As you add more variables to the model, the proportion of variance explained by any one variable will automatically decrease.
792
How is eta-squared calculated?
Sum of squares for the effect (SSM, between groups) divided by the total sum of squares (SST - everything: effects, errors and interactions)
793
In one-way ANOVA eta-squared and partial eta-squared will be equal, but this is not true in models with
more than one IV
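As a quick illustration, the two effect-size measures can be sketched in a few lines of Python (the function names and numbers are mine, purely for illustration, not SPSS terminology). In the one-way case SST = SSM + SSR, so the two measures agree:

```python
# Minimal sketch: eta-squared vs. partial eta-squared from sums of squares.

def eta_squared(ss_effect, ss_total):
    """Proportion of TOTAL variance explained by the effect."""
    return ss_effect / ss_total

def partial_eta_squared(ss_effect, ss_error):
    """Proportion of variance explained once other effects are ignored:
    SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)

# One-way case: SS_total = SS_effect + SS_error, so the two coincide.
ss_effect, ss_error = 30.0, 90.0
print(eta_squared(ss_effect, ss_effect + ss_error))   # 0.25
print(partial_eta_squared(ss_effect, ss_error))       # 0.25
```

With more than one IV, SS_total also contains the other effects' sums of squares, so eta-squared shrinks while partial eta-squared does not.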
794
Two-way Independent ANOVA is also called an
Independent Factorial ANOVA
795
What is a factorial design?
When an experiment has two or more IVs
796
What are the 3 types of factorial design? - (3)
1. Independent factorial design 2. Repeated-measures (related) factorial design 3. Mixed design
797
What is independent factorial design?
* There are many IVs or predictors, each measured using different pps (between groups)
798
What is repeated-measures (related) factorial design?
* Many IVs or predictors have been measured, but the same pps are used in all conditions
799
What is mixed design?
* Many IVs or predictors have been measured; some measured with different pps, others with the same pps
800
Which design does independent factorial ANOVA use?
Independent factorial design
801
What is factorial ANOVA?
When we use ANOVA to analyse a situation in which there are two or more IVs
802
What is difference between one way and two way ANOVA?
A one-way ANOVA has one independent variable, while a two-way ANOVA has two.
803
Example of two-way independent factorial ANOVA The study tested the prediction that subjective perceptions of physical attractiveness become inaccurate after drinking alcohol which is IV, DVs - What are the IVs, DVs- (3)
IV = Alcohol - 3 levels: placebo, low dose, high dose
IV = Face type - 2 levels: unattractive, attractive
DV = Physical attractiveness score
804
Two way independent ANOVA can be fit into the idea of
linear model
805
The study tested the prediction that subjective perceptions of physical attractiveness become inaccurate after drinking alcohol IV = Alcohol - 3 levels = Placebo, Low dose, High dose Iv = face type 2 levels = unattractive, attractive DV = Physical attractiveness score Create a linear model for this two-way ANVOA scenario which adds interaction term and explain why is it important - (3)
* The first equation models the two predictors in a way that allows them to account for variance in the outcome separately, much like a multiple regression model * The second equation adds a term that models how the two predictor variables interact with each other to account for variance in the outcome that neither predictor can account for alone. * The interaction is important to us because it tests our hypothesis that alcohol will have a stronger effect on the ratings of unattractive than attractive faces
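A rough Python sketch of how the interaction term extends the model: the interaction predictors are just products of the dummy-coded main-effect predictors. All dummy codes and beta values below are invented for illustration, not taken from the study:

```python
# Dummy codes for one hypothetical participant:
# low-dose alcohol condition, rating an unattractive face.
low_dose, high_dose = 1, 0    # placebo is the baseline (0, 0)
unattractive = 1              # attractive faces are the baseline

# Main-effects-only model: b0 + b1*low + b2*high + b3*unattractive.
# The interaction model adds product terms:
low_x_unattr = low_dose * unattractive
high_x_unattr = high_dose * unattractive

# Made-up coefficients, purely to show the prediction arithmetic.
b0, b1, b2, b3, b4, b5 = 5.0, 0.5, -1.0, -2.0, 1.0, 2.5

predicted = (b0 + b1 * low_dose + b2 * high_dose
             + b3 * unattractive
             + b4 * low_x_unattr + b5 * high_x_unattr)
print(predicted)  # 4.5
```

The product terms let the effect of face type differ across alcohol doses, which is exactly what the interaction hypothesis requires.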
806
How do we know coefficients in the model are significant in two-way ANOVA?

We follow the same routine as one-way ANOVA: compute sums of squares for each factor of the model (and their interaction) and compare them to the residual sum of squares, which measures what the model cannot explain
807
How is two-way independent ANOVA similar to one-way ANOVA?
We still find the total sum of squared errors (SST) and break this variance down into variance that can be explained by the experiment (SSM) and variance that cannot be explained (SSR).
808
How is two-way INDEPENDENT ANOVA different to one-way INDEPENDENT ANOVA? - (3)
in two-way ANOVA, the variance explained by the experiment is made up of not one experimental manipulation but two. Therefore, we break the model sum of squares down into variance explained by the first independent variable (SSA), variance explained by the second independent variable (SSB) and variance explained by the interaction of these two variables (SSA × B)
809
How to calculate total sum of squares SST in two-way independent ANOVA?
SST = the sum, over every score in the experiment, of (score - grand mean)^2; equivalently the grand variance multiplied by (N - 1)
810
What is SST DF in two-way independent ANOVA?
N- 1
811
How to compute model sum of squares SSM in two-way independent ANOVA? - (2)
Sum over all groups (each pairing of a level of one IV with a level of the other) of: n (the number of scores in the group) multiplied by the squared difference between that group's mean and the grand mean of all pps regardless of group
812
How to compute degrees of freedom of SSM in two-way independent ANOVA?
(g-1)
813
How many groups are there in this research two-way independent ANOVA? IV = Alcohol - 3 levels = Placebo, Low dose, High dose Iv = face type 2 levels = unattractive, attractive DV = Physical attractiveness score
placebo + attractive
placebo + unattractive
low dose + attractive
low dose + unattractive
high dose + attractive
high dose + unattractive
= 6 groups
814
How is SSA (face type) computed in two-way independent ANOVA? IV = Alcohol - 3 levels = Placebo, Low dose, High dose Iv = face type 2 levels = unattractive, attractive DV = Physical attractiveness score - (2)
Consider only the levels of the first IV (SSA), collapsing across the other (e.g., all pps who rated attractive faces vs. all pps who rated unattractive faces). For each level: multiply the number of pps at that level by the squared difference between that level's mean and the overall grand mean, then add the results together
815
What is the degrees of freedom in SSA in TWO-WAY INDEPENDENT ANOVA?
DF = (g-1) so if male and female then 2 -1 = 1
816
How to compute SSB in two-way independent ANOVA for alcohol type IV = Alcohol - 3 levels = Placebo, Low dose, High dose IV = face type 2 levels = unattractive, attractive DV = Physical attractiveness score - (2)

Same formula as SSA but for the second IV: for each level of the second IV (collapsing across the first), multiply the number of pps at that level by the squared difference between that level's mean and the grand mean of all pps regardless of group, then sum across levels
817
What is DF for SSB in two-way independent ANOVA?
number of grps in second IV minus 1
818
SS A X B in two-way independent ANOVA is calculating how much variance is explaiend
by the interaction of 2 variables
819
How is SS A X B (interaction term) calculated in two-way ANOVA?
SS A X B = SSM - SSA - SSB
820
How is SS A X B'S DF calculated in two-way independent ANOVA?
df A X B = df M - df A - df B
821
The SSR in two-way independent ANOVA, is similar to one-way ANOVA as it represents the
individual differences in performance or the variance that can’t be explained by factors that were systematically manipulated.
822
How to calculate SSR in two-way independent ANOVA?
* Take the variance of each group (e.g., attractive face + placebo), multiply it by one less than the number of people in that group (n - 1), do this for each group, and add the results together
823
How to calculate SSR's degrees of freedom in two-way independent ANOVA?

number of groups in the study × (number of scores per group minus 1)
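The whole partition (SST, SSM, SSA, SSB, SSA×B, SSR) can be sketched in plain Python. The 2×2 dataset below is hypothetical, invented only to make the arithmetic concrete:

```python
from statistics import mean

# Scores per cell: (level of IV A, level of IV B) -> list of scores.
cells = {
    ("A1", "B1"): [3, 4, 5],
    ("A1", "B2"): [6, 7, 8],
    ("A2", "B1"): [4, 5, 6],
    ("A2", "B2"): [9, 10, 11],
}

all_scores = [x for scores in cells.values() for x in scores]
grand = mean(all_scores)

# SST: every score's squared deviation from the grand mean.
sst = sum((x - grand) ** 2 for x in all_scores)

# SSM: each cell (group) mean's squared deviation, weighted by cell size.
ssm = sum(len(s) * (mean(s) - grand) ** 2 for s in cells.values())

def main_effect_ss(level_index):
    """SS for one factor: pool scores over that factor's levels."""
    levels = {}
    for key, scores in cells.items():
        levels.setdefault(key[level_index], []).extend(scores)
    return sum(len(s) * (mean(s) - grand) ** 2 for s in levels.values())

ssa = main_effect_ss(0)       # first IV
ssb = main_effect_ss(1)       # second IV
ss_axb = ssm - ssa - ssb      # interaction: what's left of the model SS
ssr = sst - ssm               # residual: what the model can't explain

print(sst, ssm, ssa, ssb, ss_axb, ssr)
```

Note that SST = SSM + SSR and SSM = SSA + SSB + SSA×B, matching the partition described in the cards above.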
824
Diagram of calculating mean sums of squares in two-way ANOVA independent
Each mean square is its sum of squares divided by its degrees of freedom - MSA = SSA/dfA, MSB = SSB/dfB, MSA×B = SSA×B/dfA×B, MSR = SSR/dfR - and each F-ratio is the effect's mean square divided by MSR
825
What effect sizes can we calculate with two-way independent ANOVA? - (2)
* Partial eta-squared * Omega-squared if advised
826
What to do whe assumptions are violated in factorial independent ANOVA? - (3)
* There is no simple non-parametric counterpart of factorial ANOVA
* If the assumption of normality is violated, use the robust methods described by Wilcox (implemented in his R files)
* If the assumption of homogeneity of variance is violated, implement corrections based on the Welch procedure
827
Example of a research scenario of two-way independent ANOVA Pick out IVs and DVs - (4)

- Independent samples design
- Two IVs, both with 2 conditions: drug type (A, B) and onset (early, late)
- One DV: cognitive performance
- Two-way independent ANOVA
828
What does this two-way ANOVA independent design SPSS output show?
- Levene's test is not significant, so equal variances can be assumed
829
What happens if Levene's test is significant in two-way independent ANOVA?
steps taken to equalise variances through data transformation
830
What does this two-way independent ANOVA table show - (4)
- Drug : F(1,24) = 5.58, p = 0.027, partial eta-squared = 0.19 (large effect + sig effect) - Onset: F(1,24) = 14.43, p = 0.001, partial eta-squared = 0.38 (large effect + sig effect) - Interaction Drug * Onset: F(1,24) = 9.40, p = 0.005, partial eta-squared = 0.28 (large effect + sig effect) - We got two sig main effects and sig interaction effect which are all quite large effect sizes
831
What does this SPSS output show for two-way independent ANOVA? - (3)
Drug B gives a higher score on the cognitive test than drug A, and this is a significant main effect (the CI does not contain 0; also shown by the main effect analysis)
Early onset scores higher on average than late onset (CI does not contain 0; also shown by the main effect analysis)
Importantly, these main effects ignore the effect of the other IV - the drug result at the top holds regardless of early/late onset, for example - and tell us nothing about the interaction
832
What does this interaction plot show TWO WAY ANOVA? - (6)
* Blue line is early onset
* Green line is late onset
* For late onset, drug B led to higher mean scores on the test than drug A
* For early onset, drug A led to slightly higher mean scores than drug B
* Drug A was more effective than drug B for early onset, but the difference was marginal
* Drug B was substantially more effective than drug A for late onset
833
Non-parallel lines in an interaction plot indicate a

significant interaction effect
834
We can follow interactions in two-way ANOVA with simple effects analysis which - (2)
* looks at the effect of one IV at individual levels of the other IV
* tests whether the marginal/substantial differences are significant
835
The SSM in two-way independent ANOVA is broken down into three components:
variance explained by the first independent variable (SSA), variance explained by the second independent variable (SSB ) and variance explained by the interaction of these two variables (SSA × B ).
836
Example of difference of one-way ANOVA vs two-way ANOVA (independent) - (2)
* One-way ANOVA has one categorical IV (level of education: high school, college degree, graduate degree)
* Two-way ANOVA has two categorical IVs - level of education (high school, college degree, graduate degree) and zodiac sign (Libra, Pisces)
837
In two-way independent ANOVA, you need how many DV and IV?
1 DV and 2 or more categorical predictors
838
What test is used for this scenario? A psychologist wanted to test a new type of drug treatment for ADHD called RitaloutTM. The makers of this drug claimed that it improved concentration without the side effects of the current leading brand of ADHD medication. To test this, the psychologist allocated children with ADHD to two experimental groups, one group took RitaloutTM(New drug), the other took the current leading brand of medication (Old drug) (Variable = Drug). To test the drugs’ effectiveness, concentration was measured using the Parker-Stone Concentration Scale, which ranges from 0 (low concentration) to 12 (high concentration) (Variable = Concentration). In addition, the psychologist was interested in whether the effectiveness of the drug would be affected by whether children had ‘inattentive type’ ADHD or ‘hyperactive type’ ADHD (Variable = ADHD subtype).
Two-way independent ANOVA
839
A researcher was interested in measuring the effect of 3 different anxiety medications on patients diagnosed with anxiety disorder. They measured anxiety levels before and after treatment of 3 different treatment groups plus a control group. The researchers also collected data on depression levels. Identify the IV, DV, and covariates! - and design (3)
IV = 3 different types anxiety medications and control grp DV: Anxiety levels after treatment of grps Covariate = anxiety before treatment, depression levels ANCOVA
840
Researchers wanted to see how much people of different education levels are interested in politics. They also believed that there might be an effect of gender. They measured political interest with a questionnaire in males and females that had either school, college or university education. Identify the IVs and DV and design - (3)
* IV: Level of education - school, college or uni edu and gender (m, f) * DV: Political interest in questionnaire * Two-way independent ANOVA
841
An experiment was done to look at whether there is an effect of both gender and the number of hours spent practising a musical instrument on the level of musical ability. A sample of 30 participants (15 men and 15 women) who had never learnt to play a musical instrument before were recruited. Participants were randomly allocated to one of three groups that varied in the number of hours they would spend practising every day for 1 year (0 hours, 1 hours, 2 hours). Men and women were divided equally across groups. All participants had a one-hour lesson each week over the course of the year, after which their level of musical skill was measured on a 10-point scale ranging from 0 (you can’t play for toffee) to 10 (‘Are you Mozart reincarnated?’). Identify IVs and DV and design - (3)
* IV: Gender (m,f) , number of hrs spent practicisng * DV: Level of muscial skill after a year * Two-way independent ANOVA, not t-tests since more than one IV
842
In these outputs is there a effect of gender, education or interaction level TWO WAY ANOVA INDEPENDENT
* Is there an effect of gender overall? No, F(1,54) = 1.63, p = .207 Is there an effect of education level? Yes, F(2,54) = 147.52, p < .001 Is there an interaction effect? Yes, F(2,54) = 4.64, p = .014
843
How to interpret these findings?
* Main effect of aspirin: aspirin reduces heart attacks compared to placebo (1)
* Main effect of carotene: beta carotene reduces heart attacks (2)
* Interaction effect: yes - the effect is bigger when aspirin and beta carotene are taken together (3); the more the lines diverge on the plot, the stronger the interaction
844
WHICH STATEMENT BEST DESCRIBES A COVARIATE? A variable that is not able to be measured directly. A variable that shares some of the variance of another variable in which the researcher is interested. A pair of variables that share exactly the same amount of variance of another variable in which the researcher is interested. A variable that correlates highly with the dependent variable.
A variable that shares some of the variance of another variable in which the researcher is interested.
845
TWO-WAY ANOVA IS BASICALLY THE SAME AS ONE-WAY ANOVA, EXCEPT THAT: The model sum of squares is partitioned into two parts The residual sum of squares represents individual differences in performance The model sum of squares is partitioned into three parts We calculate the model sum of squares by looking at the difference between each group mean and the overall mean
C. The model sum of squares is partitioned into three parts. It is partitioned into the effect of each of the independent variables and the effect of how these variables interact (see Section 13.2.7). D is also true, but we do this for both one-way and two-way ANOVA (see Section 13.2.7).
846
IF WE WERE TO RUN A FOUR-WAY BETWEEN-GROUPS ANOVA, HOW MANY SOURCES OF VARIANCE WOULD THERE BE? 4 16 12 15
15: four main effects, six two-way interactions, four three-way interactions and one four-way interaction (2^4 - 1 = 15; by the same logic a two-way design has 3 sources: A, B and A×B)
847
Which of the following sentences best describes a covariate? A. A variable that shares some of the variance of another variable in which the researcher is interested. B. A variable that correlates highly with the dependent variable C. A variable that is not able to be measured directly D. A pair of variables that share exactly the same amount of variance of another variable in which the researcher is interested
A
848
An experiment was done to look at whether there is an effect of both gender and the number of hours spent practising a musical instrument on the level of musical ability. A sample of 30 participants (15 men and 15 women) who had never learnt to play a musical instrument before were recruited. Participants were randomly allocated to one of three groups that varied in the number of hours they would spend practising every day for 1 year (0 hours, 1 hours, 2 hours). Men and women were divided equally across groups. All participants had a one-hour lesson each week over the course of the year, after which their level of musical skill was measured on a 10-point scale ranging from 0 (you can’t play for toffee) to 10 (‘Are you Mozart reincarnated?’). A. Two-way independent ANOVA B. Two-way repeated ANOVA C. Three way ANOVA = only 2 IVs so no D. T-test
A
849
Which of the designs below would be best suited to ANCOVA? A. Participants were randomly allocated to one of twostress management therapy groups, or a waiting listcontrol group. Their baseline levels of stress weremeasured before treatment, and again after 3months of weekly therapy sessions B. Participants were randomly allocated to one of twostress management therapy groups, or a waiting listcontrol group. Their levels of stress were measuredand compared after 3 months of weekly therapysessions. C. Participants were randomly allocated to one of twostress management therapy groups, or a waiting listcontrol group. The researcher was interested in therelationship between the therapist’s ratings ofimprovement and stress levels over a 3-monthtreatment period. D. Participants were allocated to one of two stressmanagement therapy groups, or a waiting listcontrol group based on their baseline levels ofstress. The researcher was interested ininvestigating whether stress after the therapy wassuccessful partialling out their baseline anxiety
A - baseline levels of stress are used as the covariate. We can use the baseline, pre-treatment measures as a control when looking at the impact the treatment has on the 3-month assessment.
850
A music teacher had noticed that some students went to pieces during exams. He wanted to test whether this performance anxiety was different for people playing different instruments. He took groups of guitarists, drummers and pianists (variable = ‘Instru’) and measured their anxiety (variable = ‘Anxiety’) during the exam. He also noted the type of exam they were performing (in the UK, musical instrument exams are known as ‘grades’ and range from 1 to 8). He wanted to see whether the type of instrument played affected performance anxiety when accounting for the grade of the exam. Which of the following statements best reflects what the effect of ‘Instru’ in the output table below tells us? (Hint: ANCOVA looks at the relationship between an independent and dependent variable, taking into account the effect of a covariate.) A. The type of instrument played in the exam had a significant effect on the level of anxiety experienced, even after the effect of the grade of the exam had been accounted for B. The type of instrument played in the exam had a significant effect on the level of anxiety experienced C. The type of instrument played in the exam did not have a significant effect on the level of anxiety experienced
A
851
A psychologist was interested in the effects of different fear information on children’s beliefs about an animal. Three groups of children were shown a picture of an animal that they had never seen before (a quoll). Then one group was told a negative story (in which the quoll is described as a vicious, disease-ridden bundle of nastiness that eats children’s brains), one group a positive story (in which the quoll is described as a harmless, docile creature who likes nothing more than to be stroked), and a final group weren’t told a story at all. After the story children rated how scared they would be if they met a quoll, on a scale ranging from 1 (not at all scared) to 5 (very scared indeed). To account for the natural anxiousness of each child, a questionnaire measure of trait anxiety was given to the children and used in the analysis. The SPSS output is below. What analysis has been used? (Hint: The analysis is looking at the effects of fear information on children’s beliefs about an animal, taking into account children’s natural fear levels.) A. ANCOVA B. Independent analysis of variance C. Repeated measures analysis of variance
A
852
Imagine we wanted to investigate the effects of three different conflict styles (avoiding, compromising and competing) on relationship satisfaction, but we discover that relationship satisfaction is known to covary with self-esteem. Which of the following questions would be appropriate for this analysis? A. What would the mean relationship satisfaction be for the three conflict style groups, if their levels of self-esteem were held constant? B. What would the mean relationship satisfaction be if levels of self-esteem were held constant? C. What would the mean self-esteem score be for the three groups if their levels of relationship satisfaction were held constant? D. Does relationship satisfaction have a significant effect on the relationship between conflict style and self-esteem?
A
853
A study was conducted to look at whether caffeine improves productivity at work in different conditions. There were two independent variables. The first independent variable was email, which had two levels: ‘email access’ and ‘no email access’. The second independent variable was caffeine, which also had two levels: ‘caffeinated drink’ and ‘decaffeinated drink’. Different participants took part in each condition. Productivity was recorded at the end of the day on a scale of 0 (I may as well have stayed in bed) to 20 (wow! I got enough work done today to last all year). Looking at the group means in the table below, which of the following statements best describes the data? A. A significant interaction effect is likely to be present between caffeine consumption and email access. B. There is likely to be a significant main effect of caffeine. C. The effect of email is relatively unaffected by whether the drink was caffeinated. D. The effect of caffeine is about the same regardless of whether the person had email access.
A = for decaffeinated drinks there is little difference between email and no email, but for caffeinated drinks there is
854
What are the two main reasons for including covariates in ANOVA? A. 1. To reduce within-group error variance 2. Elimination of confounds B. 1. To increase within-group error variance 2. To reduce between-group error variance C. 1. To increase within-group error variance 2. To correct the means for the covariate D. 1. To increase between-group variance 2. To reduce within-group error variance
A
855
A psychologist was interested in the effects of different fear information on children’s beliefs about an animal. Three groups of children were shown a picture of an animal that they had never seen before (a quoll). Then one group was told a negative story (in which the quoll is described as a vicious, disease-ridden bundle of nastiness that eats children’s brains), one group a positive story (in which the quoll is described as a harmless, docile creature who likes nothing more than to be stroked), and a final group weren’t told a story at all. After the story children rated how scared they would be if they met a quoll, on a scale ranging from 1 (not at all scared) to 5 (very scared indeed). To account for the natural anxiousness of each child, a questionnaire measure of trait anxiety was given to the children and used in the analysis. Which of the following statements best reflects what the ‘pairwise comparisons’ tell us? A. Fear beliefs were significantly higher after negative information compared to positive information and no information, and fear beliefs were not significantly different after positive information compared to no information. B. Fear beliefs were significantly lower after positive information compared to negative information and no information; fear beliefs were not significantly different after negative information compared to no information. C. Fear beliefs were significantly higher after negative information compared to positive information; fear beliefs were significantly lower after positive information compared to no information. D. Fear beliefs were all about the same after different types of information.
A
856
An experiment was done to look at the effect of both the number of hours spent practising a musical instrument and gender on the level of musical ability. A sample of 30 participants (15 men and 15 women) who had never learnt to play a musical instrument before were recruited. Participants were randomly allocated to one of three groups that varied in the number of hours they would spend practising every day for 1 year (0 hours, 1 hour, 2 hours). Men and women were divided equally across groups. All participants had a one-hour lesson each week over the course of the year, after which their level of musical skill was measured on a 10-point scale ranging from 0 (you can’t play for toffee) to 10 (‘Are you Mozart reincarnated?’). An ANOVA was conducted on the data from the experiment. Which of the following sentences best describes the pattern of results shown in the graph? A. The graph shows that the relationship between musical skill and time spent practising was different for men and women. B. The graph shows that the relationship between musical skill and time spent practising was the same for men and women. C. The graph indicates that men and women were most musically skilled when they practised for 2 hours per day. D. Women were more musically skilled than men.
A
857
What is the decision tree for choosing one-way repeated measures ANOVA? - (5)
Q: What sort of measurement? A: Continuous
Q: How many predictor variables? A: One
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? A: More than two
Q: Same or different participants for each predictor level? A: Same
858
The assumption of sphericity in within-subject design ANOVA can be likened to
the assumption of homogeneity of variance in between-group ANOVA
859
Sphericity is sometimes denoted as IN REPEATED ANOVA
ε or circularity
860
What does sphericity refer to in repeated ANOVA?
equality of variances of the differences between treatment levels.
861
you need at least ... conditions for sphericity to be an issue in repeated ANOVA
three
862
How is sphericity assessed in this dataset? (USED IN REPEATED ANOVA)
863
How is sphericity calculated? - (2) REPEATED ANOVA

* Calculating the differences between pairs of scores for all treatment levels, e.g., A-B, A-C, B-C
* Calculating the variances of these differences, e.g., the variances of A-B, A-C, B-C
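A minimal Python sketch of this by-hand check, using invented scores for three conditions (not the data from the table):

```python
from statistics import variance
from itertools import combinations

# One list of scores per condition (same participants, same order).
conditions = {
    "A": [5, 7, 9, 4, 8],
    "B": [6, 6, 10, 5, 7],
    "C": [8, 9, 12, 7, 10],
}

# Variance of the pairwise difference scores: A-B, A-C, B-C.
diff_variances = {}
for (name1, s1), (name2, s2) in combinations(conditions.items(), 2):
    diffs = [a - b for a, b in zip(s1, s2)]
    diff_variances[f"{name1}-{name2}"] = variance(diffs)  # sample variance

# Sphericity holds (roughly) when these variances are about equal.
print(diff_variances)
```

In SPSS this check is done for you by Mauchly's test; the sketch just makes the "variance of the differences" idea concrete.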
864
What does the data from the table show in terms of the assumption of sphericity (calculated by hand)? REPEATED ANOVA - (3)

There is some deviation from sphericity because the variance of the differences between conditions A and B (15.7) is greater than the variance of the differences between A and C (10.3) and between B and C (10.3). However, these data have local circularity (or local sphericity) because two of the variances of differences are identical. The deviation from sphericity does not seem too severe (all variances are roughly equal), but we would need to assess whether the deviation is severe enough to warrant action
865
How to assess the assumption of sphericity in SPSS (REPEATED ANOVA)?
via Mauchly's test
866
If Mauchly's test statistic is significant (p < 0.05) then REPEATED ANOVA
the variances of the differences between conditions are significantly different - we must be wary of the F-ratios produced by the computer
867
If Mauchly's test statistisc is non significant (p > 0.05) then it is reasonable to conclude that the REPEATED ANOVA
variances of the differences between conditions are equal and do not significantly differ
868
Significance of Mauchly's test (REPEATED ANOVA) is dependent on
sample size
869
Example of the significance of Mauchly's test being dependent on sample size REPEATED ANOVA - (2)

* In big samples, small deviations from sphericity can be significant
* In small samples, large violations can be non-significant
870
What happens if the data violate the sphericity assumption (REPEATED ANOVA)? - (2)

apply one of several corrections to produce a valid F-ratio, or use multivariate test statistics (MANOVA)
871
What corrections can be applied to produce a valid F-ratio when the data violate sphericity (REPEATED ANOVA)? - (2)

* Greenhouse-Geisser correction (ε̂)
* Huynh-Feldt correction
872
The Greenhouse-Geisser correction varies between REPEATED ANOVA
1/(k - 1) (where k is the number of repeated-measures conditions) and 1
873
The closer that Greenhouse Geisser correction is to 1, the REPEATED ANOVA
more homogeneous the variances of differences, and hence the closer the data are to being spherical.
874
How to calculate the lower-bound estimate of sphericity for the Greenhouse-Geisser correction when there are 5 conditions REPEATED ANOVA? - (2)
The lower limit of ε^ is 1/(k - 1), where k is the number of repeated-measures conditions, so... 1/(5-1) = 1/4 = 0.25
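A tiny sketch of the lower bound described on this card (plain Python, no SPSS involved):

```python
# Lower bound of the sphericity estimate for k repeated-measures conditions:
# epsilon can be no smaller than 1/(k - 1) and no larger than 1
def epsilon_lower_bound(k: int) -> float:
    return 1 / (k - 1)

print(epsilon_lower_bound(5))  # 0.25, as on this card
print(epsilon_lower_bound(4))  # 0.333..., for four conditions
```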
875
Huynh and Feldt (1976) reported that when the
sphericity estimate is greater than 0.75, the Greenhouse-Geisser correction is too conservative
876
Huynh-Feldt correction is less conservative than
Greenhouse-Geisser correction
877
Why is MANOVA used when data violates sphericity IN REPEATED ANOVA?
MANOVA is not dependent upon the assumption of sphericity
878
In repeated-measures ANOVA, the effect of our experiment shows up in the within-participant variance rather than
between group variance
879
In independent ANOVA, the within-group variance is our.... and it is not contaminated by... - (2)
residual variance (SSR) = variance produced by individual differences in performance. SSR is not contaminated by the experimental effect because each condition is carried out by different people
880
In repeated-measures ANOVA, the within-participant variability is made up of
the effect of the experimental manipulation (SSM) and individual differences in performance (random factors outside of our control) - the latter is the error, SSR
881
Similar to independent ANOVA, repeated-measures ANOVA uses the F-ratio to - (2)
compare the size of the variation due to our experimental manipulation to the size of the variation due to random factors. It has the same types of variance as independent ANOVA - a total sum of squares (SST), a model sum of squares (SSM) and a residual sum of squares (SSR)
882
What is the difference between independent ANOVA and repeated-measures ANOVA?
repeated-measures ANOVA the model and residual sums of squares are both part of the within-participant variance.
883
In repeated-measures ANOVA, if the variance due to our manipulations is big relative to the variation due to random factors, we get a ... and conclude - (2)
big value of F and can conclude that the observed results are unlikely to have occurred if there was no effect in the population.
884
To compute the F-ratio we first compute the sums of squares, which are the following REPEATED ANOVA... - (5)
* SST * SSB * SSW * SSM * SSR
885
How is SST calculated in one-way repeated measures ANOVA? REPEATED ANOVA
SST = grand variance × (N - 1)
886
What is the DF of SST? REPEATED ANOVA
N-1
887
The SSW (within-participant) sum of squares is calculated in one-way repeated ANOVA by...
square of the standard deviation of each participant’s scores multiplied by the number of conditions minus 1, summed over all participants.
888
What is the DF of SSW in one-way repeated ANOVA? - (2)
DF = N(n - 1): the number of participants multiplied by the number of conditions minus 1
889
How is SSM calculated in one-way repeated ANOVA? - (2)
square of the difference between the mean of the participants' scores for each condition and the grand mean, multiplied by the number of participants tested, summed over all conditions
890
What is the DF of SSM in one-way repeated ANOVA? - (2)
DF = n - 1, where n is the number of conditions
891
How is SSR calculated in one-way repeated ANOVA?
the difference between the within-participant sum of squares and the sum of squares for the model.
892
What is the DF for SSR in one-way repeated ANOVA?
DF of SSW minus DF of SSM
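The sums of squares described on the cards above can be sketched with numpy; the scores below are invented (5 participants, 3 conditions) and the variable names are mine, not SPSS's:

```python
import numpy as np

# Invented scores: rows = participants, columns = repeated-measures conditions
scores = np.array([
    [8., 7., 1.],
    [9., 5., 2.],
    [6., 2., 3.],
    [5., 3., 1.],
    [8., 4., 5.],
])
N, k = scores.shape  # 5 participants, 3 conditions

# SST: grand variance multiplied by (total number of scores - 1)
sst = np.var(scores, ddof=1) * (scores.size - 1)

# SSW: each participant's variance times (k - 1), summed over participants
ssw = np.sum(np.var(scores, axis=1, ddof=1) * (k - 1))

# SSM: squared difference between each condition mean and the grand mean,
# multiplied by the number of participants, summed over conditions
ssm = np.sum(N * (scores.mean(axis=0) - scores.mean()) ** 2)

# SSR: within-participant variation left over after removing the model
ssr = ssw - ssm

# Mean squares and the F-ratio
df_m, df_r = k - 1, (N - 1) * (k - 1)
f_ratio = (ssm / df_m) / (ssr / df_r)
print(round(ssm, 2), round(ssr, 2), round(f_ratio, 2))
```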
893
How do we calculate the mean squares (MSM and MSR) to calculate the F-ratio in one-way repeated ANOVA? - (2)
MSM = SSM divided by its DF, and MSR = SSR divided by its DF; the F-ratio is then F = MSM / MSR
894
We don't need to use SSB (between-subject variation) to calculate the F-ratio in
one-way repeated ANOVA
895
What does SSB represent in one-way ANOVA?
individual differences between cases
896
Not only does sphericity produce problems for F in repeated-measures ANOVA, it also causes complications for
post-hoc tests
897
When sphericity is violated in one-way repeated ANOVA, what post-hoc test to use and why - (2)
the Bonferroni method seems to be generally the most robust of the univariate techniques, especially in terms of power and control of the Type I error rate.
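A sketch of Bonferroni-corrected pairwise comparisons using scipy's paired t-test; the data are invented, and this shows only the univariate follow-up idea, not the full SPSS procedure:

```python
import numpy as np
from itertools import combinations
from scipy import stats

# Invented scores for 8 participants in three repeated-measures conditions
data = {
    "A": np.array([10., 12, 8, 11, 9, 13, 10, 12]),
    "B": np.array([12., 15, 9, 14, 10, 17, 11, 15]),
    "C": np.array([8., 11, 7, 10, 8, 12, 9, 11]),
}

pairs = list(combinations(data, 2))
alpha = 0.05 / len(pairs)  # Bonferroni: divide alpha by the number of comparisons

for c1, c2 in pairs:
    t, p = stats.ttest_rel(data[c1], data[c2])
    print(f"{c1} vs {c2}: t = {t:.2f}, p = {p:.4f}, "
          f"significant at corrected alpha: {p < alpha}")
```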
898
When sphericity is not violated in one-way repeated ANOVA, then what post-hoc tests to use?
Tukey can be used
899
In either case, whether sphericity is violated or not in one-way repeated ANOVA, a post-hoc test called the - (2)
Games–Howell procedure, which uses a pooled error term, is preferable to Tukey’s test.
900
Due to the complications of sphericity in one-way repeated ANOVA,
the standard post hoc tests used for independent designs are not available for repeated-measures designs
901
Why is a repeated contrast useful in repeated-measures designs, especially one-way repeated measures?
when the levels of the independent variable have a meaningful order, e.g., the DV was measured at successive time points or increasing doses of a drug were administered
902
When should the Sidak correction as post hoc be selected for one-way repeated ANOVA?
when concerned about the loss of power associated with Bonferroni-corrected values.
903
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA what do these SPSS outputs show? - (2)
* Left shows variables representing each level of the IV, which is animal * Right shows descriptive statistics - highest mean time to retch when celebrities eat the stick insect (8.12)
904
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - ONE WAY repeated ANOVA What does this Mauchly's Test of Sphericity show? - (2)
* P-value is 0.047, which is less than 0.05 * Thus, reject the assumption of sphericity that the variances of the differences between levels are equal
905
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) What to do if this Mauchly's Test of Sphericity shows the assumption of sphericity is violated..? - (3) one-way repeated ANOVA
* Since there are 4 conditions, the lower limit of ε^ is 1/(4-1) = 0.333 (lower-bound estimate in table) * SPSS Output 13.2 shows that the calculated value of ε^ is 0.533. * 0.533 is closer to the lower limit of 0.33 than it is to the upper limit of 1 and therefore represents a substantial deviation from sphericity
906
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA What does this main ANOVA table show in terms of sphericity assumed? - (2)
- The value of F = 3.97, which is compared against a critical value for 3 and 21 DF; the p-value is 0.026 - conclude there is a significant difference between the 4 animals in their capacity to induce retching when eaten
907
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA What has changed and what is kept the same in the table? - (2)
* The F-ratios are the same across the rows * The DF are changed, as is the critical value against which the F-statistic is compared
908
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA How is the adjustment made to the DF?
* Adjustment made by multiplying the DF by the estimate of sphericity.
909
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA What do the results show in terms of the Greenhouse-Geisser and Huynh-Feldt corrections..? - (3)
* Observed F-statistic not significant using Greenhouse-Geisser (p > 0.05) * Greenhouse-Geisser is quite conservative and can miss true effects that exist * Thus, Huynh-Feldt showed the F-statistic is still significant, with a p-value of 0.048
910
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA What happens if Greenhouse-Geisser is non-significant (p > 0.05) and Huynh-Feldt is significant in this example? - (2)
* Take the average of the two p-values, e.g., (0.063 + 0.048)/2 = 0.056 * Since this is above 0.05, go with the Greenhouse-Geisser correction and conclude the F-ratio is non-significant
911
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA If the two corrections - Greenhouse-Geisser and Huynh-Feldt - give the same conclusion, then you can choose which one to
report
912
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA It is important to use a valid critical value of F - choosing which p-value to report potentially makes the difference between making a
Type 1 error (False positive) or not
913
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA what does this summary table of repeated contrasts show? - (3) Level 1 vs 2 is stick insect vs kangaroo testicle Level 2 vs 3 is kangaroo testicle vs fish eyeball Level 3 vs 4 is fish eyeball vs witchetty grub
* celebrities took significantly longer to retch after eating the stick insect compared to the kangaroo testicle (Level 1 vs. Level 2) - p-value of 0.002 * Time taken to retch was not significantly different for Level 2 vs 3 and Level 3 vs 4
914
Researcher measures mean time taken for celebrities to retch for each animal (stick insect, kangaroo testicle, fish eye, witchetty grub) - one-way repeated ANOVA If the main effect is not significant in the main ANOVA table for these data, then significant contrasts in the table below should be ... but if the MANOVA was significant then... - (2)
ignored; if the MANOVA was significant, we would be inclined to conclude the main effect of animal was significant and proceed with further tests like contrasts
915
What IV, DV , design and test to use for this research scenario? - (4)
- Repeated measures design - One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4) - One DV (Sales Generated) - One-way repeated ANOVA
916
What does the LSD correction (post-hoc option in SPSS) do?
it does not actually make any adjustment to the p-value/critical value, as a post-hoc correction should
917
What does output show? - (3) - Repeated measures design - One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4) - One DV (Sales Generated) - One-way repeated ANOVA
* sales are increasing across the weeks * Week 1 starts at 427.93 pounds and sales gradually rise by week 4 to 642.28 pounds * looks like the incentives are having an effect and seem to generate higher sales
918
What does this output show in terms of Mauchly's Test of Sphericity? - (2) - Repeated measures design - One IV (Incentive), four conditions (week 1, week 2, week 3, week 4) - One DV (Sales Generated) - One-way repeated ANOVA
* P-value is not significant (p = 0.080) * Assumption of sphericity is satisfied, so the variances of the differences between conditions are assumed equal
919
If Mauchly's test of sphericity is not significant in one-way repeated ANOVA, then which line do we use in the main ANOVA table?
The 'Sphericity Assumed' row
920
If Mauchly's test of sphericity is significant in one-way repeated ANOVA, then which line do we use in the main ANOVA table?
A corrected row - Greenhouse-Geisser (or Huynh-Feldt if the Greenhouse-Geisser estimate is greater than 0.75)
921
What does this main ANOVA table show? - (3) - Repeated measures design - One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4) - One DV (Sales Generated) - One-way repeated ANOVA
* DF for week is 3 and 57 (sphericity assumed, from the week and error rows) * Week: F(3,57) = 26.30, p < 0.001 (SPSS displays 0.000), eta-squared = 0.58 - a large effect * There is an overall effect: sales change across the weeks
922
What do this Sidak correction table and table of means show you in this output? - (6) - Repeated measures design - One IV (Incentive), four conditions (week 1, week 2, week 3, week 4) - One DV (Sales Generated) - One-way repeated ANOVA
* No sig difference between W1 and W2 * Sig difference between W1 and W3 = higher sales in W3 (538.570) compared to W1 (427.933) * Sig difference between W1 and W4 = higher sales in W4 (642.284) compared to W1 (427.933) * No sig difference between W2 and W3 * Sig difference between W2 and W4 = higher sales in W4 (642.284) than W2 (481.388) * Sig difference between W3 and W4 = higher sales in W4 (642.284) than W3 (538.570)
923
What does this output show in terms of repeated contrasts? - (3) - Repeated measures design - One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4) - One DV (Sales Generated) - One-way repeated ANOVA
* Did sales increase from W1 to W2? p = 0.010, significant * Did sales increase from W2 to W3? p = 0.030, significant * Did sales increase from W3 to W4? p = 0.008, significant
924
What happens if post hoc and contrasts are telling a different story? - contrasts says weekly increase e.g. W1 to W2 increase, W2 to W3 increase , W3 to W4 increase but post-hoc W1 to W3 was increased sig, W1 to W4 was sig increase but W2 to W3 was not - (2) - Repeated measures design - One IV (Incentive) , four conditions (week 1, week 2, week 3, week 4) - One DV (Sales Generated) - One-way repeated ANOVA
* Post hoc tests lack power due to the many multiple comparisons * By limiting the comparisons in contrasts we get around the problem
925
Diagram of writing up one-way repeated ANOVA
926
Two-way repeated ANOVA involves
two IVs, both measured on the same participants
927
What does four-way ANOVA mean?
4 different IVs
928
What does 2x3 ANOVA means? - (2)
* one IV with 2 levels * one IV with 3 levels
929
What design, IV, DV and test would you to to investigate the follow scenario? - (4)
* Repeated measures design * Two IVs: alcohol (3 conditions) and sleep (2 conditions) * DV: Reaction Times * Two-way repeated measures ANOVA
930
What does this two-way repeated ANOVA SPSS output show? - (2) * Repeated measures design * Two IVs: alcohol (3 conditions) and sleep (2 conditions) * DV: Reaction Times * Two-way repeated measures ANOVA
* a large number for RT means slower RT * Alcohol seems to have an effect on RT, particularly for 2 pints + no sleep
931
What does this two-way repeated ANOVA SPSS output show for Mauchly's Test of Sphericity? - (2) * Repeated measures design * Two IVs: alcohol (3 conditions) and sleep (2 conditions) * DV: Reaction Times * Two-way repeated measures ANOVA
* Two p-values: alcohol (p = 0.00) and alcohol * sleep [interaction effect] (p = 0.00) --> significant, so the assumption of sphericity is violated and we report the Greenhouse-Geisser values from the main ANOVA table * No p-value for sleep, as it has only 2 conditions and the test of sphericity needs more than 2
932
What does this two-way repeated ANOVA main table show? - (3) * Repeated measures design * Two IVs: alcohol (3 conditions) and sleep (2 conditions) * DV: Reaction Times * Two-way repeated measures ANOVA * Error DF was 38. * Test of Sphericity was sig --> assumption violated
* Main sig effect of alcohol: F(1.16,22.06) = 51.38, p < 0.001, partial eta-squared = 0.73 * Main sig effect of sleep: F(1,19) = 88.61, p < 0.001, partial-eta-squared = 0.82 * Interaction effect: F(1.15,21.91) = 23.36, p < 0.001, partial-eta squared = 0.55
933
What does this two-way repeated ANOVA output show in post hocs? - Sidak correction - (4) * Repeated measures design * Two IVs: alcohol (3 conditions) and sleep (2 conditions) * DV: Reaction Times * Two-way repeated measures ANOVA
* Condition 1 vs condition 2 was significant * Condition 1 vs 3 was significant * Condition 2 vs condition 3 was significant * So all groups differ significantly from each other, from which we interpret that higher doses of alcohol have more impact on RT
934
What does this two-way repeated ANOVA interaction plot show? - (3) * Repeated measures design * Two IVs: alcohol (3 conditions) and sleep (2 conditions) * DV: Reaction Times * Two-way repeated measures ANOVA
* Interaction effect is there = as the lines continue they cross * The most pronounced effect was in alcohol grp 3 (2 pints) * When alcohol grp 3 had a full night's sleep (2), RT is impaired very slightly * When alcohol grp 3 had sleep deprivation (1) in combination with 2 pints, RT is impaired by a lot --> use simple effects analysis as well as two-way independent ANOVA to see if the difference in grp 3 between the blue and green lines is sig
935
What happens when assumptions are violated in repeated-measures ANOVA? - (2)
Can do a non-parametric test called Friedman's ANOVA if there is only one IV; there is no non-parametric counterpart for more than one IV in a repeated design
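A minimal sketch of Friedman's ANOVA via scipy, with invented ratings for 8 participants in three conditions:

```python
from scipy import stats

# Invented ratings from 8 participants under three conditions;
# Friedman's ANOVA is the non-parametric alternative to a
# one-way repeated-measures ANOVA
cond1 = [10, 12, 8, 11, 9, 13, 10, 12]
cond2 = [12, 15, 9, 14, 10, 17, 11, 15]
cond3 = [8, 11, 7, 10, 8, 12, 9, 11]

stat, p = stats.friedmanchisquare(cond1, cond2, cond3)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```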
936
Assumptions of repeated measures ANOVA - (3)
1. Normal distribution 2. Repeated measures design (same participants) 3. Sphericity - Mauchly's test
937
What does a significant Mauchly's test signify in repeated measures? - (2)
A significant result means that corrections need to be made; those corrections are listed in the main ANOVA output table
938
What is the decision tree for two-way repeated ANOVA?
1 DV, continuous, and 2 or more categorical predictors with 2 or more levels, with the same participants in each predictor level
939
What is the decision tree for one-way repeated ANOVA? - (3)
1 DV, continuous; 1 categorical predictor with more than 2 levels; same participants in each predictor level
940
Just like independent measure designs there can be more than one categorical predictor. When all participants take part in all combinations of those predictors, we have a repeated measures factorial design and can use an ANOVA to test for
significant main effects and interactions
941
Example of two-way repeated ANOVA - (3)
The variables are the type of drink (Beer - Wine - Water) and the type of imagery used in the advertisement (positive - negative - neutral) The outcome is how much the participant likes the beverage on a scale from -100 (dislike very much) to 100 (like very much) Participants took part in all conditions
942
Equation of variance
variance = SS / (N - 1), where SS = Σ(x − x̄)² is the sum of squared deviations of the scores from the mean
943
What is a mixed design? - (2)
A mixture of between-subjects and within-subjects factors: several independent variables or predictors have been measured; some have been measured with different entities (pps) whereas others used the same entities (pps)
944
You will need at least two IVs for
a mixed design
945
What is the decision tree for mixed design ANOVA? - (7)
Q: What sort of measurement? A: Continuous Q: How many predictor variables? A: Two or more Q: What type of predictor variable? A: Categorical Q: How many levels of the categorical predictor? A: Not relevant Q: Same or different participants for each predictor level? A: Both This leads us to a factorial mixed ANOVA
946
Example of mixed design scenario for ANOVA - (2)
a mixed ANOVA is often used in studies where you have measured a dependent variable (e.g., "back pain" or "salary") over two or more time points, or where all subjects have undergone two or more conditions (i.e., where "time" or "conditions" is your within-subjects factor), but where you also measure the DV across two or more separate groups of subjects (e.g., groups based on some characteristic, such as "gender" or "educational level", or groups that have undergone different interventions). These groups form your between-subjects factor.
947
An organizational psychologist is hired as a consultant by a person planning to open a coffee house for college students. The coffee house owner wants to know if her customers will drink more coffee depending on the ambience of the coffee house. To test this, the psychologist sets up three similar rooms, each with its own theme (Tropical; Old Library; or New York Café ) then arranges to have thirty students spend an afternoon in each room while being allowed to drink all the coffee they like. (The order in which they sit in the rooms is counterbalanced.) The amount each participant drinks is recorded for each of the three themes. 1. Independent variable(s) 2. Is there more than 1 IV? 3. The levels the independent variable(s) 4. Dependent variable 5. Between (BS) or within-subjects (WS)? 6. What type of design is being used?
1. Theme 2. No 3. Tropical, Old Library, New York Café 4. Amount of coffee consumed 5. Within-subjects 6. 1-way repeated measures
948
A manager at a retail store in the mall wants to increase profit. The manager wants to see if the store’s layout (one main circular path vs. a grid system of paths) influences how much money is spent depending on whether there is a sale. The belief is that when there is a sale customers like a grid layout, while customers prefer a circular layout when there is no sale. Over two days the manager alternates the store layout, and has the same group of customers come each day. Based on random assignment, half of the customers told there is a sale (20 % will be taken off the final purchases), while the other half is told there is no sale. At the end of each day, the manager calculates the profit. 1. Independent variable(s) 2. Is there more than 1 IV? 3. The levels the independent variable(s) 4. Dependent variable 5. Between (BS) or within-subjects (WS)? 6. What type of design is being used?
1. Sale/No Sale, Store's layout 2. Yes 3. Sale-No Sale, Grid-Circular 4. Profit 5. BS (Sale) and WS (Layout) 6. 2-way mixed measures
949
A researcher at a drug treatment center wanted to determine the best combination of treatments that would lead to more substance free days. This researcher believed there were two key factors in helping drug addiction: type of treatment and type of counseling. The researcher was interested in either residential or outpatient treatment programs; and either cognitive-behavioral, psychodynamic, or client-centered counseling approaches. As new clients enrolled at the center they were randomly assigned to one of six experimental groups. After 3 months of treatment, each client’s symptoms were measured. 1. Independent variable(s) 2. Is there more than 1 IV? 3. The levels the independent variable(s) 4. Dependent variable 5. Between (BS) or within-subjects (WS)? 6. What type of design is being used?
1. Type of treatment, type of counseling 2. Yes 3. Residential or outpatient / cognitive-behavioural, psychodynamic or client-centred 4. Substance-free days 5. Between subjects 6. 2-way independent measures ANOVA
950
Assumptions of mixed ANOVA - (3)
1. Normal distribution 2. Independent and repeated factors 3. Homogeneity of variance for the independent factor + sphericity for the repeated factor
951
Assumptions of repeated-measures ANOVA - (3)
1. Normal distribution 2. Repeated measures design (same participants) 3. Sphericity (Mauchly's test)
952
Assumptions of independent ANOVA - (3)
1. Normal distribution 2. Independence of scores 3. Homogeneity of variance (Levene's test)
953
Levene's test tests whether the variances in independent groups are similar; would Levene's test be significant in this case?
Levene’s test would likely be significant as the variance between the two groups are quite different.
954
Sphericity is an assumption of both
repeated and mixed models
955
If the p-value is significant when checking for sphericity then - (3)
If GG < 0.75 then use GG; if GG > 0.75 then use HF. Since GG is less than 0.75 here, report the adjusted F, DF and sig., which is F(1.24, 21.00) = 212.32, p < 0.001
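The rule of thumb on this card, as a one-line helper (a sketch; the 0.75 cut-off is the heuristic stated above):

```python
# Heuristic from the card: with a significant Mauchly's test, use the
# Greenhouse-Geisser correction when its epsilon estimate is below 0.75,
# otherwise use the Huynh-Feldt correction
def choose_correction(gg_epsilon: float) -> str:
    return "Greenhouse-Geisser" if gg_epsilon < 0.75 else "Huynh-Feldt"

print(choose_correction(0.533))  # Greenhouse-Geisser
print(choose_correction(0.90))   # Huynh-Feldt
```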
956
Homogeneity of variance is asking whether the
variances of the groups are similar
957
Sphericity is asking whether the variances of the differences between conditions are
similar
958
The researcher hypothesized that there would be an interaction between dog breed (Collie or German Shepherd) and week of obedience school training (all dogs measured at 1 week and 5 weeks) as they relate to the number of times the dog growls per week. Specifically, it was hypothesized that Collies would show no difference in growls between 1 week and 5 weeks, but German Shepherds would growl less at 5 weeks than at 1 week. 1. Independent variable(s) 2. Is there more than 1 IV? 3. The levels the independent variable(s) 4. Dependent variable 5. Between(BS) or within-subjects (WS)? 6. What type of design is being used?
1. Dog breed and measurement time 2. Yes 3. Collie-German Shepherd / Week 1-Week 5 4. Number of growls 5. Dog breed is between and measurement time is within 6. 2-way mixed ANOVA
959
What does this 2-way mixed ANOVA show? - (3) 1. Independent variable(s) 2. Is there more than 1 IV? 3. The levels the independent variable(s) 4. Dependent variable 5. Between(BS) or within-subjects (WS)? 6. What type of design is being used?
1) Is there an effect overall? = Yes (green) 2) Is there an effect of breed? = Yes (red) 3) Is there an interaction? = Yes (blue)
960
Partitioning of variance in one-way vs two-way independent ANOVA
961
Rules of contrast coding - (5)
Rule 1: Groups coded with positive weights compared to groups coded with negative weights. Rule 2: The sum of weights for a comparison should be zero. Rule 3: For a given contrast, the weights assigned to the group(s) in one chunk of variation should be equal to the number of groups in the opposite chunk of variation. Rule 4: If a group is not involved in a comparison, assign it a weight of zero Rule 5: If a group is singled out in a comparison, then that group should not be used in any subsequent contrasts.
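A small numpy check of these rules, using hypothetical weights for three groups (one control, two experimental); the grouping is my example, not from the deck:

```python
import numpy as np

# Contrast 1: control (weight -2) vs. the two experimental groups (+1 each).
# Rule 3: the control's weight equals the number of groups in the other chunk.
contrast1 = np.array([-2, 1, 1])

# Contrast 2: experimental A vs. experimental B; the control, singled out
# in contrast 1, gets a weight of zero (Rules 4 and 5)
contrast2 = np.array([0, -1, 1])

# Rule 2: the weights in each contrast sum to zero
print(contrast1.sum(), contrast2.sum())  # 0 0

# Orthogonal contrasts: the products of corresponding weights sum to zero
print(int(np.dot(contrast1, contrast2)))  # 0
```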
962
Contrast coding example in SPSS - how to read it
963
When conducting a Repeated-Measures ANOVA, which of the following assumptions is NOT relevant? A. Independent residuals B. Homogeneity of variance C. Sphericity D. They are all relevant
B
964
One advantage of repeated measures designs over independent designs is that we are able to calculate a degree of error for each effect, whereas in an independent design we are able to calculate only one degree of error: True or False?
True
965
An experiment was conducted to see how people with eating disorders differ in their need to exert control in different domains. Participants were classified as not having an eating disorder (control), as having anorexia nervosa (anorexic), or as having bulimia nervosa (bulimic). Each participant underwent an experiment that indicated how much they felt the need to exert control in three domains: eating, friendships and the physical world (this final category was a control domain in which the need to have control over things like gravity or the weather was assessed). So all participants gave three responses in the form of a mean reaction time; a low reaction time meant that the person did feel the need to exert control in that domain. The variables have been labelled as group (control, anorexic, or bulimic) and domain (food, friends, or physical laws). Of the following options, which analysis should be conducted? A. Analysis of covariance B. Two-way repeated measures ANOVA C. Two-way mixed ANOVA D. Three-way independent ANOVA
C Two IVs = Group (Control, Anorexic, Bulimic) and Domain (Food, Friends, Physical Laws) Group is between Each participant underwent all domains, so Domain is within DV = mean reaction time
966
An experiment was done to compare the effect of having a conversation via a hands-free mobile phone, having a conversation with an in-car passenger, and no distraction (baseline) on driving accuracy. Twenty participants from two different age groups (18–25 years and 26–40 years) took part. All participants in both age groups took part in all three conditions of the experiment (in counterbalanced order), and their driving accuracy was measured by a layperson who remained unaware of the experimental hypothesis. How do we interpret the main effect of distraction from the SPSS table (next slide)? - (2)
The assumption of sphericity has been met, indicated by Mauchly’s test (p > .05). There was a significant main effect of distraction (F(2, 36) = 45.95, p < .001). This effect tells us that if we ignore the effect of age, driving accuracy was significantly different in at least two of the distraction groups.
967
Two-way repeated-measures ANOVA compares: A. Several means when there are two independent variables, and the same entities have been used in all conditions B. Two means when there are more than two independent variables, and the same entities have been used in all conditions. C. Several means when there are two independent variables, and the same entities have been used in some of the conditions. D. Several means when there are more than two independent variables, and some have been manipulated using the same entities and others have used different entities.
A
968
When conducting a repeated-measures ANOVA which of the following assumptions is not relevant? A. Homogeneity of variance B. Sphericity C. Independent residuals D. They are all relevant
A
969
The table shows hypothetical data from 3 conditions. For these data, sphericity will hold when (Hint: Sphericity refers to the equality of variances of the differences between treatment levels.) A. The variances of the differences between treatment levels are roughly equal B. The variance of each condition is roughly equal C. The variance of each condition is not equal D. The variances of the differences between treatment levels are not equal
A
970
Imagine we were interested in the effect of supporters singing on the number of goals scored by soccer teams. We took 10 groups of supporters of 10 different soccer teams and asked them to attend three home games, one at which they were instructed to sing in support of their team (e.g., ‘Come on, you Reds!’), one at which they were instructed to sing negative songs towards the opposition (e.g., ‘You’re getting sacked in the morning!’) and one at which they were instructed to sit quietly. The order of chanting was counterbalanced across groups. Looking at the output below, which of the following sentences is correct? A. The results showed that the number of goals scored was significantly affected by the type of singing from the supporters, F(2, 18) = 11.24, p = .001. B. The results showed that the number of goals scored was significantly affected by the type of singing from the supporters, F(1.58, 14.19) = 11.24, p = .002. C. The results showed that the number of goals scored was significantly affected by the type of singing from the supporters, F(2, 12.4) = 11.24, p = .001. D. The results showed that the number of goals scored was significantly higher when supporters sang positive songs towards their team than when they sat quietly, F(2, 18) = 11.24, p = .001.
A = Mauchly’s test was non-significant, so we can report the result in the row labelled ‘sphericity assumed’
971
Imagine we were interested in the effect of supporters singing on the number of goals scored by soccer teams. We took 10 groups of supporters of 10 different soccer teams and asked them to attend three home games, one at which they were instructed to sing in support of their team (e.g., ‘Come on, you Reds!’), one at which they were instructed to sing negative songs towards the opposition (e.g., ‘You’re getting sacked in the morning!’) and one at which they were instructed to sit quietly. The order of chanting was counterbalanced across groups. An ANOVA with a simple contrasts using the last category as a reference was conducted. Looking at the output tables below, which of the following sentences regarding the contrasts is correct? a.The first contrast revealed that soccer teams scored significantly more goals when their supporters sang positive songs compared to when they did not sing. The second contrast revealed that soccer teams scored significantly fewer goals when their supporters sang negative songs compared to when they did not sing. b. The first contrast revealed that soccer teams scored significantly fewer goals when their supporters did not sing compared to when they sang negative songs. The second contrast revealed that soccer teams scored a similar amount of goals when their supporters sang positive songs compared to when they did not sing. c. The first contrast revealed that soccer teams scored significantly more goals when their supporters sang positive songs compared to when they did not sing. The second contrast revealed that soccer teams scored significantly fewer goals when their supporters sang negative songs compared to when they sang positive songs. d. The first contrast revealed that soccer teams scored significantly more goals when their supporters sang positive songs compared to when they did not sing. 
The second contrast revealed that soccer teams did not significantly differ in the number of goals scored when their supporters sang negative songs compared to when they did not sing.
a = We can see from the means in the Descriptive Statistics table that positive singing resulted in the highest number of goals scored and negative singing resulted in the lowest number of goals scored
972
An experiment was done to compare the effect of having a conversation via a hands-free mobile phone, having a conversation with an in-car passenger, and no distraction (baseline) on driving accuracy. Twenty participants from two different age groups (18–25 years and 26–40 years) took part. All participants in both age groups took part in all three conditions of the experiment (in counterbalanced order), and their driving accuracy was measured by a layperson who remained unaware of the experimental hypothesis. Which of the following sentences is the correct interpretation of the main effect of distraction? AThere was a significant main effect of distraction, F(2, 36) = 45.95, p < .001. This effect tells us that if we ignore the effect of age, driving accuracy was significantly different in at least two of the distraction groups. B. There was no significant main effect of distraction, F(2, 36) = 45.95, p = .719. This effect tells us that if we ignore the effect of age, driving accuracy was the same for no distraction, hands-free conversation and in-car passenger conversation. C. There was a significant main effect of distraction, F(2, 36) = 45.95, p < .001. This effect tells us that driving accuracy was different for no distraction, hands-free conversation and in-car passenger conversation in the two age groups. D. There was no significant main effect of distraction, F(2, 36) = 45.95, p > .05. This effect tells us that none of the distraction groups significantly distracted participants across both age groups.
A = We can read the results in the row labelled ‘sphericity assumed’, as we can see from the output of Mauchly’s test that the assumption of sphericity has been met, p > .05. However, we would need to do some follow-up tests to investigate exactly where the differences between groups lie
973
Field and Lawson (2003) reported the effects of giving children aged 7–9 years positive, negative or no information about novel animals (Australian marsupials). This variable was called ‘Infotype’. The gender of the child was also examined. The outcome was the time taken for the children to put their hand in a box in which they believed either the positive, negative, or no information animal was housed (positive values = longer than average approach times, negative values = shorter than average approach times). Based on the output below, what could you conclude? A. Approach times were significantly different for the boxes containing the different animals, but the pattern of results was unaffected by gender. B. Approach times were significantly different for the boxes containing the different animals, and the pattern of results was affected by gender. C. Approach times were not significantly different for the boxes containing the different animals, but the pattern of results was affected by gender. D.Approach times were not significantly different for the boxes containing the different animals, but the pattern of results was unaffected by gender.
A
974
What leads to chi-squared test?
Q: What sort of measurement? A: Categorical (in this case counts or frequencies)
Q: How many predictor variables? A: One
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? A: Not relevant
Q: Same or different participants for each predictor level? A: Different
This leads us to a chi-square test for independence of groups
975
In a chi-square test, each participant is allocated to one and only one category, such as - (3)
pass or fail, pregnant or not pregnant, win, draw or lose
976
Since each participant is allocated to one category in chi-squared test each individual therefore
contributes to the frequency or count with which a category occurs
977
Table scenario in which cats can be trained to dance more effectively with food or affection as reward - chi-squared test
978
Table scenario in which cats can be trained to dance more effectively with food or affection as reward - chi-squared test what are the four categories? - (4)
* could they dance - yes * could they dance - no * food as reward * affection as reward
979
Table scenario in which cats can be trained to dance more effectively with food or affection as reward - chi-squared test highlight the frequencies for four categories
980
Table scenario in which cats can be trained to dance more effectively with food or affection as reward - chi-squared test what do the rows give?
Row totals give frequencies of dancing and non-dancing cats
981
Table scenario in which cats can be trained to dance more effectively with food or affection as reward - chi-squared test what do the columns give? - (2)
The column totals give frequencies of food and affection as reward These are the numbers in each group
982
IV and DV in chi-squared tests - (2)
One categorical DV (because of frequencies) with one categorical IV with different participants at each predictor level
983
In chi-squared categorical outcomes, the null hypothesis is set
up on the basis of expected frequencies, for all four variable combinations, based on the idea that the variable of interest has no effect on the frequencies
984
What does the chi-square test assess?
whether there is a relationship between two categorical variables.
985
In chi-square since we are using categorical variables we can not use
mean or any similar statistic hence cannot use any parametric tests
986
What does chi-square compare?
observed frequencies from the data with frequencies which would be expected if there was no relationship between the two variables.
987
In chi-square test when measuring categorical variables we are interested in
frequencies (number of items that fall into combination of categories)
988
Example of scenario using chi-square
We have a list of movie genres; this is our first variable. Our second variable is whether or not the patrons of those genres bought snacks at the theatre. Our idea (or, in statistical terms, our null hypothesis) is that the type of movie and whether or not people bought snacks are unrelated. The owner of the movie theatre wants to estimate how many snacks to buy. If movie type and snack purchases are unrelated, estimating will be simpler than if the movie types impact snack sales.
989
What is assumptions of chi-square test? - (3)
Data values that are a simple random sample from the population of interest.
Two categorical or nominal variables. Don't use the independence test with continuous variables that define the category combinations. (However, the counts for the combinations of the two categorical variables will be continuous.)
For each combination of the levels of the two variables, we need at least five expected values. When we have fewer than five for any one combination, the test results are not reliable.
990
We have a list of movie genres; this is our first variable. Our second variable is whether or not the patrons of those genres bought snacks at the theatre. Our idea (or, in statistical terms, our null hypothesis) is that the type of movie and whether or not people bought snacks are unrelated. The owner of the movie theatre wants to estimate how many snacks to buy. If movie type and snack purchases are unrelated, estimating will be simpler than if the movie types impact snack sales. Is the Chi-square test of independence an appropriate method to evaluate the relationship between movie type and snack purchases? - (3)
We have a simple random sample of 600 people who saw a movie at our theatre. We meet this requirement. Our variables are the movie type and whether or not snacks were purchased. Both variables are categorical. The last requirement is at least five expected values for each combination of the two variables. To confirm this, we need to know the total counts for each type of movie and the total counts for whether snacks were bought or not. = check later
991
We have a list of movie genres; this is our first variable. Our second variable is whether or not the patrons of those genres bought snacks at the theatre. Our idea (or, in statistical terms, our null hypothesis) is that the type of movie and whether or not people bought snacks are unrelated. The owner of the movie theatre wants to estimate how many snacks to buy. If movie type and snack purchases are unrelated, estimating will be simpler than if the movie types impact snack sales. Diagram of contingency table in Chi-square and calculating row totals, column totals and grand total - (7)
Row totals: 50 + 125 + 90 + 45 = 310 and 75 + 175 + 30 + 10 = 290
Column totals: 50 + 75 = 125; 125 + 175 = 300; 90 + 30 = 120; 45 + 10 = 55
Grand total: 310 + 290 = 600
992
How to calculate chi-square test statistic? - (4)
1. Calculate the difference between the observed and expected count for each Movie-Snacks combination.
2. Square that difference.
3. Divide by the expected value for the combination.
4. Add up these values across all Movie-Snacks combinations. This gives us our test statistic.
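The four steps can be sketched in Python using the observed movie/snacks counts from this worked example (a sketch, not SPSS output; the small discrepancy with the quoted 65.03 comes from rounding in the hand calculation):

```python
# Chi-square test statistic computed by hand for the movie/snacks table
# (rows: snacks bought vs not bought; columns: the four movie genres).

observed = [
    [50, 125, 90, 45],   # bought snacks
    [75, 175, 30, 10],   # did not buy snacks
]

row_totals = [sum(row) for row in observed]        # [310, 290]
col_totals = [sum(col) for col in zip(*observed)]  # [125, 300, 120, 55]
grand_total = sum(row_totals)                      # 600

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # expected count = row total * column total / grand total
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

print(round(chi_square, 2))  # 65.01 (quoted as 65.03 in the example due to rounding)
```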
993
We have a list of movie genres; this is our first variable. Our second variable is whether or not the patrons of those genres bought snacks at the theatre. Our idea (or, in statistical terms, our null hypothesis) is that the type of movie and whether or not people bought snacks are unrelated. The owner of the movie theatre wants to estimate how many snacks to buy. If movie type and snack purchases are unrelated, estimating will be simpler than if the movie types impact snack sales. Diagram of contingency table in Chi-square for calculating expected counts
e.g., for Action and snacks it would be row total (310) * column total (125) divided by grand total (600) = 64.58, roughly 65
994
Example of calculating chi-square from table
For this it would be 65.03
995
Does the area of psychology that a person prefers depend on whether they would select a cat or a dog as a pet? - chi-square test of independence Chi-square example we need to check the assumptions below - (2)
Independence: each item or entity contributes to only one cell of the contingency table.
The expected frequencies should be greater than 5. In larger contingency tables up to 20% of expected frequencies can be below 5, but there is a loss of statistical power. Even in larger contingency tables no expected frequencies should be below 1.
996
How to understand your test statistic from chi-squared? - (5) if you have test statistic of 65.03
1. Set your significance level = .05
2. Calculate the test statistic -> 65.03
3. Find your critical value from the chi-squared distribution table based on df & significance level
4. Degrees of freedom: df = (r - 1) x (c - 1). For the movie example this is df = (4 - 1) x (2 - 1) = 3 -> critical value 7.815
5. Compare the test statistic with the critical value: 65.03 > 7.82, so reject the idea that movie type and snack purchases are independent
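A minimal sketch of the decision step, hardcoding the tabulated critical value quoted in this card (7.815 for df = 3 at α = .05):

```python
# Compare the chi-square test statistic against the tabulated critical value.

rows, cols = 4, 2             # four movie genres x snacks yes/no
df = (rows - 1) * (cols - 1)  # degrees of freedom = 3
test_statistic = 65.03        # from the worked movie/snacks example
critical_value = 7.815        # chi-square table value for df = 3, alpha = .05

reject_null = test_statistic > critical_value
print(df, reject_null)
```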
997
Example of research question and hypothesis and sig level of chi-square test of independence- (4)
Research question: Does the area of psychology that a person prefers depend on whether they would select a cat or a dog as a pet?
Hypotheses:
H0: The area of interest in psychology and type of pet preferred are independent of each other.
H1: The area of interest in psychology and type of pet preferred are not independent of each other. That is, the primary area of interest in psychology depends on whether you prefer a cat or a dog.
Significance level: α = .05
998
Does the area of psychology that a person prefers depend on whether they would select a cat or a dog as a pet? - chi-square test of independence Chi-square example we need to check the assumptions of The expected frequencies should be greater than 5. What does it show? - (4)
Here we see that all the expected counts in the cat group and one expected count in the dog group are below 5. We also have one in the cat group that is below 1. So, SPSS has flagged that we have 60% of the expected counts falling below 5. So the assumption that expected frequencies are greater than 5 is not met.
999
If the chi-square assumption that the expected frequencies should be greater than 5 is not satisfied, then - chi-square test of independence
We should use Fisher’s Exact Test which can correct for this.
1000
Does the area of psychology that a person prefers depend on whether they would select a cat or a dog as a pet? - - chi-square test of independence If assumptions were met (expected frequencies greater than 5) then.. report - (2)
A chi-square independence test was performed to examine whether there was a relationship between their area of studies in psychology and their preference for cats or dogs. The relationship between these variables was not significant, χ²(4, N = 46) = 1.46, p = .834, so we fail to reject H0.
1001
Are directional hypotheses possible with chi-square? A.Yes, but only when you have a 2 × 2 design. B.Yes, but only when there are 12 or more degrees of freedom. C.Directional hypotheses are never possible with the chi-squared test. D.Yes, but only when your sample is greater than 200.
A = directional hypotheses are only possible when you have 2 variables to compare (a 2 × 2 design); with more complex designs you can't form directional hypotheses in chi-square and have to use loglinear or goodness-of-fit tests
1002
Example situations you can do chi-square directional and not possible - (5)
If we are just comparing pet preferences between males and females, we can make a directional hypothesis (2 x 2 – male/female, cats/dogs). Males prefer cats or females prefer dogs. However, when we start adding variables to the design it gets complicated. If we wanted to compare drink preferences at different times of the day for students/lecturers, we couldn’t form a directional hypothesis. This is because we have 3 main effects and several interactions to consider. We need to use loglinear analyses to do this.
1003
Loglinear analysis is a .... of chi-square
extension
1004
Chi-square only analyses two variables at a time, whilst log-linear models
can determine complex interactions in multidimensional contingency tables with more than two categorical variables.
1005
Loglinear is appropriate when
there’s no clear distinction between response and explanatory variables
1006
think of
Think of chi-square like t-tests (2 groups) and log-linear like ANOVA (more than 2 groups).
1007
Example of RQ, hypothesis and sig level of loglinear - (3)
Research question: Is the new treatment associated with improvements in health in cats and dogs?
Hypotheses:
H0: Treatment, type of animal and improvements are independent of each other.
H1: Treatment, type of animal and improvements are associated with each other.
Significance level: α = .05
1008
Assumptions of log linear - (2)
Independence
Expected counts > 5
1009
Research question: Is the new treatment associated with improvements in health in cats and dogs? Checking assumption of expected counts - (3):
Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no), all of which are categorical. We look and see that all of the expected counts are above 5. So we have met the assumptions of independence and expected counts.
1010
Research question: Is the new treatment associated with improvements in health in cats and dogs? Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical. In loglinear model selection it begins with - (2)
all terms present (all main effects and all possible interactions).
Main effects: Animal, Treatment and Improvement
Interactions: Animal * Treatment, Animal * Improvement, Treatment * Improvement and Treatment * Animal * Improvement
1011
Research question: Is the new treatment associated with improvements in health in cats and dogs? Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical. In loglinear model selection after including all main effects and interactions then it - (4)
Removes a term and compares the new model with the one in which the term was present.
Starts with the highest-order interaction (including the maximum number of variables/categories).
Uses the likelihood ratio to 'compare' models, as below:
If the new model is no worse than the old, the term is removed and the next highest-order interactions are examined, and so on.
1012
Model selection of loglinear - what does it show? Research question: Is the new treatment associated with improvements in health in cats and dogs? Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical. - (3)
We can see that the model selection worked in a way that it first tried to remove the 3-way interaction. However, we can see here that it significantly affected the fit of the model, so it was left in. Since removing the highest-order interaction made a significant difference to the fit of the model, we get a final model that is the saturated model (it contains all main effects and interactions).
1013
Loglinear SPSS K way and Higher order effects what does it show? Research question: Is the new treatment associated with improvements in health in cats and dogs? Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical. - (2)
We are using the likelihood ratio here because that's how we compare the models to find the best fit. We see that all main effects and interactions are significantly contributing to explaining the variance in the data.
1014
loglinear what does K represent and what does K = 1,2 and 3 represent? - (4)
K represents the level of the terms. For example, K=1 would be the main effects, K=2 would be our 2-way interactions and K=3 is our 3-way interaction.
1015
Loglinear SPSS - what does parameter estimates show? Research question: Is the new treatment associated with improvements in health in cats and dogs? Here we have 3 things we are comparing: animal (cat/dog), treatment (yes/no) and improvement (yes/no) all of which are categorical. - (3)
There is a significant three-way interaction between animal, treatment and improvement, as well as two significant two-way interactions between animal and improvement and between treatment and improvement (p < .001)
a significant 3-way interaction between animal, treatment and improvement, as well as two significant 2-way interactions between animal/improvement and treatment/improvement.
Like our post-hoc tests, this is telling us where the significant differences are.
1016
Loglinear after seeing statistical tests we go to raw data showing that...
Based on the raw data, there seems to be an indication that the cats responded better to treatment than the dogs. This should be followed up by chi-square tests run separately for cats and dogs to determine whether the association between treatment and improvement is present in both.
1017
When conducting a loglinear analysis, if our model is a good fit of the data then the goodness-of-fit statistic for the final model should be: A. Significant (p should be smaller than .05) B. Non-significant (p should be bigger than .05) C. Less than 5 but greater than 1 D. Greater than 5
B
1018
The goodness of fit tests in log linear tests
the hypothesis that the frequencies predicted by the model (expected frequencies) are significantly different from the actual frequencies in the data (observed)
1019
A significant goodness of fit result means
our model was significantly different from our data (i.e., the model is a bad fit to the data).
1020
A recent story in the media has claimed that women who eat breakfast every day are more likely to have boy babies than girl babies. Imagine you conducted a study to investigate this in women from two different age groups (18–30 and 31–43 years). Looking at the output tables below, which of the following sentences best describes the results? = chi-square A. Women who ate breakfast were significantly more likely to give birth to baby boys than girls. B. There was a significant two-way interaction between eating breakfast and age group of the mother. C. Whether or not a woman eats breakfast significantly affects the gender of her baby at any age. D. The model is a poor fit of the data.
C
1021
Chi square and log linear are both
non-parametric methods
1022
Non-parametric tests used when
When data violate the assumptions of parametric tests (e.g., normality of distribution) we can sometimes find a non-parametric equivalent
1023
Non-parametric tests work on the principle of
randomization or ranking the data for each group
1024
Ranking the data in non-parametric tests gets rid of
outliers and skew
1025
How does ranking work in non-parametric? - (2)
Add up the ranks for the two groups and take the lowest of these sums to be our test statistic.
The analysis is carried out on the ranks rather than the actual data.
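The rank-then-sum idea can be sketched in Python with made-up scores (tied scores get the average of the ranks they span, as described in the cards below):

```python
# Rank scores across both groups, averaging ranks for ties, then sum per group.

def rank_with_ties(scores):
    """Return 1-based ranks; tied scores share the average of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of a run of tied scores
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j + 2) / 2  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

scores = [12, 15, 15, 9, 20, 7, 15, 11]           # made-up illustrative data
groups = ['A', 'A', 'B', 'A', 'B', 'A', 'B', 'B']
ranks = rank_with_ties(scores)

rank_sums = {}
for g, r in zip(groups, ranks):
    rank_sums[g] = rank_sums.get(g, 0.0) + r

print(rank_sums)  # the lower of the two sums is the rank-sum test statistic
```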
1026
Non-parametric equivalent of independent/unrelated t-tests
Mann-Whitney or Wilcoxon rank-sum test
1027
Non-parametric equivalent of repeated t-test
Wilcoxon signed-rank test
1028
Non-parametric equivalent of : One-way independent (between-subjects) ANOVA
Kruskal-Wallis or (for trends) Jonckheere-Terpstra
1029
Non-parametric equivalent of one-way repeated ANOVA
Friedmanʼs ANOVA
1030
Non-parametric equivalent of Multi-way between or within-subjects ANOVA
Loglinear analysis (categorical outcome, with participants as a factor)
1031
Non-parametric equivalent of correlation
Spearman’s Rho or Kendall’s Tau
1032
Mann-Whitney/Wilcoxon rank-sum Test - Compares
two independent groups of scores
1033
Wilcoxon signed rank Test - Compare
two dependent groups of scores
1034
Kruskal-Wallis Test - Compares
> 2 independent groups of scores
1035
Friedman’s Test - Compares
> 2 dependent groups of scores
1036
Spearman’s Rho & Kendall’s Tau - Measures the extent to which
two continuous variables are related (pattern of responses across variables)
1037
Logic behind Wilcoxon's rank sum test, what does SPSS do? - (3)
Step 1: Get some non-normally distributed data
Step 2: Rank it (regardless of group)
Step 3: Significance testing - does one of the groups have more of the higher-ranking scores than the other?
1038
What is DF of chi-square?
(r-1)(c-1)
1039
The likelihood ratio in loglinear models is preferred for
small sample sizes
1040
DF of likelihood ratio in loglinear
df = (r-1)(c-1)
1041
Decision tree of Mann Whitney - (4)
1 DV = Ordinal (e.g., high school, bachelors - order is meaningful) or continuous
1 IV = Categorical with 2 levels
Different participants in each predictor level
Does not meet assumptions of parametric tests
1042
Wilcoxon rank sum and Man Whitney U is
the same procedure; both are used to compare two independent groups and assess whether the samples come from the same distribution
1043
For Mann-Whitney U/Wilcoxon Rank Sum, when comparing 2 independent conditions, the two steps are - (2)
Rank all the data on the basis of the scores irrespective of group
Compute the sum of ranks of each group
1044
For wilcoxon rank sum, the statistic Ws is
the lower of the two sums of ranks
1045
For Mann-Whitney, the statistic U uses the
sum of ranks for group 1, R1, as follows
1046
Example of table where comparing 2 independent conditions of Wilcoxon rank sum or Mann Whitney U test
Here we have data for two groups; one taking alcohol, the other ecstasy. The scores are for a measure of depression. Scores were obtained on two days; Sunday and Wednesday. The drugs were administered on Saturday.
1047
Example of table where comparing 2 independent conditions of Wilcoxon rank sum or Mann Whitney U test Here we have data for two groups; one taking alcohol, the other ecstasy. The scores for a measure of depression. Scores were obtained on two days; Sunday and Wednesday. The drugs were administered on Saturday. Two steps for both statistics: Rank all the data on the the basis of the scores irrespective of the group compute the sum of ranks of each group - (5)
The graphic here shows how we can list the scores in order and as a result assign each score a rank. When scores tie, we give them the average of the ranks. If we ensure we keep track of the group the scores came from, we can relatively easily add the ranks up for each group. Note that if there was little difference between the groups the sums of their ranks would be similar, as they are for the data shown here for Sunday. However, the sums of ranks differ considerably for the data obtained on Wednesday.
1048
For Wilcoxon sum of ranks = comparing 2 independent groups with the Ws statistic and group sizes n1 and n2, the mean of Ws is given:
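The formula itself was an image in the original deck; the standard expression, with n1 the size of the group whose ranks were summed and n2 the other group size, is:

```latex
\bar{W}_s = \frac{n_1 (n_1 + n_2 + 1)}{2}
```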
1049
For Wilcoxon sum of ranks = comparing 2 independent groups the W s statistic, the standard error of Ws is given
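The formula was an image in the original deck; the standard expression is:

```latex
SE_{\bar{W}_s} = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}
```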
1050
For Wilcoxon sum of ranks = comparing 2 independent groups the z score of Ws can be calculated
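The formula was an image in the original deck; the standard expression, combining the mean and standard error from the two cards above, is:

```latex
z = \frac{W_s - \bar{W}_s}{SE_{\bar{W}_s}}
```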
1051
For Mann-Whitney, the statistic U uses the sum of ranks for group 1, R1, as follows
1052
For Mann-Whitney, the statistic U uses the sum of ranks for group 1, R1, as follows. Specify the equation - (2)
The first term, involving n1 and n2, actually computes the maximum possible sum of ranks for group 1. U is zero when all those in group 1 have scores that exceed the scores of those in group 2.
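The equation itself was an image in the original deck; the standard Mann-Whitney formula is:

```latex
U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1
```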
1053
In Mann Whitney U there is a standardised test statistic which is z score that can allow you to compute
effect size, so r = z / square root of N (number of participants)
1054
What is decision tree of Wilcoxon signed rank test? - (4)
1 IV categorical with 2 levels
Same participants in each predictor level
1 DV - ordinal or continuous
Does not meet assumptions of parametric tests
1055
Steps of Wilcoxon signed rank test - (4)
1. Compute the difference between scores for the two conditions 2. Note the sign of the difference (positive or negative) 3. Rank the differences ignoring the sign and also exclude any zero differences from the ranking 4. Sum the ranks for positive and negative ranks
1056
Example of Wilcoxon signed rank test carrying out steps - (9)
The table shown here has the Depression Scores taken on Sunday and Wednesday for those taking ecstasy on Saturday. Data for Sunday are in the first column and Wednesday in the second column. The third column shows the difference between scores obtained on Sunday and Wednesday. Note some could be negative, some positive. In this example, however, the difference is always positive apart from two values where the difference is zero. The fourth column notes the sign of the difference, or notes it is going to be excluded because the difference was zero. The fifth column ranks the differences in terms of their size, but not sign. The sixth and seventh columns list the ranks that were for positive and negative differences, respectively. It is these two columns that are summed to get the relevant statistics, called T+ and T-. Because T+ and T- are not independent, we take only the T+ value.
1057
For Wilcoxon signed rank test with group size n, the mean of T is given:
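The formula was an image in the original deck; the standard expression for a group of size n is:

```latex
\bar{T} = \frac{n(n + 1)}{4}
```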
1058
For Wilcoxon signed rank test the standard error of T is given:
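The formula was an image in the original deck; the standard expression is:

```latex
SE_{\bar{T}} = \sqrt{\frac{n(n + 1)(2n + 1)}{24}}
```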
1059
For Wilcoxon signed rank test compute z score of T by
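The formula was an image in the original deck; the standard expression, using the mean and standard error from the two cards above, is:

```latex
z = \frac{T - \bar{T}}{SE_{\bar{T}}}
```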
1060
Kruskal Wallis decision tree like one-way independent ANOVA - (4)
1 DV of continuous or ordinal
1 IV categorical predictor with more than 2 levels
Different participants in each predictor level
Does not meet assumptions of parametric tests
1061
Kruskal Wallis steps - (2)
Rank all the data on the basis of the scores irrespective of group
Compute the sum of ranks of each group, Ri, where i is the group number
1062
For Kruskal-Wallis, the statistic H is as follows
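The formula was an image in the original deck; the standard expression, with N the total sample size, k the number of groups and ni the size of group i, is:

```latex
H = \frac{12}{N(N + 1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N + 1)
```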
1063
What is decision tree of Friedman test? - (4)
1 DV continuous or ordinal
1 IV categorical predictor with more than 2 levels
Same participants in each predictor level
Does not meet assumptions of parametric tests
1064
What is steps of Friedman test? - (2)
Rank the scores for each individual - that means you will have ranks varying from 1 to the number of conditions the participant took part in
Compute the sum of ranks, Ri, for each condition
1065
For Friedman, the statistic F is as follows
k = number of conditions, N = number of participants
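The formula was an image in the original deck; the standard Friedman statistic, with N participants and k conditions, is:

```latex
F_r = \frac{12}{N k (k + 1)} \sum_{i=1}^{k} R_i^2 - 3N(k + 1)
```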
1066
Example when using chi-square test - (3)
- In this example, they wanted to look at whether attendance at lectures had an impact on exam performance - whether students passed or failed
- Attendance was coded as 1 if participants generally attended lectures, barring illness, and 2 if they did not attend
- Exam was scored as 1 = Pass and 2 = Fail
1067
- In this example, they wanted to look at whether attendance at lectures had an impact on their exam performance on whether they passed or failed - chi square What does it show? - (4)
- Attendance, Attended Lectures, Count = this is people who attended lectures; the number who passed was 84 and the number who failed was 29
- % Within attendance gives the same info, so 74.3% passed and 25.7% failed when attending lectures
- Going to didn't attend lectures: 22 people passed and 35 failed, with the percentages below
- Percentages are easier to use when writing up
1068
- In this example, they wanted to look at whether attendance at lectures had an impact on their exam performance on whether they passed or failed - chi square What does it show? - (2)
- At the top row is the Pearson chi-square statistic, which was 20.617, with df = 1 and p-value 0.000
- 0 cells have an expected count less than 5 -> met the assumption of the chi-square test that expected counts are greater than 5
1069
- DF is always ... in two-by-two chi-square
1
1070
If SPSS output shows below in chi-square that
0 cells have an expected count less than 5 -> met the assumption of the chi-square test
1071
- In this example, they wanted to look at whether attendance at lectures had an impact on their exam performance on whether they passed or failed - chi square What does this effect size show? - (2)
- χ²(1) = 20.62, p < 0.001
- Cramer's V = 0.35, indicating a medium effect size
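Cramer's V can be recomputed from the reported chi-square statistic and the cell counts given elsewhere in this example (a sketch; 170 is the total number of students):

```python
import math

# Cramer's V = sqrt(chi2 / (N * min(r - 1, c - 1))) for an r x c table.
chi_square = 20.617
n = 84 + 29 + 22 + 35        # total participants = 170
min_dim = min(2 - 1, 2 - 1)  # 2x2 table, so min(r - 1, c - 1) = 1

cramers_v = math.sqrt(chi_square / (n * min_dim))
print(round(cramers_v, 2))  # 0.35, the medium effect size reported above
```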
1072
Effect size guideline of r correlation coefficient - (3)
Small effect = 0.1 Medium effect = 0.3 Large = 0.5 and above
1073
Cramer's V can be interpreted similar to
a correlation coefficient
1074
In chi-square we can calculate odds
ratio
1075
Example of calculating odds ratio for chi-square - (3)
Odds of passing/failing for students who attended lectures = no. of students who attended and passed (84) / no. of students who attended and failed (29) = 2.897
Odds of passing/failing for students who did not attend = no. of students who did not attend and passed (22) / no. of students who did not attend and failed (35) = 0.629
Odds ratio = odds for attended / odds for not attended = 2.897 / 0.629 = 4.606, meaning an individual who attended lectures was about 4.6 times more likely to pass the exam
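The same arithmetic can be sketched in Python (the exact value is 4.61; a figure of 4.606 results from rounding the intermediate odds first):

```python
# Odds ratio for passing the exam: lecture attenders vs non-attenders.

attended_pass, attended_fail = 84, 29
absent_pass, absent_fail = 22, 35

odds_attended = attended_pass / attended_fail  # ~2.897
odds_absent = absent_pass / absent_fail        # ~0.629
odds_ratio = odds_attended / odds_absent

print(round(odds_ratio, 2))  # 4.61
```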
1076
Example research scenario of Mann Whitney - (4)
- Independent sample design - One IV, two conditions = existing vs new medication - One DV (symptoms) but this time on ordinal (scale from 1 to 5) and got combination of non-normally distributed data and small sample size (very problematic for t-tests) - Mann Whitney U Test
1077
Example of using Mann Whitney U = skew
1078
What does this Mann Whitney U show? - Independent sample design - One IV, two conditions = existing vs new medication - One DV (symptoms) but this time on ordinal (scale from 1 to 5) and got combination of non-normally distributed data and small sample size (very problematic for t-tests) - Mann Whitney U Test
- This box summarises the p-value ( p = 0.026) and tells you whether to accept or reject the null hypothesis.
1079
What does this output show of Mann Whitney U? - (3)
- The Mann-Whitney U test statistic is 166.000, which we report; people also report the standardised test statistic 2.292, which is a z score - handy to report because if it is above +/-1.96 we know the p-value from the test is significant
- The exact significance is p = 0.026
- This is a significant difference between the 2 groups
1080
- Next we would want to look at the median scores to see which group is scoring highest and lowest after sig Mann Whitney U test What does this output show? - (3) - Independent sample design - One IV, two conditions = existing vs new medication - One DV (symptoms) but this time on ordinal (scale from 1 to 5) and got combination of non-normally distributed data and small sample size (very problematic for t-tests) - Mann Whitney U Test
- For the existing treatment, the median score was 3; for the new treatment, the median score was 4. This suggests the new treatment was more effective in reducing symptoms than the existing treatment.
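As a sketch of what the software computes here: the Mann-Whitney U statistic is based on ranks of the pooled scores. This is a minimal stdlib-only Python version, using made-up symptom ratings rather than the study's actual data:

```python
# Minimal sketch of the Mann-Whitney U computation (hypothetical data).

def ranks(values):
    """Rank all values (1-based), averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def mann_whitney_u(group1, group2):
    combined = list(group1) + list(group2)
    r = ranks(combined)
    n1 = len(group1)
    rank_sum_1 = sum(r[:n1])
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = len(group1) * len(group2) - u1
    return min(u1, u2)                  # report the smaller U

existing = [3, 3, 4, 2, 3, 5, 3, 4]     # ordinal symptom ratings (invented)
new = [4, 5, 4, 4, 5, 3, 5, 4]
print(mann_whitney_u(existing, new))
```

In practice you would use a statistics package routine that also supplies the p-value; the sketch only shows where U comes from.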
1081
Example research scenario of Friedman ANOVA - (5)
- Again we have ordinal data for the DV, so we cannot be sure the distances between levels are the same - Related design - One IV, 3 conditions - One DV (level reached in a video game) - Friedman's ANOVA = more than 2 groups in a related design
1082
What does this Friedman ANOVA output show?
- We have a total sample size of 30, a test statistic of 21.788, df = 2 and a p-value of 0.000, so there is a significant difference between the 3 groups
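A sketch of how the Friedman chi-square statistic is computed: scores are ranked within each participant, then the rank sums per condition are plugged into the standard formula. The data below are invented, not the video-game scores from the card:

```python
# Friedman chi-square statistic, stdlib only (hypothetical data).

def friedman_stat(*conditions):
    """conditions: equal-length score lists, one per condition (related design)."""
    k = len(conditions)
    n = len(conditions[0])
    rank_sums = [0.0] * k
    for subject in range(n):
        row = [cond[subject] for cond in conditions]
        # rank this subject's scores across conditions, averaging ties
        srt = sorted(row)
        row_ranks = [(srt.index(v) + (len(srt) - 1 - srt[::-1].index(v))) / 2 + 1
                     for v in row]
        for j in range(k):
            rank_sums[j] += row_ranks[j]
    # chi-square_F = 12 / (n k (k+1)) * sum(R_j^2) - 3 n (k+1)
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

print(friedman_stat([1, 2, 2], [2, 3, 3], [3, 5, 4]))   # → 6.0 (perfectly ordered)
```

A library routine would compare this statistic against the chi-square distribution with k - 1 degrees of freedom to get the p-value.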
1083
For Friedman's ANOVA we do
post hoc tests for pairwise comparisons to look where the differences are
1084
What do this Friedman ANOVA test post hoc tests show? - (7)
- The first comparison is Joystick vs VyperMax - The second is Joystick vs EvoPro, etc. - Notice it gives two p-values: sig and adjusted sig - Adjusted sig controls for multiple comparisons and makes corrections to the p-value (use this one) - The difference between Joystick and VyperMax was significant at p = 0.005 - The difference between Joystick and EvoPro was significant at p = 0.000 - The difference between VyperMax and EvoPro was non-significant as p = 0.660
1085
- The problem with non-parametric tests is that they have less power
to detect significant effects compared to parametric tests, so there may be an issue of power: the median scores may be higher in one group than another, yet the difference is not significant
1086
Non-parametric tests are used when A. The assumptions of parametric tests have not been met. B. You want to increase the power of your experiment. C. You have more than the maximum number of tied scores in your data set. D. All of these.
A = non-parametric tests have fewer assumptions than parametric tests
1087
With 2  2 contingency tables (i.e., two categorical variables both with two categories) no expected values should be below ____. A. 5 B. 1 C. 0.8 D. 10
A
1088
Which of the following statements about the chi-square test is false? A. The chi-square test can be used on continuous variables. B. The chi-square test can be used to check how well a model fits the data. C. The chi-square test is used to quantify the relationship between two categorical variables. D. The chi-square test is based on the idea of comparing the frequencies you observe in certain categories to the frequencies you might expect to get in those categories by chance.
A = this statement is false, so it is the correct choice: chi-square can be used on categorical variables only
1089
When conducting a loglinear analysis, if our model is a good fit of the data then the goodness-of-fit statistic for the final model should be: (Hint: The goodness-of-fit test tests the hypothesis that the frequencies predicted by the model (the expected frequencies) are significantly different from the actual frequencies in our data (the observed frequencies).) A. Non-significant (p should be bigger than .05) B. Significant (p should be smaller than .05) C. Greater than 5 D. Less than 5 but greater than 1
A = If our model is a good fit of the data then the observed and expected frequencies should be very similar (i.e., not significantly different).
1090
What is the parametric equivalent of the Wilcoxon signed-rank test? A. The paired samples t-test B. The independent t-test C. Independent ANOVA D. Pearson’s r correlation
A
1091
Are directional hypotheses possible with chi-square? A. Yes, but only when you have a 2 × 2 design. B. Yes, but only when there are 12 or more degrees of freedom. C. Directional hypotheses are never possible with the chi-squared test. D. Yes, but only when your sample is greater than 200.
A = directional hypotheses are only possible with chi-square when you have a 2 × 2 design
1092
A psychologist was interested in whether there was a gender difference in the use of email. She hypothesized that because women are generally better communicators than men, they would spend longer using email than their male counterparts. To test this hypothesis, the researcher sat by the computers in her research methods laboratory and when someone started using email, she noted whether they were male or female and then timed how long they spent using email (in minutes). How should she analyse the differences in males and females (use the output below to help you decide)? A. Mann–Whitney test B. Paired t-test C.Wilcoxon signed-rank test D. Independent t-test
1093
What is the Jonckheere–Terpstra test used for? A. To test for an ordered pattern to the medians of the groups you’re comparing. B. To test whether the variances in your data set are approximately equal. C. To test for an ordered pattern to the means of the groups you’re comparing. D. To control for the familywise error rate.
A
1094
If the standard deviation of a distribution is 5, what is its variance?
25 = 5^2
1096
A distribution with positive kurtosis (leptokurtic) indicates that: A Scores are tightly clustered around the centre of the distribution B Scores are spread widely across the distribution C Scores are clustered towards the left side of the distribution D Scores are clustered towards the right side of the distribution
A
1097
If the scores on a test have a mean of 28 and a standard deviation of 3, what is the z-score for a score of 34? A 3 B 2 C -2 D -3.42
B = (34 - 28)/3 = 2
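The z-score arithmetic from this card as a minimal Python sketch:

```python
# z = (score - mean) / standard deviation
def z_score(x, mean, sd):
    return (x - mean) / sd

print(z_score(34, 28, 3))   # → 2.0, i.e. option B
```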
1098
Question 4 Which of the following is an assumption of a one-way repeated measures ANOVA but not a one-way independent ANOVA? A Homogeneity of variance B Homogeneity of regression slopes C Sphericity D Multicollinearity
C
1099
A test statistic with an associated p value of p = .002 tells you that: A The statistical power of your test is large B The probability of getting this result by chance is 0.2%, assuming the null hypothesis is correct C The effect size of this finding is large D All of the above
B
1100
Question 6 Of the following, which is the most appropriate reason to use a non-parametric test? A When the DV is measured on an ordinal scale B When you have unequal sample sizes between conditions of the IV C When the sample size is small D When you have a violation of the assumption of homogeneity of variance
A
1101
Question 7 The following are all commonly stated assumptions/requirements for using ANOVA. Which of the 4 is the only one that the procedure always requires? A Subjects are assigned to treatment conditions / groups using random allocation B Data is from a normally distributed population C DV is continuous (interval or ratio) D Variance in each experimental condition is similar (assumption of homogeneity of variance)
C
1102
Question 8 A researcher runs a single t test and obtains a p value of p = .04. The researcher rejects the null hypothesis and concludes that there is a significant effect of the experimental manipulation in the population. Which of the following are possible? A The researcher may have made a type 1 error B The researcher may have made a type 2 error C The researcher may have made a familywise error D All of the above are possible
A
1103
Question 9 99% of z-scores lie between: A ±1.96 B ±2.58 C ±3.29 D ±1
B
1104
Question 10 If predictor X shows a correlation coefficient of -.45 with outcome Y, we can confidently say that: A X is a significant predictor of Y B Variance in X accounts for 20.25% (that's (-.45)²) of the variance in Y C X has a causal relationship with Y D All of the above
B
1105
Question 11 How much variance has been explained by a correlation of r = .50? A 10% B 25% C 50% D 70%
B = 0.50² = 0.25, i.e. 25%
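A one-line sketch of the variance-explained calculation:

```python
# Variance explained (the coefficient of determination) is r squared.
r = 0.50
variance_explained = r ** 2
print(variance_explained)   # → 0.25, i.e. 25% (option B)
```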
1106
Question 12 The relationship between two variables partialling out the effect that a third variable has on both of those variables can be expressed using a: A Bivariate correlation B Semi-partial correlation C Point-biserial correlation D Partial correlation
D
1107
Question 13 A regression model in which variables are entered into the model on the basis of a mathematical criterion is known as a: A Forced entry regression B Hierarchical regression C Stepwise regression D Logistic regression
C
1108
Question 14 In the regression equation Y = b_0 + b_1X + error, what does the parameter b_0 indicate? A The predicted value of the outcome variable B The regression slope C The intercept D Error variance
C
1109
Question 15 In multiple regression, a high VIF statistic, a low tolerance statistic, and substantial correlations between predictor variables, ALL indicate: A Multicollinearity B Heteroscedasticity C The presence of outliers D Non-normality of the residuals
A
1110
Question 16 In a multiple regression model, the t test statistic can be used to test: A Differences between group means B The significance of the overall model C The significance of the regression coefficients for each predictor D The t test statistic is not used in multiple regression
C
1110
Question 17 A Mixed ANOVA design would be appropriate for which of the following situations? A Different participants are tested in each condition B All participants are tested in all conditions C Participants are tested in all conditions for at least one IV, and different participants are tested in each condition for at least one IV D None of the above
C
1111
Question 18 In a one-way independent ANOVA with 40 participants and 5 conditions of the IV, what are the degrees of freedom for the between-groups Mean Squares (MSbetween)? A 4 B 5 C 35 D 40
A = k (number of groups) - 1 = 5 - 1 = 4
1112
Question 19 In a two-way ANOVA there are: A Two IVs and two DVs B Two IVs and one DV C One IV and two DVs D None of the above
B
1113
Question 20 In a two-way factorial design, the SSR (residual sum of squares) consists of: A Variance due to the independent variables and their interaction B Variance due to the independent variables, dependent variable(s) and error variance C Variance accounted for by the interaction only D Variance which cannot be explained by the independent variables
D
1114
Question 30 Statistics enthusiast and Dub Reggae legend ‘Mad Professor’ conducted a study into the effects of listening to music on a memory task. He ended up with three independent variables and one dependent variable, and he wished to analyse all possible main effects and interaction effects. How many model effects in total will he have? A 1 B 3 C 6 D 7
D = 2^k - 1 = 2^3 - 1 = 7 (3 main effects, 3 two-way interactions, 1 three-way interaction)
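The 2^k - 1 count can be checked by enumerating every non-empty subset of the IVs, since each subset is one model effect. The IV names below are hypothetical, purely for illustration:

```python
# Count all main effects and interactions for k independent variables:
# every non-empty subset of the IVs is one model effect, so there are 2**k - 1.
from itertools import combinations

ivs = ["music_genre", "volume", "task_type"]   # hypothetical names for the 3 IVs
effects = [c for r in range(1, len(ivs) + 1) for c in combinations(ivs, r)]
print(len(effects))   # → 7 (3 main effects + 3 two-way + 1 three-way interaction)
```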
1115
Question 24 A nutritionist was interested in the effectiveness of two of the latest fad diets. The nutritionist took 30 people who wanted to lose weight and allocated them to either the SuperScienceMaxPro weight loss regime, or the SensiNutriPlus diet. He recorded their weight at 4 time points. (The start of the diet, and then every month after that for 3 months). In addition, the nutritionist was interested in whether males and females would differ in recorded weight loss over the 4 time points. What is the design of this study? A Two factorial with one independent factor and one repeated measures factor B Three-factorial with two independent factors and one repeated measures factor C Three-factorial with one independent and two repeated measures factors D Four-factorial with two independent factors and 4 repeated measures factors  
B
1116
Question 26 What is the non-parametric equivalent of a one-way repeated measures ANOVA? A Wilcoxon sign test B Mann-Whitney U test C Kruskal-Wallis test D Friedman test
D
1117
Question 27 What is a limitation of the Chi-square test? A It cannot be used when you have more than 2 categorical variables B Directional hypotheses are not possible when you have more than two conditions of a variable C A small sample size can result in an unreliable test statistic D All of the above
D
1118
1119
Distribution of z