Why do formal research in psychology?
Because it enables us to test and justify our claims and to discard propositions that are false. Research is what distinguishes psychology from other bases for claims about the causes of behaviour and the effectiveness of treatment.
Why is it important that the details of a study's methods, analysis and results are published?
Our conclusions can be scrutinised by others. We can determine the justifiability of others' conclusions. It allows us to judge the strength of effects. It makes us and other psychologists accountable.
Why is it important to learn about research methods even if we don't intend to do research?
It allows us to be discerning consumers of the products of others' research.
The truthfulness of the results and conclusions that derive from a piece of research depends on... ?
The quality of the design. The quality of the measures. The appropriateness of the statistical analyses. The nature (e.g., size) of effects. The match between the results and the conclusions.
What are the three essential conditions for establishing causation?
1. Covariation: two events must occur/change together (including nonoccurrence) 2. Temporal order: the cause must precede the effect 3. Elimination of plausible alternative explanations for covariation
If the only substantial difference between 2 conditions is A (an independent variable), does that allow us to determine that A is the cause of B (the dependent variable)?
Yes.
What is Internal Validity?
The extent to which we can say that (any) effects on the DV were caused by the IV (as opposed to something else)
What is selection threat? How can it be controlled?
Anything that can result in nonequivalent groups in different conditions (it reflects the method of condition assignment) (A threat to internal validity) It can be controlled with Random Assignment.
What is History Threat? How can it be controlled?
Another event could coincide with a study and affect its results. (A threat to internal validity.) It can be controlled with a control group tested over the same period, so that the event applies similarly to all conditions.
What is Maturation? How can it be controlled?
Refers to changes in participants as a result of the passage of time. (A threat to internal validity) It can be controlled with randomisation and a control group.
What is regression towards the mean? How can it be controlled?
Refers to the fact that scores that are very different from the mean on a first measurement occasion tend to be nearer to the mean when the measure is taken again. It can be controlled with a control group (extreme scorers in all conditions should regress similarly).
What is selection threat? How can it be controlled?
Selection is anything that can result in nonequivalent groups in different conditions. Any nonrandom method of allocating participants to conditions leads to selection threat. E.g., testing a YOGA program on workers in a large company whereby volunteers go to the yoga condition and nonvolunteers go to the control group. It can be controlled by random allocation (of sufficient numbers), because this should produce equivalent groups. Note: DON'T CONFUSE this with ways of 'selecting' people from a population for a study.
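As a concrete illustration of random allocation (the control for selection threat above), here is a minimal Python sketch; the function name and group labels are my own, not from the cards:

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to conditions in turn,
    so group sizes are (near) equal and assignment is unrelated to any
    participant characteristic (e.g., volunteering)."""
    rng = random.Random(seed)
    pool = list(participants)
    rng.shuffle(pool)
    groups = {c: [] for c in conditions}
    for i, p in enumerate(pool):
        groups[conditions[i % len(conditions)]].append(p)
    return groups

groups = randomly_assign(range(20), ["yoga", "control"], seed=1)
print([len(g) for g in groups.values()])  # [10, 10]
```

Because the shuffle is unrelated to any participant characteristic, sufficiently large groups should be equivalent on average, which is exactly what controls the threat.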
What is a confound?

Another variable that is RELATED to the IV (e.g., differs between conditions).

Not just any extraneous/uncontrolled variable that could apply randomly to different individuals regardless of condition (e.g., NOT: 1 person breaks a leg, 1 person gets a promotion).
e.g., rats that have their tail removed die and rats that don't have their tail removed survive... do rats need their tail to survive? NO: surgery and human handling are confounds.
What is instrumentation threat?
Changes in the way a measure is applied or interpreted.
e.g., an experimenter gets better at coding a behaviour, or a patient gets less diligent over time recording in a diary.
What are testing effects?
Changes that occur in a measure/score over 2+ testing occasions BECAUSE of the earlier measure.
What are demand effects?
Responding in a way that is consistent with the researcher's expectations as a result of participants knowing/guessing what the researcher expects.
What are nonspecific treatment effects?
Benefits of warmth, attention, support, reassurance and the mechanics of homework exercises (rather than from the specific treatment itself)
What are placebo effects?
Changes owing to the receipt of a treatment (not owing to the actual treatment). People can feel better just because they know they're receiving a treatment. Placebo effects result from participants' own expectations about what will happen.
What is attrition?
(aka 'mortality') drop outs (people who leave the study before all of the intended data are collected from them). Some attrition is common in studies that use repeated measures over time. It's not necessarily a problem: it depends on the pattern and possible causes.
What is experimenter bias?
Researchers might inadvertently administer experimental instructions differently in different conditions, or administer or interpret measures differently in different conditions (the latter applies only when discretion/judgement is used in determining scores).
How do we (in general) control for threats to internal validity?

Ensure that threats apply similarly in all conditions.

The only difference should be on the IV.

A control condition should be used if appropriate; a placebo should be similar to the actual treatment.

Counterbalancing (when experimental conditions are varied within subjects, the order should be varied when feasible).

Blind assessment prevents administration or interpretation bias.

Blind participants

Rigorous training for measures requiring skilled delivery or interpretation.
What are the features of an appropriate placebo?
Is similar to the real treatment in:

Credibility

Expectations aroused

Time commitment/effort

Attention, warmth etc from the researcher

Ideally, method of administration (drug/individual)
Do all treatment studies need a placebo?
No.
Depends on the research question:
- Relative effects of A and B (comparison study)
- Is A better than placebo?
- Correlational study: a control condition is mostly irrelevant
How do we control for researcher bias (where applicable)?

Standardised instructions to participants

Blind assessment/scoring

Very good training in the use of measures requiring skill (to avoid bias and instrumentation effects)
What does it mean to say a threat is controlled?
If the threat applies similarly to different conditions, or more or less randomly across different people in a study, it has been controlled.
What is a quasi-experimental design? What is its major flaw?
Ps are not randomly allocated to levels of the IVs.
- Sometimes because the IV can't be manipulated, e.g., employed vs. unemployed, asthma vs. not
Major flaw: Ps might differ in ways other than the IV.
What is a Factorial Design?
2 or more CATEGORICAL IVs (factors) are manipulated (or measured) AND all combinations of the IVs are tested.
"an experiment whose design consists of two or more factors, each with discrete possible values or "levels", and whose experimental units take on all possible combinations of these levels across all such factors."
How are factorial designs written?
By specifying the no. of levels of each IV
e.g., a 2x2 design has 2 IVs and each has 2 levels
How do you determine the number of cells in a factorial design?
Multiply the number of levels of each IV together
e.g.
- 2x2 has 4 cells
- 2x3 has 6 cells
- 2x2x2 has 8 cells, i.e., (2x2)x2
- 3x3 has 9 cells
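The cell counts above are just the product of the numbers of levels, which can be checked with a few lines of Python (a sketch; the helper name is my own):

```python
from functools import reduce

def n_cells(levels):
    """Cells in a factorial design = product of the number of levels of each IV."""
    return reduce(lambda total, k: total * k, levels, 1)

print(n_cells([2, 2]))     # 4 cells in a 2x2 design
print(n_cells([2, 3]))     # 6 cells in a 2x3 design
print(n_cells([2, 2, 2]))  # 8 cells in a 2x2x2 design
print(n_cells([3, 3]))     # 9 cells in a 3x3 design
```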
What is the difference between 'between-subjects', 'within-subjects' and 'mixed' manipulation of an IV?
Between-subjects = each participant is in only one level of the IV
Within-subjects = each participant participates in each level of the IV
Mixed = at least one IV is varied between subjects and at least one is varied within subjects (e.g., pre-post time)
What are blind designs?
When participants/experimenters/assessors are not aware of condition.
Single-blind = either participants or the assessor/experimenter does not know (not both)
Double-blind = both participant and assessor/experimenter do not know
What is a cover story?
A plausible account of the study's purpose given to participants so that they don't know the real hypothesis/aims of the study.
What is counterbalancing?
Equal representation of the possible orders of presentation of conditions in a within-Ps design.
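For a small number of conditions, full counterbalancing can be sketched in Python with itertools (an illustration; the variable names are my own):

```python
from itertools import permutations

# All possible orders of three within-subjects conditions: 3! = 6 orders.
conditions = ["A", "B", "C"]
orders = list(permutations(conditions))
print(len(orders))  # 6

# Assign participants to orders in rotation so every order is used equally often.
participants = range(12)
assignment = {p: orders[p % len(orders)] for p in participants}
```

With 12 participants each of the 6 orders is used exactly twice. Full counterbalancing quickly becomes impractical as the number of conditions grows (k! orders), which is why partial schemes such as Latin squares are sometimes used instead.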
What are manipulation checks?

If trying to manipulate mood, then measure mood.

If using a placebo condition to control for expectations, measure expectations (e.g., expectations, optimism, credibility).
What are matched groups (i.e., matched pairs)?

Participants in different conditions are matched on potentially relevant extraneous variables (e.g., education, age, drug use)

Note: this is rarely needed and can be difficult to interpret
What is construct Validity?

Extent to which measures and manipulations actually reflect the theoretical constructs of interest (both IVs and DVs)... the accuracy/truthfulness of the measure.
In writing a design statement, what do you need to indicate?

The IVs: give each a name

The number of levels of the IV(s), and their identity

Whether each IV was varied between or within subjects

The DV(s)

Experiment or not (more so what is randomised, what is not, unless obvious)
What are the different types of construct validity?

Face validity

Content validity

Criterion validity (concurrent and predictive)

Convergent validity

Discriminant (or divergent validity)
What is the "operationalisation" of 1. measurement and 2. experimental manipulation?

The precise activities or operations used to measure a construct (e.g., exactly how 'anxiety' or 'happiness' or 'antisocial behaviour' or 'task effort' was defined and measured)

The precise activities or operations used to manipulate an independent variable in an experiment (e.g., exactly how 'positive' versus 'negative' feedback to participants was manipulated)
What is face validity?
Whether the measure appears to be related to the construct of interest.
What is Content validity?
The extent to which a measure assesses the range of characteristics thought to represent the construct of interest.
e.g., If we attempted to measure 'honesty' simply by asking the question "have you ever told a lie"? our measure would not have content validity, because there are many forms of dishonesty other than telling a lie.
(in addition, the measure wouldn't have face validity: all of us have told lies at some time, even if it was just to preserve someone's feelings or solely as a child)
What is criterion validity? describe both types?
The relationship between the measure and an external, separate criterion that presumably reflects or depends on the construct.
Concurrent: the correspondence between the measure and a currently available criterion. For example, a measure of honesty, if valid, should differ between convicted embezzlers and a group of people that is presumed to be honest (e.g., nuns).
Predictive: the correspondence between the measure and a future criterion, e.g., its ability to predict behaviours in advance (this is possibly the strongest evidence of validity).
What is convergent validity?
The extent to which the measure correlates with other variables (usually simply other measures) that are predicted to be related to the construct. It's the degree of overlap with other measures of the same construct.
What is discriminant (or divergent) validity?
The extent to which the measure discriminates between the target construct and other, different constructs. It shouldn't correlate highly with measures of different constructs. For example, a measure of selfesteem should not correlate (or only correlate weakly) with anxiety, social class, IQ, or the age at which someone learned to ride a bicycle.
What is external validity?
The extent to which our results can be generalised to the relevant population of interest or to other relevant settings or times. This is affected (among other things) by:
- Representativeness of the sample
- Whether influences operate in the testing or measurement situation that do not apply in other settings (or vice versa)
What is ecological validity?
It is related to external validity, although the two are not identical. It is the extent to which the results can be generalised to real-life settings.
What factors is reliability affected by?

Whether questions designed to measure the same construct really do

Random measurement error (e.g., mood, misunderstanding a question, distraction)

The number of items (larger numbers tend to increase reliability estimates)

Clarity of instructions and questions, and standardisation of instructions.
Does reliability guarantee validity?
NO! A measure can just be consistently (reliably) wrong.
What is the problem with this questionnaire question?:
"How close is your relationship with your mother?"
(very close, quite close, fairly close, not very close, very distant)
Unwarranted assumption: some people may not have a mother.
What do descriptive statistics do? and what DON'T they do?
Descriptive statistics allow us to describe and summarise scores on a variable, or the relationship between variables. They are used because they make a set of data easier to understand.
e.g., in the survey, 38% of respondents said that they approved of the job being done by the Prime Minister, 42% said that they disapproved, and 20% were undecided.
They DON'T tell us whether a particular difference or linear relationship could have arisen by chance (Rather than reflecting a 'real' effect)
What do Inferential statistics do? and what DON'T they do?
Inferential statistics simply tell us whether an effect (or relationship) reflects a 'real' effect, or could simply have arisen by chance owing to normal variation in a sample or population. Related to this, they indicate whether we can infer that an effect observed in a sample is also likely to occur in the population of interest.
(there will be some difference between the groups on almost any measure, simply owing to chance)
They DON'T indicate the size, direction or importance of an effect: t and F values and statistical significance levels (p) do not tell us these things... that is what descriptive statistics and our judgement do!
In statistics, what is significance?
The significance level, or p, is the probability that an observed effect arose by chance if there really is no effect.
For example, the p from a t-test is the likelihood that an observed difference between means arose by chance if there really is no difference. If there is little likelihood (less than or equal to 5%: p equal to or less than .05), we consider it to be a 'real' (significant) effect. Incidentally, the criterion level of p, p equal to or less than .05, is called the alpha level.
What are the three (main) types of descriptive statistics?

Mean (M), median & mode: typical score (on a continuous measure).

Percentages or proportions: of cases that fall into particular categories (or frequencies).

Correlation coefficients (r) and regression coefficients (B and beta): these describe the nature of a linear relationship between continuous variables. Correlations indicate the direction and strength of the relationship (e.g., .2 is small, .35 is medium, .5+ is large). Regression coefficients also indicate direction. Bs tell us about the effect in the original scales of measurement. Betas can indicate the relative strength of different predictors. More about these later.
What are indices of variability?
Values that indicate the degree of variation in scores. The standard deviation (SD) is the most interpretable of the measures of variability. When reporting means, give their SDs too.
What is 'variance'? (as an index of variability)
Variance is roughly the 'average' of all squared deviations from the mean. It's calculated by squaring the difference between each score and the mean, adding the squared differences (squared deviations) together, and dividing by n-1. Statistical procedures use the variance, and the SD is calculated from it. However, it's less interpretable or meaningful than the SD.
What is percentage (or proportion) of variance explained?
The percentage of variance in the DV that can be explained by an IV.
How do you find the percentage or proportion of variance explained for CORRELATIONS?
For correlations (which handle only 2 variables at a time), the proportion of variance explained is simply r² (i.e., the square of the correlation coefficient).
For example, if the correlation between body weight and dietary intake in calories per day is r = .66, then calorie intake explains (.66 x .66 =) .4356 (as a proportion) or 43.56% of the variance in body weight. In other words, calorie intake explains a substantial proportion of the variance, BUT 56.44% of the variance in weight is left unexplained, and so weight must also be affected by some other variable or variables (such as genetic predisposition to store fat and to be a particular weight, and exercise).
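The arithmetic in the example above can be verified in a couple of lines of Python:

```python
# Worked example from the card: r = .66 between body weight and calorie intake.
r = 0.66
variance_explained = r ** 2            # proportion of variance explained (r squared)
variance_unexplained = 1 - variance_explained

print(round(variance_explained, 4))    # 0.4356, i.e., 43.56%
print(round(variance_unexplained, 4))  # 0.5644, i.e., 56.44%
```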
How do you find the percentage or proportion of variance explained for REGRESSION?
The proportion of variance explained is given by the statistics R² and/or R² change, which statistical packages such as SPSS report for us.
How do you find the percentage or proportion of variance explained for ANOVA?
There are a number of indices of variance explained. One is eta² (η²), the proportion of total variance in the DV that's explained by an IV.
For factorial ANOVAs, SPSS reports partial η², which is the proportion of variance in the DV that is explained by a particular IV while ignoring the effects of any other IVs in the analysis.
What two assumptions do parametric tests make? What happens if these assumptions are not met?

Scores on continuous variables (that are used in the analysis) are normally distributed

The variability (or variance) in scores is similar for different conditions (Levene's test)
If these assumptions are violated (not true), then the outcome of a parametric test can be misleading (wrong). If the data do deviate markedly from these assumptions, we could use a nonparametric test (there are other potential solutions too).
What are the two broad classes of parametric tests?

Tests that assess the significance of the difference between means (t-tests and ANOVAs)

Tests that assess the significance of the linear relationship between continuous variables (correlations and regressions)
What is the procedure for calculating the SD?

Subtract the mean from each person's score = difference score

Square the difference scores for each person (gets rid of negatives)

Add them

Divide by N-1 to get the rough average of squared deviations = variance

Take the square root of the variance to get the SD
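The steps above translate directly into Python (a sketch with made-up scores):

```python
import math

scores = [4.0, 7.0, 6.0, 3.0, 5.0]  # hypothetical data

mean = sum(scores) / len(scores)             # the mean
diffs = [x - mean for x in scores]           # 1. difference scores
squared = [d ** 2 for d in diffs]            # 2. squared (gets rid of negatives)
variance = sum(squared) / (len(scores) - 1)  # 3-4. add them, divide by N-1
sd = math.sqrt(variance)                     # 5. square ROOT of the variance

print(variance)  # 2.5
print(sd)
```

This matches `statistics.stdev` from the standard library, which uses the same N-1 (sample) formula.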
When you have a categorical DV and a categorical IV, what is the appropriate statistical technique and statistical aim?
Chi-square
Compare proportions between categories.
If you have a continuous DV and a single categorical IV with only 2 levels, what is the appropriate statistical test? What is the statistical aim?
t-test (or one-way ANOVA)
Test the difference between means for the 2 levels
If you have a continuous DV and a single categorical IV with 3 or more levels, what statistical technique would you use? What is the statistical aim?
One-way ANOVA
Test whether there are differences somewhere among the level means.
If you have a continuous DV and two or more categorical IVs, what statistical technique do you use? and what is the statistical aim?
Factorial ANOVA
Test main effects & interaction
If you have continuous DV/s and continuous IV/s what statistical technique should you use? what is the statistical aim?
Regression or correlation
Assess linear relationship
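The decision rules in the five cards above can be collected into one small function (a sketch; the function name and string labels are my own):

```python
def choose_test(dv, iv_types, iv_levels=None):
    """Pick the technique from the cards: dv is 'categorical' or 'continuous';
    iv_types is a list with one entry per IV; iv_levels is the number of
    levels when there is a single categorical IV."""
    if dv == "categorical" and all(t == "categorical" for t in iv_types):
        return "chi-square"
    if dv == "continuous" and all(t == "categorical" for t in iv_types):
        if len(iv_types) >= 2:
            return "factorial ANOVA"
        return "t-test (or one-way ANOVA)" if iv_levels == 2 else "one-way ANOVA"
    if dv == "continuous" and all(t == "continuous" for t in iv_types):
        return "regression or correlation"
    return "not covered by these cards"

print(choose_test("continuous", ["categorical"], iv_levels=3))  # one-way ANOVA
```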
How can you test the assumption of normality (i.e., a bellcurve/normal distribution)?

Plot the scores in a histogram (using SPSS) and use visual inspection.

Also, there are statistics that can be requested that quantify the deviation from normality... but we won't cover these.
What is the difference between positive and negative skewness (of a graph)? what does it look like and what does it mean?

Negative skew: the left tail is longer; the mass of the distribution is concentrated on the right of the figure. It has relatively few low values. The distribution is said to be left-skewed, left-tailed, or skewed to the left.

Positive skew: the right tail is longer; the mass of the distribution is concentrated on the left of the figure. It has relatively few high values. The distribution is said to be right-skewed, right-tailed, or skewed to the right.
To meet the assumption of 'homogeneity of variance' should the Levene's test (and statistical tests of departure) be significant or nonsignificant?
NONSIGNIFICANT
Under what circumstances do heterogeneity of variance (i.e., differences between conditions in the variance in scores on the DV) and nonnormal distributions for DVs pose a greater threat to the 'validity' of the statistical outcome?

When the number per condition is small

when the numbers per condition are markedly different

The direction of skewness varies between conditions

Heterogeneity of variance and nonnormality coexist
An ANOVA/ttest is used to assess the difference between ____
means
What does 'ANOVA' stand for?
Analysis of Variance. It compares the means (on continuous DVs) of conditions (categorical IVs) and considers why they differ.
What parametric test would you use to assess the difference between means where the IV has only 2 levels and is varied between Ps
AND
the statistical aim is compare means between levels?
Independent groups t-test (or one-way ANOVA)
What parametric test would you use to assess the difference between means where the IV has 3 + levels and is varied between Ps
AND
the statistical aim is compare means between levels?
One-way ANOVA
What parametric test would you use to assess the difference between means where the IV has only 2 levels and is varied within Ps
AND
the statistical aim is compare means between levels?
Paired samples t-test
What parametric test would you use to assess the difference between means where the IV has 3+ levels and is varied within Ps
AND
the statistical aim is compare means between levels?
One-way repeated measures ANOVA
What parametric test would you use to assess the difference between means where there is more than one IV, each IV has 2+ levels and is varied between Ps
AND
the statistical aim is to test main effects and interactions?
Factorial ANOVA (e.g., 2-way, 3-way)
What parametric test would you use to assess the difference between means where there is more than one IV, each IV has 2+ levels and is varied within Ps
AND
the statistical aim is to test main effects and interactions?
Repeated measures ANOVA
What parametric test would you use to assess the difference between means where there is more than one IV, each IV has 2+ levels and is varied both within and between Ps
AND
the statistical aim is to test main effects and interactions?
Mixed ANOVA (split plot ANOVA)
What is the purpose of error bars?
Provide information about variability
Represent error in measurement (in estimation of the population mean)
Indicate whether means are likely to be significantly different.
What do 95% confidence intervals show/mean/represent?
They define the range of values within which the true mean is likely to lie on 95% of occasions.
95% CIs indicate:
- error of measurement (in estimation of the population mean)
- whether means are likely to differ significantly
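A rough 95% CI for a mean can be computed with the normal approximation (1.96 standard errors either side of the mean); this is a sketch with made-up scores, and for small samples a t critical value should replace 1.96:

```python
import math
import statistics

scores = [5.1, 4.8, 6.0, 5.5, 4.9, 5.3, 5.8, 5.0]  # hypothetical data

m = statistics.mean(scores)
se = statistics.stdev(scores) / math.sqrt(len(scores))  # standard error of the mean
ci_low, ci_high = m - 1.96 * se, m + 1.96 * se

print(f"M = {m:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```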
If the confidence interval bars for two means overlap by more than 1/4, is the difference between means likely to be significant or nonsignificant?
Nonsignificant
What do descriptive statistics do? What DON'T they do?
Allow us to describe and summarise scores on a variable, or the relationship between variables. They are used because they make a set of data easier to understand.
They DON'T tell us whether a particular difference or linear relationship could have arisen by chance (rather than reflecting a 'real' effect)
What do inferential statistics do, what don't they do?
They do tell us whether an effect or relationship reflects a 'real' effect, or could simply have arisen by chance owing to normal variation in a sample or population.
They don't give any descriptive stats.
What is significance?
The significance level or 'p' is the probability that an observed effect arose by chance if there really is no effect.
What are the three types of descriptive statistics and when are they used?

Mean (M), median & mode: tell us the typical score on a continuous measure; used for comparing means between conditions.

Percentages or proportions of cases that fall into particular categories (or frequencies): used to compare between categories with different no. of cases.

Correlation coefficients (r) and regression coefficients (B and beta): describe the nature of a linear relationship between continuous variables (direction and strength). Bs tell us about the effect in the original scales of measurement. Betas can indicate the relative strength of different predictors.
parametric tests assume that.......?

scores on continuous variables (that are used in the analysis) are normally distributed

the variability (or variance) in scores is similar for different conditions (there are tests available to assess this, including in SPSS)
If these assumptions are violated, then the outcome of a parametric test can be misleading. If the data do deviate markedly from these assumptions, we could use a nonparametric test.
What are the broad classes/types of parametric tests?

Tests that assess the significance of the difference between means (t-tests and ANOVAs)

Tests that assess the significance of the linear relationship between continuous variables (correlations and regressions)
What are the assumptions of an ANOVA/ttest?

Scores on the DV are (roughly) normally distributed (i.e., bell curve)

Variance or variability is (roughly) similar for the different conditions (the assumption of 'homogeneity of variance')
In order to meet the homogeneity of variance assumption (i.e., in ANOVA), the Levene's test should be _____________?
Nonsignificant
In what situations does heterogeneity of variance (different levels of variance between conditions) pose a greater threat to the validity of the statistical outcome?

the n per condition is small

the ns per condition are markedly different

the direction of skewness varies between conditions

Heterogeneity of variance and nonnormality coexist.
What is the appropriate parametric test to assess the difference between means where the IV is varied between subjects and there is one IV with 2 levels?
Independent groups t-test (or one-way ANOVA)
What is the appropriate parametric test to assess the difference between means where the IV is varied between subjects and there is one IV with 3+ levels?
One-way ANOVA
What is the appropriate parametric test to assess the difference between means where the IV is varied within subjects and there is one IV with 2 levels?
Paired samples t-test
What is the appropriate parametric test to assess the difference between means where the IV is varied within subjects and there is one IV with 3+ levels?
One-way repeated measures ANOVA
What is the appropriate parametric test to assess the difference between means where the IV is varied between subjects and there are 2 IVs with 2+ levels on each?
Factorial ANOVA (2-way, 3-way, etc.)
What is the appropriate parametric test to assess the difference between means where the IV is varied within subjects and there are 2 IVs with 2+ levels on each?
Repeated measures ANOVA
What is the appropriate parametric test to assess the difference between means where the IV is varied both between and within subjects (i.e., mixed design) and there are 2 IVs with 2+ levels on each?
mixed ANOVA (split plot ANOVA)
On the ANOVA results table, the degrees of freedom (df) on the same line as the relevant F is the ______-groups df, whilst the df given on the line labelled 'Error' is the ______-groups df.
Between-groups; within-groups
What is a moderator?
The moderator is a variable (another IV) that affects the primary IV-DV relationship in which we're interested. In other words, the moderator affects the relationship between the key IV and the DV. We say that two IVs interact. We could just as correctly say that the IV and the moderator interact.
(Moderation is closely related to interaction; however, it's a more theoretical idea in that it reflects our focus in posing and answering a research question)