Section B: Scientific Processes - A+I Reliability & Validity Flashcards

1
Q

What is meant by the term ‘reliability’?

A

Reliability refers to the CONSISTENCY of findings or measurements.

2
Q

What are the THREE ways of assessing reliability?

A

-Inter-observer reliability (external: whether different observers' data are consistent) - DATA
-Split-half method (internal consistency)
-Test-retest (external: whether results are consistent over time) - DATA

3
Q

What does the term ‘internal reliability’ refer to?

A

Internal reliability assesses the consistency of results across items within a test.

4
Q

What does the term ‘external reliability’ refer to?

A

External reliability refers to the consistency of the results and the extent to which a measure varies from one use to another – are the results consistent over time?

5
Q

Which method ASSESSES INTERNAL RELIABILITY?

A

The Split-Half Method.

6
Q

What does the split-half method refer to?

-What does it refer to?
-What does it measure?

A

The split-half method assesses the INTERNAL CONSISTENCY OF QUESTIONNAIRES AND TESTS SUCH AS PSYCHOMETRIC TESTS.

It measures THE EXTENT to which all parts of the test CONTRIBUTE EQUALLY TO WHAT IS BEING MEASURED.

7
Q

What are the FOUR steps behind the split-half method?

A
  1. SPLIT A TEST INTO TWO HALVES (e.g. a 20-question test into two halves of 10). For example, one half may be composed of the EVEN-NUMBERED questions while the other half is composed of the ODD-NUMBERED questions.
  2. Administer each half TO THE SAME INDIVIDUAL (e.g. one half of 10 questions, then the other half of 10).
  3. REPEAT FOR A LARGE GROUP OF INDIVIDUALS (each participant does the first half, followed by the second half - REPEAT WITH A LARGE SAMPLE).
  4. Look for a positive correlation between the scores for the two halves. A CORRELATION OF ROUGHLY +0.8 or above would indicate HIGH internal reliability.
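A minimal sketch of step 4 (illustrative only, not part of the flashcard content), assuming NumPy is available and using simulated item scores for a hypothetical 20-question test:

```python
# Split-half reliability sketch: simulated scores for 30 participants on a
# hypothetical 20-item test, where each item partly reflects one underlying trait.
import numpy as np

rng = np.random.default_rng(1)
trait = rng.normal(10, 2, size=(30, 1))                # each participant's underlying level
scores = trait + rng.normal(0, 1.5, size=(30, 20))     # 20 items = trait + random noise

odd_half = scores[:, 0::2].sum(axis=1)    # total for odd-numbered items (1, 3, 5, ...)
even_half = scores[:, 1::2].sum(axis=1)   # total for even-numbered items (2, 4, 6, ...)

r = np.corrcoef(odd_half, even_half)[0, 1]             # Pearson's r between the two halves
print(f"Split-half correlation: r = {r:.2f}")
print("High internal reliability" if r >= 0.8 else "Internal reliability questionable")
```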

8
Q

When does the ‘split-half’ technique work best?

A

This technique works best when there is an EVEN number of questions within a ‘test’, and when the questions on the test measure the SAME construct (e.g. the Authoritarian Personality) or knowledge area.

9
Q

Which method assesses EXTERNAL RELIABILITY? What does it specifically assess?

A

The TEST-RETEST method –> assesses the external consistency of a test; it measures the STABILITY of a test over time.

10
Q

Why might test-retest be particularly useful for clinical psychologists?

A

This method is especially useful for tests that measure stable traits or characteristics that aren’t expected to change over short periods.

Without reliable tests, some individuals might not be accurately diagnosed with disorders such as depression and consequently would not be given appropriate therapy.

11
Q

Why might the timing of a retest be important? What if too little time has lapsed or too much time has lapsed?

A

The disadvantage of the test-retest method is that it takes a long time for results to be obtained. The reliability can be influenced by the time interval between tests and any events that might affect participants’ responses during this interval.

The timing of the re-test is important; if the duration is too brief, then participants may recall information from the first test, which could bias the results.

Alternatively, if the duration is too long, it is feasible that the participants could have changed in some important way which could also bias the results.

12
Q

What would a typical test-retest assessment involve?

A

A typical assessment would involve giving participants the same test on two separate occasions.

Results from both occasions would then be compared and correlated; if the same or similar results are obtained, external reliability is established.

13
Q

What sort of correlation would indicate consistency between the two sets of results?

A

STATISTICAL TESTING CAN BE USED TO HELP DETERMINE WHETHER THE TEST HAS INTERNAL AND EXTERNAL RELIABILITY - A STRONG POSITIVE CORRELATION OF ROUGHLY +0.8 OR ABOVE WOULD INDICATE CONSISTENCY BETWEEN THE TWO SETS OF RESULTS.
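As an illustration only (not part of the flashcard content), a test-retest correlation could be computed like this, assuming SciPy is installed and using made-up questionnaire totals:

```python
# Test-retest sketch: the same (hypothetical) questionnaire given to 10
# participants on two occasions; correlate the two sets of totals.
from scipy.stats import pearsonr

time_1 = [32, 45, 28, 51, 39, 47, 30, 42, 36, 49]
time_2 = [34, 44, 30, 50, 37, 48, 29, 45, 35, 47]

r, p = pearsonr(time_1, time_2)
print(f"Test-retest correlation: r = {r:.2f} (p = {p:.3f})")
# A strong positive correlation (roughly +0.8 or above) suggests scores are
# stable over time, i.e. high external reliability.
```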

14
Q

i) What is meant by ‘inter-rater/ observer reliability’?

ii) What is the process behind checking for inter-rater reliability? If data is similar, what can be concluded (what type of reliability is there?)

A

Inter-rater reliability refers to the DEGREE TO WHICH DIFFERENT RATERS GIVE CONSISTENT ESTIMATES OF THE SAME BEHAVIOUR.

It is when a single event is measured simultaneously and independently by two or more trained observers. If the data are SIMILAR, then the measure has external (inter-rater) reliability.

15
Q

Why is it important to have OPERATIONALISED categories for inter-observer reliability checks?

A

-If two researchers are observing ‘aggressive behaviour’ in children at a nursery, they would each have their own subjective opinion of what aggression comprises. In this scenario, it is unlikely they would record aggressive behaviour in the same way, and the data would be unreliable.

However, if they were to observe clearly OPERATIONALISED behavioural categories of aggression, this would be more objective and make it easier to identify when a specific behaviour occurs.

For example, while “aggressive behaviour” is subjective and not operationalised, “pushing” is objective and operationalised. Researchers could simply count how many times children push each other over a set period of time, and the results could then be compared.
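A small illustrative sketch of this check, using made-up tallies of the operationalised category “pushing” from two observers over ten intervals (NumPy assumed):

```python
# Inter-observer reliability sketch: correlate two observers' counts of
# "pushing" recorded independently over the same ten observation intervals.
import numpy as np

observer_a = [3, 1, 4, 0, 2, 5, 1, 3, 2, 4]   # hypothetical tallies per interval
observer_b = [3, 2, 4, 0, 2, 4, 1, 3, 2, 5]

r = np.corrcoef(observer_a, observer_b)[0, 1]
print(f"Inter-observer correlation: r = {r:.2f}")
# Roughly +0.8 or above suggests the category is being applied consistently;
# a lower value suggests the categories or observer training need tightening.
```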

16
Q

Identify two other methods where inter-rater reliability testing would be important. Explain why?

A

CONTENT ANALYSIS —> multiple coders may be involved in analysing materials (articles etc). Inter-rater reliability ensures that different coders are consistently applying the coding scheme and interpreting the content in a reliable and therefore consistent manner.

IN QUESTIONNAIRES USING THE SPLIT-HALF METHOD –> Inter-rater reliability is used to determine the correlation coefficient between the two sets of scores, to establish the degree of agreement or consistency between them.

17
Q

What sort of correlation would indicate consistency between two researchers?

A

Statistical testing can be conducted on these correlations to determine the STRENGTH of the agreement and if the test has internal and external reliability.

-A CORRELATION OF ROUGHLY 0.8 IS SAID TO BE SIGNIFICANT.

18
Q

If agreement is not found in terms of internal and external reliability, researchers will seek to improve the reliability of their test.

How can researchers IMPROVE THE RELIABILITY OF A PROCEDURE? (think scripts).

A

1 - Think SCRIPTS –> Procedures have higher reliability when they are STANDARDISED and WELL-DOCUMENTED with instructions on how to REPLICATE the procedure.
E.g. Pre-recorded instructions (standardised).
-This enhances reliability as it allows other researchers to replicate the same study in different environments to check for the consistency of findings.

19
Q

How can researchers IMPROVE THE RELIABILITY OF A PROCEDURE? (think environment and control).

A

2 - Think ENVIRONMENT AND CONTROL –> Lab experiments tend to have higher reliability than other experimental methods due to the well-controlled environment and subsequent STRICT CONTROL and LIMITING OF EXTRANEOUS VARIABLES.
-Therefore, researchers can be confident that it is the IV (that is being manipulated) which is having an effect on the DV, as all other extraneous variables (which could otherwise become confounding) are controlled.
Control could also be improved through the use of a control group.

20
Q

How can researchers IMPROVE THE RELIABILITY OF A PROCEDURE?(think ETHICS)

A

3 - Think ETHICS –> By obtaining informed consent from participants, maintaining their confidentiality, and minimising the risk of psychological harm, researchers take into account ethical considerations.
Open and transparent reporting of the study’s aims, methodology and results in a debrief, or when gaining informed consent contributes to the reliability and integrity of psychological research.

21
Q

How can psychologists improve the reliability of OBSERVATIONAL RESEARCH?

A

Think in terms of BEHAVIOURAL CATEGORIES –> Clearly OPERATIONALISE THE BEHAVIOURAL CATEGORIES (how you will measure this behaviour).
For example, verbal aggression could be measured by counting the number of times a child swears at or insults somebody else during the observation.

Once this is completed, what could researchers do before the study?

-Behavioural categories could be pre-defined (specific and operationalised) using a TOP-DOWN approach in which the categories for data are imposed before the research begins.
E.g. verbal aggression could be split into categories PRIOR to the observation taking place such as ‘SWEARING’ and ‘HARSH INSULTS’ rather than using a bottom-up approach which allows categories to emerge from the content.

22
Q

How can psychologists improve the reliability of OBSERVATIONAL RESEARCH? (Think in terms of equipment!)

A

Think in terms of equipment —>Randomisation of materials should be done - randomisation is the process of making groups of items random (in no predictable order), like shuffling cards in a card game.

It might also refer to the presentation of trials in an experiment, to avoid any systematic errors that might occur as a result of the order in which the trials take place. This reduces bias as the researcher has no control over the order of items/trials.

Equipment should be kept the same across observations (aided by the use of a top-down approach) to ensure consistency in how behaviours are identified and recorded.

23
Q

How does having multiple researchers perform the same study with the same behavioural categories allow for inter-rater reliability to be increased?

A

It allows inter-rater reliability to be increased as there is LESS ROOM FOR SUBJECTIVITY when CATEGORIES ARE CLEARLY DEFINED AND OPERATIONALISED, which allows testing for the consistency of results through replication.

-Using standardised observation protocols and conducting multiple observations (e.g. to test for INTRA-RATER/INTER-RATER reliability) can help establish the consistency of findings through replication - a valid method of assessing the reliability of a study.

24
Q

*How can psychologists improve the RELIABILITY of questionnaires?

A

1 - Think in terms of the WEIGHTING OF QUESTIONS:

-Weighting of questions refers to ASSIGNING DIFFERENT LEVELS OF IMPORTANCE/VALUE TO INDIVIDUAL QUESTIONS BASED ON THEIR RELEVANCE to the research topic or construct being measured.

-This means that certain questions may carry more weight or significance in determining the overall score/outcome of the questionnaire.
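As a rough illustration only, weighted scoring might be computed like this; the item weights and responses below are entirely hypothetical:

```python
# Weighted questionnaire scoring sketch: items judged more central to the
# construct carry larger weights, so they contribute more to the total score.
item_scores = [4, 2, 5, 3, 1]              # hypothetical responses on a 1-5 scale
item_weights = [2.0, 1.0, 1.5, 1.0, 0.5]   # hypothetical importance weights

weighted_total = sum(score * weight for score, weight in zip(item_scores, item_weights))
max_possible = sum(5 * weight for weight in item_weights)
print(f"Weighted score: {weighted_total} out of {max_possible}")
```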

25
Q

How can psychologists improve the RELIABILITY of interviews? - What type of interviews are most reliable?

A

Structured interviews would be the most reliable because they follow a PRE-DETERMINED set of questions with fixed response options. This consistency allows for standardised questioning and reduces bias in responses.

Conversely, unstructured interviews offer researchers and participants more flexibility and more chance to gather qualitative data through follow-up questions. However, they are also LESS RELIABLE due to the lack of STANDARDISED questioning.

26
Q

How can psychologists improve the RELIABILITY of interviews? (THINK equipment).

A

-Randomisation of equipment in psychology in interviews improves reliability by minimising potential bias with the materials used or confounding factors related to specific equipment.
-When different pieces of equipment are randomly assigned to participants, any systematic effects associated with a particular piece of equipment are distributed evenly across the sample to ensure that the equipment itself does not influence the response or outcomes of the interviews.

FOR EXAMPLE…
-If researchers were investigating the effects of different interview settings on participant responses, they could randomly assign ppts to either a traditional in-person interview OR a remote interview using a headset.
-By randomising the equipment, researchers can ensure that any observed differences in the responses between the two groups are not solely due to the specific equipment used.

27
Q

How can psychologists improve the RELIABILITY of interviews? (THINK IN TERMS OF WHO IS CONDUCTING).

A

Training —> Providing comprehensive training to interviewers is crucial: teaching them about interview techniques, ethical considerations and how to establish rapport with participants.

-Standardisation also ensures that interviews follow a standardised procedure (all elements of the procedure are kept identical for every participant), which maintains consistency across interviews –> This includes using the SAME set of questions, prompts, and response options for ALL participants.

28
Q

What is meant by ‘validity’?

A

Validity refers to ACCURACY: the extent to which a test or study measures what it claims to measure.

29
Q

What is meant by ‘external validity’?

A

External validity refers to the EXTENT TO WHICH YOU CAN GENERALISE THE FINDINGS OF A STUDY TO OTHER SITUATIONS, PEOPLE, SETTINGS AND MEASURES.
(i.e. can you apply the findings of your study to a broader context?).

30
Q

What are the THREE main types of EXTERNAL validity?

A

-Ecological Validity
-Temporal Validity
-Population Validity

31
Q

How could researchers assess the ECOLOGICAL validity of a study?

A

-Researchers could carry out research in a NATURAL ENVIRONMENT that uses a task that accurately reflects REAL LIFE behaviour.

(For example, instead of testing participants’ memory using artificial word lists in a laboratory environment, researchers could ask participants to recall specific events from their daily lives and assess their memory accuracy).

32
Q

How could researchers assess the POPULATION VALIDITY of a study?

A

Researchers should carry out research using a SAMPLE that is REPRESENTATIVE of the TARGET POPULATION. If the SAMPLE does not accurately reflect the TARGET POPULATION, then this type of validity will be limited.

33
Q

How could researchers assess the TEMPORAL validity of a study (e.g. Asch’s conformity study)?

A

-Researchers could REPLICATE the study with a new sample of participants in the present day and compare findings of the original study with findings from the replication.
-If there is high consistency of results, then it can be concluded that Asch’s study has high temporal validity - results are still applicable to modern society.

34
Q

What is meant by ‘internal validity’?

A

Internal validity is a measure of WHETHER RESULTS OBTAINED ARE SOLELY AFFECTED BY CHANGES IN THE VARIABLE BEING MANIPULATED (the independent variable) in a cause and effect relationship. (i.e. the researcher can be confident that the manipulation of the independent variable is affecting the dependent variable).

Essentially, internal validity is focused on whether the study is measuring what it CLAIMS to be measuring.

35
Q

Give some examples of factors that might impact the internal validity of a research study…

A

-Individual differences can exist between participants, especially in an independent groups design.

-Investigator effects –> cues unintentionally given off by researchers might influence participants’ behaviour and so impact the internal validity of findings.

-Demand characteristics –> Participants may guess the aims of the study and subsequently change their behaviour according to their interpretation of this aim.

-Extraneous variables –> variables other than the IV (e.g. in the environment) that might have an impact on the DV.

Each of these factors may impact the internal validity of a research study - if extraneous variables are left uncontrolled, cause-and-effect relationships are less likely to be established.

36
Q

What are the TWO ways of assessing validity?

A

-FACE VALIDITY
-CONCURRENT VALIDITY

37
Q

(A01) What is ‘face validity’ as a way of assessing validity?

A

Face validity is about whether a TEST APPEARS TO MEASURE WHAT IT IS SUPPOSED TO MEASURE.
-This type of validity is concerned with whether a measure seems RELEVANT and APPROPRIATE for what it is assessing.

-Face validity essentially assesses the degree to which a procedure or test (e.g. a questionnaire) appears effective in terms of its stated aims.
E.g. Does a questionnaire looking at depression effectively measure depressive symptoms, so as to provide a trustworthy measure of depression?

38
Q

Who is best to carry out an assessment of this face validity?

A

It is best carried out by someone who is AN EXPERT IN THE FIELD who views the test / procedure and makes a judgment as to whether it seems appropriate.

39
Q

(A01) Outline ‘concurrent validity’ as a way of assessing validity?

A

Concurrent validity is a type of criterion validity (looking at how a test relates to other measures of the same concept).

Concurrent validity MEASURES HOW WELL A NEW TEST COMPARES TO A WELL-ESTABLISHED TEST –> It is demonstrated when a test correlates well with a measure that has previously been validated.

40
Q

Improving the EXTERNAL VALIDITY OF RESEARCH: How can ecological validity (e.g. lab-based research) generally be improved in research?

A

-Ensure your environment reflects and is representative of the real world so findings can be generalised (i.e. carry out in a more natural setting).

41
Q

Improving the EXTERNAL VALIDITY OF RESEARCH: How can mundane realism (task validity) generally be improved in research?

A

-Ensure your task reflects and is representative of the type of activities participants may perform in the real world or in their day-to-day lives.

42
Q

Improving the EXTERNAL VALIDITY OF RESEARCH: How can population validity generally be improved in research?

A

Ensure your sample reflects and is representative of the TARGET POPULATION –> Stratified sampling would best reflect this as the composition of the sample reflects the PROPORTIONS OF PEOPLE IN SUB-GROUPS within the target population.
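A quick illustrative sketch of working out proportional stratum sizes; the sub-group sizes and sample size below are invented for the example:

```python
# Stratified sampling sketch: each sub-group (stratum) of the target population
# is represented in the sample in proportion to its size.
target_population = {"16-25": 400, "26-40": 350, "41-65": 250}   # hypothetical strata
sample_size = 50

population_total = sum(target_population.values())
for group, group_size in target_population.items():
    n_in_sample = round(sample_size * group_size / population_total)
    print(f"{group}: {n_in_sample} participants")
# Note: rounding can leave the total slightly off the intended sample size,
# so the final figures may need a small manual adjustment.
```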

43
Q

Improving the EXTERNAL VALIDITY OF RESEARCH: How can temporal validity generally be improved in research?

A

Temporal validity can generally be improved in research through ensuring that research is repeated frequently over time with participants from the present day to check for consistency of findings.

44
Q

What are the THREE factors that affect internal validity?

A
  1. INVESTIGATORS
  2. PARTICIPANTS
  3. CONFOUNDING VARIABLES
45
Q

Explain some ways that investigator effects can be reduced?

A

Bias in the allocation of participants to groups –> Randomly allocate participants to conditions (this requires no investigator involvement).

Bias in the interpretation of behaviour –> Use a DOUBLE-BLIND PROCEDURE. This is where NEITHER the participant NOR the investigator knows the aim of the study –> improves internal validity as it means the investigator will not be able to unconsciously communicate the aim to participants (e.g. through tone, facial expressions).

The use of leading questions –> Use OPEN-ENDED QUESTIONS INSTEAD, which allow participants to provide their own responses without being influenced by the wording of the question.

Lack of control over procedure (e.g. amount of time groups get to spend on tasks) –> A standardised procedure should be followed to ensure participants have identical experiences (e.g. pre-recorded instructions).

46
Q

What is meant by ‘participant effects’?

A

Where a participant picks up CUES from the study and thus changes their BEHAVIOUR, reducing the internal validity of the study.

47
Q

What are the THREE types of participant effects?

A
  1. Hawthorne Effect
  2. Social Desirability
  3. Demand characteristics
48
Q

What type of observation would reduce the chance of the Hawthorne Effect occurring? Why?

A

COVERT OBSERVATION –> Ppts are not aware that they are being observed so their behaviour is likely to be representative of real life –> Hawthorne effect occurs when participants are aware that they are being observed and so change their behaviour.

49
Q

How would anonymity (e.g. on a questionnaire) reduce the likelihood of Social Desirability? Why?

A

Participants are not identifiable and the researcher cannot trace a response back to a particular individual –> Individuals may feel less need to portray themselves in a socially acceptable and favourable fashion if they are not identifiable to the researcher.

50
Q

How would mild deception / making aims less clear reduce the chance of number 3 occurring?

A

Mild deception prevents participants from guessing the aims of the study and changing their behaviour accordingly –> often deception is necessary to elicit natural, representative behaviour from participants.

51
Q

What experimental design (repeated, matched, independent) would reduce demand characteristics? Explain why…

A

Independent Groups –> Reduces the chance of demand characteristics as a participant only participates in one condition. –> Therefore, they will find it more difficult to identify the differences between conditions and, in turn, guess the aim and/or hypothesis.

52
Q

What is meant by the term ‘participant variables’?

A

Participant variables are the differing individual characteristics of participants in an experiment, sometimes referred to as INDIVIDUAL DIFFERENCES. These are a type of extraneous variable that can impact the internal validity of research.

53
Q

What experimental design (repeated, matched, independent) would reduce participant variables? Why?

A

REPEATED MEASURES DESIGN –> Because results from the SAME participants are compared, individual differences do not affect the results or any subsequent conclusions.

54
Q

How might random allocation reduce the impact of ppt variables?

A

Random allocation greatly decreases systematic error as each ppt has an equal chance of being allocated to each condition –> Individual differences are therefore far less likely to consistently affect results.
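A minimal sketch of random allocation (illustrative only), assuming 20 hypothetical participant IDs and two conditions:

```python
# Random allocation sketch: shuffling removes any systematic ordering, so each
# participant has an equal chance of ending up in either condition.
import random

participants = [f"P{i:02d}" for i in range(1, 21)]   # hypothetical participant IDs
random.shuffle(participants)
condition_a = participants[:10]
condition_b = participants[10:]
print("Condition A:", condition_a)
print("Condition B:", condition_b)
```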

55
Q

What is meant by a ‘confounding variable’?

A

A confounding variable is what an EXTRANEOUS VARIABLE becomes when it does have an impact on the DV (it varies systematically with the IV, so its effect cannot be separated from the effect of the IV).

Confounding variables include situational variables, participant variables and investigator effects –> Lab experiments greatly reduce the impact of situational variables.

56
Q

The researchers were interested in the effects of time of day on memory recall. They put all the young people in the morning condition and all the older people in the evening condition.

IV?

DV?

Potential confounding variable?

A

IV: (manipulated) The time of day that participants are tested on their memory recall.

DV: (measured) The accuracy of participants’ memory recall.

Potential confounding variable = AGE (it varies systematically with the conditions, as all the young people were tested in the morning and all the older people in the evening). Other extraneous variables could include the amount of sleep participants had the previous night and the temperature of the room they are tested in.

57
Q

IMPROVING THE VALIDITY OF SPECIFIC METHODS: How would these modifications improve the internal validity of the study?

UNSTRUCTURED INTERVIEW - 3 researchers were involved in the interviewing of patients about their experience of depression.

Ensuring no leading Qs are used. What would this mean in terms of the validity of the data?

Use of open Qs? Why? What are you not limiting?

Use of a pilot study - what would this address in terms of the Qs?

A

No leading questions –> High validity of the data –> Participants are not influenced by the researcher in how they choose to respond to the question.

Use of open questions –> Not limiting the participants to certain responses (as in closed questions) –> more qualitative data likely to be gathered as follow up questions can be asked of participants to gain further insight and clarification.

Use of a pilot study –> Would address any ambiguities/errors in the phrasing of the questions. These problems can then be rectified before the questions are used with the real participants.

58
Q

QUESTIONNAIRES - A researcher gave participants a questionnaire to investigate the effects of sleep on confidence ahead of exams. All questions regarding their sleeping patterns & confidence were closed questions, all on a scale of 1 - 10 where 1 indicated a low level of confidence and 10 indicated high.

How might the internal validity of questionnaires be improved?

A

Distractor questions –> May prevent participants from guessing the aim of the study and responding accordingly.

Anonymity —> Limits the risk of social desirability (where ppts present themselves in a favourable fashion and a socially acceptable way).

Reverse scoring –> Where certain questions in the questionnaire are phrased in the opposite direction to the construct being measured, so their scores are flipped before analysis –> This helps to counteract response biases such as acquiescence (the tendency to agree with every item) and social desirability.
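An illustrative sketch of reverse scoring on the 1-10 scale described in the stem; the items and responses are hypothetical:

```python
# Reverse scoring sketch: items phrased in the opposite direction are flipped
# (1 becomes 10, 10 becomes 1) on a 1-10 scale before totalling.
responses = {1: 8, 2: 3, 3: 9, 4: 2, 5: 7}   # hypothetical item_number: score
reverse_items = {2, 4}                        # hypothetical reverse-phrased items

adjusted = {item: (11 - score) if item in reverse_items else score
            for item, score in responses.items()}
print("Adjusted scores:", adjusted)
print("Total:", sum(adjusted.values()))
```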

59
Q

OBSERVATION - Bandura, Ross & Ross observed children’s aggression in a laboratory environment by recording what they did every 5 seconds.

How to improve validity…

A

-Internal validity –> Reduce the length of the observation intervals from every five seconds to every 1-2 seconds –> allows for more detailed and precise data to be recorded and prevents key behaviours being missed.

-External validity –> Conduct in a more natural environment to the children (e.g. a playground, in school) to prevent their behaviour being influenced by the controlled, unnatural environment that they are in.

60
Q

LAB RESEARCH - Milgram asked participants to give electric shocks to a ‘learner’ (confederate) via a shock generator, to test how obedient they were.

How to improve validity…

A

Improve task validity –> Ask participants to perform a more realistic task that features in their everyday lives.

Natural setting –> To ensure the participant’s obedience isn’t being influenced by the well-controlled, lab environment that they are in –> Researcher can be confident that behaviour is representative of ppts in their everyday lives.