Reliability and Validity Flashcards
(34 cards)
What is a psychometric test?
A standardised test that uses psychological measurements to quantify a person’s ability, strengths or characteristics
What does a psychometric test consist of?
One or more stimuli that people respond to. Responses can be overt (e.g., key press) or covert (e.g., skin conductance response)
How are psychometric tests standardised?
Data is collected from a large number of people to establish norms, including cut-offs. This allows identification of individuals inside or outside the population norms.
What are two key properties of a good psychometric test?
Reliability (internal properties of the scale) and validity (external properties of the scale)
What is reliability?
Consistency/stability of a measure across:
- Time
- Setting
- Individuals
‘Quality’ of measurement (the extent to which you can measure a person’s ‘true’ score each time)
What is True Score Theory?
A person’s ‘true’ score reflects their genuine ability, characteristics, or potential. Every observed score is made up of the true score plus some measurement error.
What are the two sources of variability (error) in observed scores?
- Variability in ‘true’ scores within a population (individual differences)
- Fluctuations in measurement error (systematic and random)
What are the two types of measurement error that affect reliability?
Systematic error
Random error
How does systematic error affect reliability?
Consistently affects measurement (bias):
- Fire alarm during an exam
- Race/gender bias in the test items
- Driving-test equipment with a faulty controller
Systematic errors alter the mean of the data: a shift
(everyone shifts in the same direction)
How does random error affect reliability?
- Random variations are not consistent across the sample (noise/chance)
- Test-takers might be hungry, tired, nervous, etc., but not everyone taking the test will be in the same state
- Random variations increase the variability of the data
- (adds + and − deviations to the true score)
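To see the difference concretely, here is a small simulation (a hypothetical sketch, not from the cards) using Python's standard library: a constant bias shifts every observed score's mean, while per-person random noise leaves the mean roughly in place but increases the spread.

```python
import random
import statistics

random.seed(0)

true_scores = [50.0] * 1000  # everyone's 'true' score is 50

# Systematic error: a constant bias added to every observed score.
systematic = [t + 5 for t in true_scores]

# Random error: per-person noise that averages out to zero.
noisy = [t + random.gauss(0, 5) for t in true_scores]

# Systematic error shifts the mean but adds no spread.
print(statistics.mean(systematic))    # 55.0
print(statistics.pstdev(systematic))  # 0.0

# Random error leaves the mean near 50 but increases variability.
print(round(statistics.mean(noisy), 1))
print(round(statistics.pstdev(noisy), 1))
```
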
What is reliability determined by?
How much error variance is in the measurement
How do we calculate reliability?
Reliability = variance of true scores / variance of observed scores
*A greater denominator (more error variance in the observed scores) = less reliable
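A quick numerical sketch of this ratio (with made-up values: true scores with SD 15, measurement error with SD 5) shows how error variance inflates the denominator and pulls reliability below 1:

```python
import random
import statistics

random.seed(1)

# Hypothetical population: true scores plus independent random error.
true_scores = [random.gauss(100, 15) for _ in range(5000)]
errors = [random.gauss(0, 5) for _ in range(5000)]
observed = [t + e for t, e in zip(true_scores, errors)]

var_true = statistics.pvariance(true_scores)
var_obs = statistics.pvariance(observed)

# Reliability = true-score variance / observed-score variance.
# In theory this is 15**2 / (15**2 + 5**2) = 0.9 here.
reliability = var_true / var_obs
print(round(reliability, 2))
```
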
What are the types of reliability?
Test-retest
Alternate/parallel forms
Internal consistency (including coefficient alpha, split-half/inter-item reliability, and item-total reliability)
Inter-rater agreement
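The list above mentions coefficient (Cronbach's) alpha. As an illustration with made-up response data, alpha can be computed from the item variances and the variance of the total scores:

```python
from statistics import pvariance

# Hypothetical responses: rows = participants, columns = scale items.
data = [
    [3, 4, 3, 5],
    [1, 2, 2, 2],
    [4, 5, 5, 4],
    [2, 3, 2, 2],
    [5, 4, 5, 4],
]
k = len(data[0])  # number of items

# Variance of each item (column) and of the participants' totals.
item_vars = [pvariance(col) for col in zip(*data)]
total_var = pvariance([sum(row) for row in data])

# Cronbach's alpha: higher when items covary strongly
# relative to their individual variances.
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))
```
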
What are the types of validity?
Construct validity (including content and face validity)
Criterion validity (including predictive, known groups, convergent and discriminant validity)
What is predictive validity?
The extent to which data from a measure can predict something it should theoretically be able to predict. For example, does a high extraversion score predict the number of friends someone has?
What is convergent validity?
The degree to which multiple measures of the same construct show similar results. For example, do self-report and clinician-rated measures of depression show similar scores?
What is discriminant validity?
The extent to which a measure does NOT relate to constructs it should NOT relate to. For example, a measure of verbal ability should not be strongly correlated with measures of athletic ability.
What are the two types of errors in measurement?
Systematic error (bias)
Random error (noise/chance)
What is systematic error?
Error that consistently affects measurement. Examples include a fire alarm during an exam or race/gender bias in test items. Systematic errors shift the mean of the data in the same direction for everyone.
What is random error?
Random variations in scores that don’t have a consistent effect across the sample. Examples include test-takers being hungry, tired or nervous. Random error increases the variability of the data.
What is test-retest reliability?
A type of reliability assessed by administering the same test to the same people on multiple occasions and calculating the correlation between scores. A higher correlation indicates higher reliability.
What considerations are important for test-retest reliability?
The time interval between tests is important. Longer intervals allow more room for natural change, while shorter intervals can lead to carryover effects. Test-retest reliability is only useful for stable characteristics.
What is alternate/parallel forms of reliability?
A type of reliability assessed by developing multiple versions of a test, administering each version to the same participants at multiple times, and correlating the results. A higher correlation indicates better reliability.
What is split-half reliability?
A type of reliability assessed by administering a test to participants, splitting the test in half, and computing the correlation between the two halves.