Reliability and Validity Flashcards
(34 cards)
What is a psychometric test?
A standardised test that uses psychological measurements to quantify a person’s ability, strengths or characteristics
What does a psychometric test consist of?
One or more stimuli that people respond to. Responses can be overt (e.g., key press) or covert (e.g., skin conductance response)
How are psychometric tests standardised?
Data is collected from a large number of people to establish norms, including cut-offs. This allows identification of individuals inside or outside the population norms.
What are two key properties of a good psychometric test?
Reliability (internal properties of the scale) and validity (external properties of the scale)
What is reliability?
Consistency/stability of a measure across:
- Time
- Setting
- Individuals
‘Quality’ of measurement (the extent to which you can measure a person’s ‘true’ score each time)
What is True Score Theory?
A person’s ‘true’ score reflects their genuine ability, characteristics, or potential. Every observed score is made up of the true score plus some measurement error.
What are the two sources of variability (error) in observed scores?
- Variability in ‘true’ scores within a population (individual differences)
- Fluctuations in measurement error (systematic and random)
What are the two types of measurement error that affect reliability?
Systematic error
Random error
How does systematic error affect reliability?
Consistently affects measurement (bias):
- Fire alarm during an exam
- Race/gender bias in the test items
- Driving-test equipment with a faulty controller
Systematic errors alter the mean of the data: a shift
(everyone shifts in the same direction)
How does random error affect reliability?
- Random variations are not consistent across the sample (noise/chance)
- Test-takers might be hungry, tired, nervous, etc., but not everyone taking the test will be in the same state
- Random variations increase the variability of the data
- (adds + and − deviations to the true score)
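To see the difference concretely, here is a small simulation (a hypothetical sketch, not from the cards) using Python's standard library: a constant bias shifts every observed score's mean, while per-person random noise leaves the mean roughly in place but increases the spread.

```python
import random
import statistics

random.seed(0)

true_scores = [50.0] * 1000  # everyone's 'true' score is 50

# Systematic error: a constant bias added to every observed score.
systematic = [t + 5 for t in true_scores]

# Random error: per-person noise that averages out to zero.
noisy = [t + random.gauss(0, 5) for t in true_scores]

# Systematic error shifts the mean but adds no spread.
print(statistics.mean(systematic))    # 55.0
print(statistics.pstdev(systematic))  # 0.0

# Random error leaves the mean near 50 but increases variability.
print(round(statistics.mean(noisy), 1))
print(round(statistics.pstdev(noisy), 1))
```
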
What is reliability determined by?
How much error variance is in the measurement
How do we calculate reliability?
Reliability = variance of true scores / variance of observed scores
*A greater denominator (more error variance in the observed scores) = less reliable
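A quick numerical sketch of this ratio (with made-up values: true scores with SD 15, measurement error with SD 5) shows how error variance inflates the denominator and pulls reliability below 1:

```python
import random
import statistics

random.seed(1)

# Hypothetical population: true scores plus independent random error.
true_scores = [random.gauss(100, 15) for _ in range(5000)]
errors = [random.gauss(0, 5) for _ in range(5000)]
observed = [t + e for t, e in zip(true_scores, errors)]

var_true = statistics.pvariance(true_scores)
var_obs = statistics.pvariance(observed)

# Reliability = true-score variance / observed-score variance.
# In theory this is 15**2 / (15**2 + 5**2) = 0.9 here.
reliability = var_true / var_obs
print(round(reliability, 2))
```
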
What are the types of reliability?
Test-retest
Alternate/parallel forms
Internal consistency (including coefficient alpha, split-half/inter-item reliability, and item-total reliability)
Inter-rater agreement
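The list above mentions coefficient (Cronbach's) alpha. As an illustration with made-up response data, alpha can be computed from the item variances and the variance of the total scores:

```python
from statistics import pvariance

# Hypothetical responses: rows = participants, columns = scale items.
data = [
    [3, 4, 3, 5],
    [1, 2, 2, 2],
    [4, 5, 5, 4],
    [2, 3, 2, 2],
    [5, 4, 5, 4],
]
k = len(data[0])  # number of items

# Variance of each item (column) and of the participants' totals.
item_vars = [pvariance(col) for col in zip(*data)]
total_var = pvariance([sum(row) for row in data])

# Cronbach's alpha: higher when items covary strongly
# relative to their individual variances.
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))
```
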
What are the types of validity?
Construct validity (including content and face validity)
Criterion validity (including predictive, known groups, convergent and discriminant validity)
What is predictive validity?
The extent to which data from a measure can predict something it should theoretically be able to predict. For example, does a high extraversion score predict the number of friends someone has?
What is convergent validity?
The degree to which multiple measures of the same construct show similar results. For example, do self-report and clinician-rated measures of depression show similar scores?
What is discriminant validity?
The extent to which a measure does NOT relate to constructs it should NOT relate to. For example, a measure of verbal ability should not be strongly correlated with measures of athletic ability.
What are the two types of errors in measurement?
Systematic error (bias)
Random error (noise/chance)
What is systematic error?
Error that consistently affects measurement. Examples include a fire alarm during an exam or race/gender bias in test items. Systematic errors shift the mean of the data in the same direction for everyone.
What is random error?
Random variations in scores that don’t have a consistent effect across the sample. Examples include test-takers being hungry, tired or nervous. Random error increases the variability of the data.
What is test-retest reliability?
A type of reliability assessed by administering the same test to the same people on multiple occasions and calculating the correlation between scores. A higher correlation indicates higher reliability.
What considerations are important for test-retest reliability?
The time interval between tests is important. Longer intervals allow more room for natural change, while shorter intervals can lead to carryover effects. Test-retest reliability is only useful for stable characteristics.
What is alternate/parallel forms of reliability?
A type of reliability assessed by developing multiple versions of a test, administering each version to the same participants at multiple times, and correlating the results. A higher correlation indicates better reliability.
What is split-half reliability?
A type of reliability assessed by administering a test to participants, splitting the test in half, and computing the correlation between the two halves.