Reliability and validity Flashcards
(21 cards)
Test-retest reliability
Consistency across time
Internal consistency
Consistency across items
Inter-rater reliability
Consistency across researchers
What is used to assess test-retest reliability
ICC
What is used to measure internal consistency
Cronbach’s alpha or McDonald’s omega
McDonald’s omega is the preferred option
What are the advantages of McDonald’s omega over Cronbach’s alpha
McDonald’s omega does not require tau-equivalence, so it stays accurate when items contribute unequally to the total score, whereas alpha is a lower-bound (pessimistic) estimate that shrinks when that assumption is violated
Validity
The extent to which the scores actually represent the variable they are intended to
Which type of ICC do we use when writing up the output
The ICC2
ICC
Intraclass correlations - look at the absolute agreement between measurements (e.g. scores from two occasions or from two raters)
Close to 1 - good - most variation comes from differences between individuals
Close to 0 - bad - most variation comes from measurement error or differences within individuals
Values for ICC
< 0.5 = poor reliability
0.5 to 0.75 = moderate reliability
0.75 to 0.9 = good reliability
> 0.9 = excellent reliability
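Since the cards above recommend ICC2 for write-ups, here is a minimal Python sketch of ICC(2,1) (two-way random effects, absolute agreement, single measurement) computed from the usual ANOVA mean squares. The data, function name, and participant numbers are hypothetical, not from the cards.

```python
# Minimal ICC(2,1) sketch: two-way random effects, absolute agreement,
# single measurement. Rows = participants, columns = occasions or raters.
import numpy as np

def icc2_1(X: np.ndarray) -> float:
    n, k = X.shape
    grand = X.mean()
    row_means = X.mean(axis=1)   # per-participant means
    col_means = X.mean(axis=0)   # per-occasion means

    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_total = np.sum((X - grand) ** 2)

    msr = ss_rows / (n - 1)                                      # between participants
    msc = ss_cols / (k - 1)                                      # between occasions
    mse = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))   # residual

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical test-retest data: 5 participants measured on 2 occasions
scores = np.array([[10, 11], [14, 15], [9, 9], [20, 18], [13, 14]])
print(f"ICC(2,1) = {icc2_1(scores):.2f}")  # ~0.95: excellent reliability
```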
Internal consistency
The consistency of people’s responses across the items on a multiple-item measure
In general, all items on a questionnaire should reflect the same underlying construct. Therefore, people’s scores on a set of items should be correlated with each other
Split-half method
Can be used to assess internal consistency
This method involves splitting the items on a questionnaire into two halves with each half measuring the same elements but in slightly different ways.
Then a score is computed for each set of items and the relationship between the two sets of scores is examined
If a scale is very reliable, a person’s score on half of the scale should be the same or similar to their score on the other half.
A split-half correlation of +.80 is generally considered to indicate good internal consistency
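A minimal Python sketch of the split-half method, assuming a hypothetical participants-by-items matrix and an odd/even split. The Spearman-Brown step is a standard companion to split-half correlations (it adjusts the half-length correlation up to an estimate for the full-length scale), though the cards above report the raw correlation.

```python
# Split-half reliability sketch: odd- vs even-numbered items,
# with the Spearman-Brown correction for full scale length.
import numpy as np

def split_half_reliability(items: np.ndarray) -> float:
    odd = items[:, 0::2].sum(axis=1)    # items 1, 3, 5, ...
    even = items[:, 1::2].sum(axis=1)   # items 2, 4, 6, ...
    r = np.corrcoef(odd, even)[0, 1]    # correlation between half-scores
    return 2 * r / (1 + r)              # Spearman-Brown correction

# Hypothetical data: 100 participants, 6 items sharing one underlying trait
rng = np.random.default_rng(0)
items = rng.normal(size=(100, 1)) + rng.normal(scale=0.8, size=(100, 6))
print(f"Split-half reliability = {split_half_reliability(items):.2f}")
```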
What is the problem with the split-half method
There are several ways in which the items can be split into two halves, so the result may be a product of the particular split chosen
e.g. someone compares the first half of the questionnaire with the second half and finds poor internal consistency, so they try an odd- versus even-numbered split instead, find it is good, and publish that result - everyone then uses that questionnaire. This analytic flexibility can be a problem
What is the most common measure of internal consistency
Cronbach’s alpha (α)
Cronbach’s alpha - the extent to which different items on the same test correlate with each other
Alpha coefficients
Range from 0 to 1
The higher the score, the more reliable the scale is
A value of +.70 is generally taken to indicate good internal consistency
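A minimal Python sketch of Cronbach’s alpha using the standard variance-based formula, alpha = k/(k−1) × (1 − sum of item variances / variance of total scores). The data and function name are hypothetical.

```python
# Cronbach's alpha sketch: rows = participants, columns = items.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 100 participants, 5 items sharing one underlying trait
rng = np.random.default_rng(1)
items = rng.normal(size=(100, 1)) + rng.normal(scale=0.8, size=(100, 5))
print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")
```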
Problems with Cronbach’s alpha
It’s a lower bound estimate - meaning it gives the lowest estimate of reliability (it’s pessimistic)
Assumption of tau-equivalence
More questions = higher alpha
Tau-equivalence
The assumption that all items have the same factor or component loadings - this is unlikely in practice, and violations can reduce alpha estimates by up to 11%
McDonald’s omega (ω)
Works in very much the same way as Cronbach’s alpha but does not require tau equivalence, so it works even when items vary in their contribution to the total score
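The tau-equivalence point can be made concrete with a small sketch: for a hypothetical single-factor model with unequal loadings, alpha computed from the model-implied covariance matrix falls below omega. The loading values here are made up for illustration, and omega uses the standard single-factor formula (sum of loadings squared over total variance).

```python
# Alpha vs omega under unequal factor loadings (tau-equivalence violated).
import numpy as np

loadings = np.array([0.9, 0.7, 0.5, 0.3])    # unequal loadings (hypothetical)
uniqueness = 1 - loadings ** 2               # standardized unique variances
sigma = np.outer(loadings, loadings) + np.diag(uniqueness)  # implied covariances

k = len(loadings)
alpha = (k / (k - 1)) * (1 - np.trace(sigma) / sigma.sum())
omega = loadings.sum() ** 2 / sigma.sum()    # (sum of loadings)^2 / total variance
print(f"omega = {omega:.3f}, alpha = {alpha:.3f}")  # alpha comes out lower
```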
Omega hierarchical
Assesses the extent to which variance on a measure is due to a general factor (g)
For example, an intelligence measure may have discrete factors (spatial intelligence, emotional intelligence etc) but should also tap into a general factor (intelligence)
Omega total
Assesses reliability for all factors (general and other factors)
How is inter-rater reliability assessed
Using Cronbach’s alpha and ICCs when the judgements are quantitative, or Cohen’s kappa (κ) when the judgements are categorical
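For the categorical case, a minimal sketch of Cohen’s kappa using scikit-learn’s cohen_kappa_score; the two raters and their judgements are hypothetical.

```python
# Cohen's kappa sketch: agreement between two raters on categorical labels.
from sklearn.metrics import cohen_kappa_score

rater_a = ["anxious", "calm", "anxious", "calm", "anxious", "calm"]
rater_b = ["anxious", "calm", "calm",    "calm", "anxious", "calm"]
print(f"Cohen's kappa = {cohen_kappa_score(rater_a, rater_b):.2f}")  # ~0.67
```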