Lec2 - Ch5 Classical test theory models Flashcards
Classical Test Theory Models and Conceptual basis (49 cards)
Reliability
- what is it, in regards to tests and scores?
- how much noise is there in a psychological test?
- it is a property of test scores, not of the test itself
> a test might have different psychometric properties for different kinds of respondents (e.g. it could be reliable for one age group but not for another)
> therefore, each set of scores has its own level of reliability
what is the COTAN?
- committee that evaluates psychological tests in the Netherlands
- it is part of the NIP (Nederlands Instituut van Psychologen, the Dutch Association of Psychologists)
how does the COTAN differentiate tests?
- test used for high-impact inferences at individual level
> very important; big consequences if a mistake is made
> e.g. personnel selection, diagnosis of learning disabilities, …
- test used for less impactful inferences at individual level
> descriptive use, fewer consequences
> e.g. study/therapy progress, career choice test, …
- test used at group level
> e.g. customer team satisfaction, student evaluation, comparing groups
high-impact inferences tests
- reliability rules
- good: 0.9 or larger
- sufficient: between 0.8 and 0.9
- insufficient: smaller than 0.8
less impact inferences tests
- reliability rules
- good: 0.8 or larger
- sufficient: between 0.7 and 0.8
- insufficient: smaller than 0.7
group level tests
- reliability rules
- good: 0.7 or larger
- sufficient: between 0.6 and 0.7
- insufficient: smaller than 0.6
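The three sets of cut-offs above can be read as one lookup table. A minimal Python sketch, assuming the lower bound of each band is inclusive (e.g. exactly 0.8 counts as "sufficient" for a high-impact test); the names `COTAN_THRESHOLDS` and `rate_reliability` are made up for illustration and are not part of any COTAN tool:

```python
# (good, sufficient) cut-offs per kind of test use, copied from the cards above.
COTAN_THRESHOLDS = {
    "high_impact": (0.9, 0.8),   # high-impact inferences, individual level
    "less_impact": (0.8, 0.7),   # less impactful inferences, individual level
    "group": (0.7, 0.6),         # group-level use
}

def rate_reliability(reliability, test_use):
    """Return 'good', 'sufficient' or 'insufficient' for a reliability value."""
    good, sufficient = COTAN_THRESHOLDS[test_use]
    if reliability >= good:
        return "good"
    if reliability >= sufficient:
        return "sufficient"
    return "insufficient"

# The same reliability is judged differently depending on how the test is used.
for use in COTAN_THRESHOLDS:
    print(use, rate_reliability(0.75, use))
```

The stricter the consequences for an individual, the higher the reliability has to be before it is rated "good".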
what is the aim of behavioural science?
- it strives to quantify the degree to which differences in one variable are associated with differences in other variables
- these differences have to be measured accurately, hence reliability
What are the assumptions that testing is based on?
- behavioural differences among people exist
- differences have important implications
- they can be measured with precision
what is Classical test theory?
- it is a measurement theory
- it explains reliability and it shows how to measure it
What is the central idea of classical test theory?
Every test taker has a true score on a test
what is reliability according to the classical test theory?
- Extent to which differences in respondents’ observed scores are consistent with differences in their true scores
- it derives from observed scores, true scores and measurement error
What are the two main assumptions of Classical Test Theory?
- observed scores are true scores plus measurement error
- measurement error is random (affects everybody but it is not systematic)
> likely to increase or decrease any particular score at random
- Xo = Xt + Xe (observed score = true score + error score)
What are the implications of the assumptions of classical test theory? (consequences)
- mean of the measurement error is equal to zero
> because a non-zero mean would make the error systematic (random error cancels itself out across respondents)
- the correlation between true score and error is equal to zero
> because the error is random: its mean is zero at every level of true score
- observed score variance = true score variance + error variance
> because true scores and errors are uncorrelated, their variances simply add up
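The three implications can be checked on simulated data. A rough Python sketch, assuming normally distributed true scores and errors (the means, standard deviations, sample size and seed are arbitrary choices, not values from the lecture):

```python
import random

def mean(values):
    return sum(values) / len(values)

def variance(values):
    # Population variance: mean squared deviation from the mean.
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def correlation(xs, ys):
    # Pearson correlation: covariance divided by the product of the SDs.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (variance(xs) ** 0.5 * variance(ys) ** 0.5)

random.seed(1)  # fixed seed so the run is repeatable
n = 20000
true_scores = [random.gauss(100, 15) for _ in range(n)]   # Xt
errors = [random.gauss(0, 5) for _ in range(n)]           # Xe, pure random noise
observed = [t + e for t, e in zip(true_scores, errors)]   # Xo = Xt + Xe

mean_error = mean(errors)                                 # close to 0
r_true_error = correlation(true_scores, errors)           # close to 0
var_observed = variance(observed)
var_sum = variance(true_scores) + variance(errors)        # close to var_observed

print(mean_error, r_true_error, var_observed, var_sum)
```

In a finite sample the three results hold only approximately; they become exact in the long run.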
Observed scores
- value obtained from measuring a characteristic in a sample of individuals
- true score + measurement error
> (it can be seen as a composite score)
True score
- Score that you would get using a perfect measurement instrument
- “real amount” of the characteristic you are measuring
- average score that a participant would obtain if they completed the test an infinite number of times
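The "infinite number of times" idea can be imitated with a large number of simulated administrations. A small Python sketch, assuming one respondent with a made-up true score of 50 and normally distributed random error (`TRUE_SCORE`, `ERROR_SD` and both function names are hypothetical):

```python
import random

random.seed(2)       # fixed seed so the run is repeatable
TRUE_SCORE = 50.0    # hypothetical true score of one respondent
ERROR_SD = 4.0       # hypothetical spread of the random measurement error

def administer_test():
    # One observed score: true score plus a random error with mean zero.
    return TRUE_SCORE + random.gauss(0, ERROR_SD)

def average_of_administrations(k):
    # Mean observed score over k independent administrations.
    return sum(administer_test() for _ in range(k)) / k

avg_small = average_of_administrations(5)        # can still be well off 50
avg_large = average_of_administrations(100000)   # very close to 50

print(avg_small, avg_large)
```

The more administrations are averaged, the more the random errors cancel out and the closer the average gets to the true score.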
Measurement Error
- Influences that create random noise in the observed score
- it creates inconsistencies between true and observed scores
> e.g. distraction, an imprecise measuring instrument, …
- it is impossible to know all the sources of measurement error and noise
- we must determine to what extent differences in scores are attributable to real differences in the trait or to random external influences
what is the mean measurement error in a test?
- always 0
- it is independent of the individual’s true scores
- inflates or deflates respondents’ scores randomly, therefore it cancels itself out
- error scores are uncorrelated with true scores (r=0)
- see picture 1 for effects of measurement error
Variance of error scores
- how to calculate it
- what it represents
- see picture 2
- it represents the degree to which error affected different people in different ways
> high degree of error variance indicates the potential for poor measurement
how do you calculate the variance of observed scores?
- see picture 3
- variance of observed scores = variance of true scores + variance of error scores
> variability in observed scores will be larger than variability in true scores
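Since the pictures are not reproduced here, the following is only a sketch of the calculation, assuming the population formula for variance (sum of squared deviations divided by N). The four respondents and their scores are invented so that the error has mean zero and is uncorrelated with the true scores:

```python
def variance(values):
    """Population variance: mean squared deviation from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Invented toy data for four respondents.
true_scores = [10.0, 10.0, 20.0, 20.0]
errors = [-1.0, 1.0, -1.0, 1.0]          # mean 0, unrelated to the true scores
observed = [t + e for t, e in zip(true_scores, errors)]  # 9, 11, 19, 21

var_true = variance(true_scores)      # 25.0
var_error = variance(errors)          # 1.0
var_observed = variance(observed)     # 26.0 = 25.0 + 1.0

print(var_true, var_error, var_observed)
```

The observed variance (26) is larger than the true score variance (25) by exactly the error variance (1).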
What are signal and noise?
- Signal: true score variance
- Noise: measurement error variance
What are the ways to think about reliability?
IMPORTANT!
- see picture 6
- Proportion of variance
> ratio of true score variance to observed score variance
> lack of error variance
- Shared variance
> squared correlation between observed scores and true scores
> lack of correlation between observed scores and error scores
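The four views lead to the same number. A Python sketch on simulated data, assuming true score variance of about 9 and error variance of about 4 (arbitrary values), so that the expected reliability is 9/13, roughly 0.69. "Lack of correlation" is computed here as 1 minus the squared correlation between observed scores and error scores:

```python
import random

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (variance(xs) ** 0.5 * variance(ys) ** 0.5)

random.seed(3)  # fixed seed so the run is repeatable
n = 20000
true_scores = [random.gauss(0, 3) for _ in range(n)]     # signal, variance about 9
errors = [random.gauss(0, 2) for _ in range(n)]          # noise, variance about 4
observed = [t + e for t, e in zip(true_scores, errors)]

var_t, var_e, var_o = variance(true_scores), variance(errors), variance(observed)

rel_ratio = var_t / var_o                                   # proportion of variance
rel_lack_error = 1 - var_e / var_o                          # lack of error variance
rel_shared = correlation(observed, true_scores) ** 2        # shared variance
rel_lack_corr = 1 - correlation(observed, errors) ** 2      # lack of correlation with error

print(rel_ratio, rel_lack_error, rel_shared, rel_lack_corr)
```

In a sample the four values differ slightly, because the sample correlation between true scores and errors is never exactly zero. In practice true scores are unknown, so these quantities can only be computed directly in a simulation like this one.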
Proportion of Variance
1 - Ratio of true score variance to observed score variance
- see picture 4
- true score variance is the signal that we want to detect
- error variance is the noise obscuring the signal
- reliability = signal / (signal+noise)
> signal+noise = observed score variance
What does it mean to obtain a reliability of .48?
- 48% of the differences among people’s observed scores can be attributed to differences among their true levels
- reliability ranges from 0 to 1; a reliability of 0 would mean that the true score variance is also 0 (all respondents have the same true score), which is impossible in a real-world situation
Proportion of variance
2 - Reliability as lack of measurement error
- see pictures 5 and 7
- reliability: degree to which error variance is small in comparison with the variance of observed scores
- reliability = 1 - ( noise / (noise+signal) )
- reliability is high when the error variance is small relative to the observed score variance
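The "signal" formula and the "lack of error" formula are two ways of writing the same quantity. A short Python sketch with invented variances (signal 12, noise 13, chosen so that the result matches the .48 example); the function names are hypothetical:

```python
def reliability_signal_ratio(signal, noise):
    # reliability = signal / (signal + noise)
    return signal / (signal + noise)

def reliability_lack_of_error(signal, noise):
    # reliability = 1 - noise / (noise + signal)
    return 1 - noise / (noise + signal)

signal, noise = 12.0, 13.0   # invented true score variance and error variance
rel_a = reliability_signal_ratio(signal, noise)
rel_b = reliability_lack_of_error(signal, noise)
print(round(rel_a, 2), round(rel_b, 2))   # both 0.48

# Same observed variance (25), but much less noise: reliability goes up.
rel_low_noise = reliability_lack_of_error(24.0, 1.0)
print(round(rel_low_noise, 2))            # 0.96
```

Both functions give the same answer because signal + noise is the observed score variance, so the two proportions always add up to 1.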