Chapter 4: Reliability Flashcards
Reliability
Consistency or stability of test scores
Factors that impact reliability
When the test is administered
Items selected to be included
External distractions (e.g., noise)
Internal distractions (e.g., fatigue)
Person administering the test
Person scoring the test
Two components of score
True score (representative of true knowledge or ability)
Error score
Systematic error
Error that affects scores in a consistent way (e.g., examinees receiving a different set of instructions for the test)
Classical test theory equation
Xi = T + E
Xi- obtained score
T- true score
E- error
What measurement error reduces
Usefulness of measurement
Generalizability of test results
Confidence in test results
Content sampling error
Difference between sample of items on test and total domain of items
How good sampling affects error
Reduces it
Largest source of measurement error
Content sampling error
Time sampling error
Random fluctuations in performance over time
Can be due to examinee (fatigue, illness, anxiety, maturation) or due to environment (distractions, temperature)
Inter-rater differences
When scoring is subjective, different scorers may score answers differently
Clerical errors
Adding up points incorrectly
Reliability (mathematic definition)
Symbol: rxx
Ratio of true score variance to total score variance (number from 0 to 1, where 0 is total error and 1 is no error)
Reliability equation
rxx = (sigma^2 of T) / (sigma^2 of X), i.e., true score variance divided by total score variance
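The equation above can be sketched with a small simulation. This is an illustration only: the true score and error distributions below are invented, with variances chosen so that the expected reliability is 0.9.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate classical test theory: obtained score X = T + E,
# where true scores T and errors E are independent.
n = 100_000
T = rng.normal(loc=50, scale=9, size=n)   # true scores, variance ~81
E = rng.normal(loc=0, scale=3, size=n)    # random error, variance ~9
X = T + E                                 # obtained scores

# Reliability = true score variance / total score variance
rxx = T.var() / X.var()
print(round(rxx, 2))  # close to 81 / (81 + 9) = 0.9
```

Because error adds variance to the obtained scores without adding true score variance, the ratio drops below 1 exactly to the extent that error is present.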
Reliability’s relation to error
Greater the reliability, the less the error
What reliability coefficients mean
rxx of 0.9: 90% of score variance is due to true score variance
Test-retest reliability
Administer the same test on 2 occasions
Correlate the scores from both administrations
Sensitive to time sampling error
Things to consider surrounding test-retest reliability
Length of interval between testing
Activities during interval (distraction or not)
Carry-over effects from one test to next
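The test-retest procedure above (administer twice, then correlate) can be sketched as follows; the scores are invented for illustration.

```python
import numpy as np

# Hypothetical scores for 8 examinees on two administrations
# of the same test (all numbers are made up for illustration)
time1 = np.array([85, 78, 92, 66, 74, 88, 70, 95])
time2 = np.array([83, 80, 90, 68, 71, 90, 72, 93])

# Test-retest reliability = Pearson correlation between administrations
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 3))
```

A long interval between administrations lowers this correlation through time sampling error; a very short interval inflates it through carry-over effects.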
Alternate-form reliability
Develop two parallel forms of test
Administer both forms (simultaneously or delayed)
Correlate the scores of the different forms
Sensitive to content sampling error (simultaneous and delayed) and time sampling error (delayed only)
Things to consider surrounding alternate-form reliability
Few tests have alternate forms
Reduction of carry-over effects
Split-half reliability
Administer the test
Divide it into 2 equivalent halves
Correlate the scores for the half tests
Sensitive to content sampling error
Things to consider surrounding split-half reliability
Only 1 administration (no time sampling error)
How to split test up
Short tests have worse reliability
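The split-half procedure can be sketched as follows. The item matrix is invented for illustration, and the split shown is the common odd-even split. The Spearman-Brown correction is included because each half is only half the length of the real test, which connects to the point above that shorter tests have worse reliability.

```python
import numpy as np

# Hypothetical item scores (rows = 6 examinees, columns = 10 items);
# all values are invented for illustration
items = np.array([
    [1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 1, 1, 0, 1],
])

# Odd-even split: one common way to form two equivalent halves
odd_half = items[:, 0::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)

# Correlate the scores for the half tests
r_half = np.corrcoef(odd_half, even_half)[0, 1]

# Spearman-Brown correction estimates full-length reliability,
# since each half is only half as long as the whole test
r_full = 2 * r_half / (1 + r_half)
print(round(r_half, 3), round(r_full, 3))
```

The corrected value is always at least as large as the half-test correlation, reflecting the gain in reliability from the full test length.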
Kuder-Richardson and coefficient (Cronbach’s) alpha
Administer test
Compare each item to all other items
Use KR-20 for dichotomous answers and Cronbach’s alpha for any type of variable
Sensitive to content sampling error and item heterogeneity
Measures internal consistency
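Coefficient alpha can be computed directly from an item-score matrix using its standard formula: alpha = (k / (k - 1)) * (1 - sum of item variances / total score variance). The item matrix below is invented for illustration; with dichotomous (0/1) items like these, alpha computed this way equals KR-20.

```python
import numpy as np

# Hypothetical item scores (rows = 6 examinees, columns = 10 items);
# all values are invented for illustration
items = np.array([
    [1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 1, 1, 0, 1],
])

k = items.shape[1]                              # number of items
item_var_sum = items.var(axis=0, ddof=1).sum()  # sum of item variances
total_var = items.sum(axis=1).var(ddof=1)       # variance of total scores

# Cronbach's alpha: internal consistency from a single administration
alpha = (k / (k - 1)) * (1 - item_var_sum / total_var)
print(round(alpha, 3))
```

Because it compares each item's variance against the total, alpha drops when items are heterogeneous, which matches its sensitivity to item heterogeneity noted above.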
Inter-rater reliability
Administer test
2 individuals score test
Calculate agreement between scores
Sensitive to differences between raters
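Agreement between two scorers can be quantified in more than one way; a minimal sketch with invented ratings shows two simple options, exact agreement and correlation.

```python
import numpy as np

# Hypothetical scores assigned by two raters to the same 8 essays
# (all numbers are invented for illustration)
rater_a = np.array([4, 3, 5, 2, 4, 5, 3, 1])
rater_b = np.array([4, 2, 5, 2, 3, 5, 3, 2])

# Exact agreement: proportion of essays scored identically
agreement = (rater_a == rater_b).mean()

# Correlational index of inter-rater reliability
r = np.corrcoef(rater_a, rater_b)[0, 1]
print(agreement, round(r, 3))
```

The two indices answer different questions: exact agreement is strict about identical scores, while the correlation only asks whether the raters rank examinees similarly.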