Test Reliability Flashcards
(43 cards)
is an index of reliability, a proportion that indicates the ratio between the
true score variance on a test and the total variance
Reliability coefficient
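The ratio in the card above can be shown as a minimal sketch (not from the source; the variance figures are made up for illustration):

```python
# Minimal sketch, not from the source: the reliability coefficient as the
# ratio of true score variance to total observed variance. Figures are made up.
def reliability_coefficient(true_score_variance: float, total_variance: float) -> float:
    """r_xx = var(true) / var(total); the remainder of the total is error variance."""
    return true_score_variance / total_variance

r_xx = reliability_coefficient(8.0, 10.0)
error_proportion = 1 - r_xx  # proportion of variance attributable to error
```

With 8 of 10 variance units attributable to true scores, r_xx = 0.80, so 20% of the observed variance is error.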
- a score on an ability test reflects not only the testtaker’s true score on the ability being
measured but also error
Classical Test Theory (True Score Theory)
3 Sources of Error Variance
Test Construction
Test Administration
Test Scoring and Interpretation
variance is attributed to item/content sampling
Test Construction
Test environment, testtaker variables, and examiner-related variables are factors that may
influence the testtaker’s attention or motivation
Test Administration
Technical glitches, subjectivity of the scorer, human error, etc.
Test Scoring and Interpretation
§ obtained by correlating pairs of scores from the same people on two different administrations of the same test
§ appropriate when evaluating a test measuring a construct that is relatively stable over time (e.g. personality)
§ coefficient of stability
§ source of error variance: the passage of time
Reliability Estimates (STABILITY)
TEST-RETEST RELIABILITY ESTIMATE
§ two test administrations with the same group of test takers
§ coefficient of equivalence
Reliability Estimates (EQUIVALENCE)
PARALLEL-FORMS and ALTERNATE-FORMS RELIABILITY ESTIMATES
of a test exist when, for each version of the test, the means and
variances of observed test scores are equal.
Parallel-forms
of a test are typically designed to be equivalent with
respect to variables such as content and level of difficulty
Alternate-forms
§ obtained by correlating two pairs of scores obtained from
equivalent halves of a single test administered once
SPLIT-HALF RELIABILITY ESTIMATE
◦ used to estimate internal consistency reliability from a correlation of two
halves of a test; also estimates the reliability of a test that is lengthened or shortened
Spearman-Brown formula
Full meaning of KR-20 & KR-21
KUDER-RICHARDSON FORMULA 20 & 21
used to determine the inter-item consistency of
dichotomous items - items that can be scored right or wrong (e.g.
Multiple-choice, Yes/No, True/False, Agree/Disagree)
KR-20
dichotomous items - items that can be scored right or wrong (e.g.
Multiple-choice, Yes/No, True/False, Agree/Disagree)
may be used if all the test items have approximately the
same degree of difficulty
KR-21
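Under the definitions in the cards above, here is a hedged sketch of both formulas on an invented 5-person, 5-item matrix of right/wrong (1/0) scores. KR-21 uses only the mean and variance of total scores, which is why it assumes roughly equal item difficulty.

```python
from statistics import pvariance

# Made-up dichotomous scores: rows = testtakers, columns = items (1 = right)
scores = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
]
k = len(scores[0])                     # number of items
totals = [sum(row) for row in scores]  # each person's total score
var_total = pvariance(totals)          # variance of total scores

# KR-20: uses each item's difficulty p (proportion answering correctly)
p = [sum(col) / len(scores) for col in zip(*scores)]
kr20 = (k / (k - 1)) * (1 - sum(pi * (1 - pi) for pi in p) / var_total)

# KR-21: assumes items are about equally difficult; needs only the mean
mean_total = sum(totals) / len(totals)
kr21 = (k / (k - 1)) * (1 - mean_total * (k - mean_total) / (k * var_total))
```

Because these invented items vary in difficulty, KR-21 (about 0.58) comes out lower than KR-20 (about 0.74) here, illustrating why KR-21 is only appropriate when item difficulties are similar.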
§ most accepted and widely used reliability estimate
§ provides a measure of reliability from a single test administration
§ developed by Lee Joseph Cronbach, which is why it is also called
Cronbach’s alpha
appropriate for use on tests containing nondichotomous items
COEFFICIENT ALPHA
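A minimal sketch of coefficient alpha from a single administration, using invented Likert-type ratings; this is the standard alpha formula, not anything specific to this deck.

```python
from statistics import pvariance

# Made-up nondichotomous items: rows = respondents, columns = 1-5 ratings
ratings = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
]
k = len(ratings[0])                                    # number of items
item_vars = [pvariance(col) for col in zip(*ratings)]  # per-item variances
total_var = pvariance([sum(row) for row in ratings])   # variance of totals

# alpha = (k / (k-1)) * (1 - sum of item variances / total score variance)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```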
Coefficient alpha was developed by __________, which is why it is also called _________
Lee Joseph Cronbach
Cronbach’s alpha
appropriate for use on tests containing ________ items
(Strongly Disagree - Strongly Agree)
nondichotomous
§ degree of agreement or consistency between two or more scorers
with regard to a particular measure
§ scorers must have sufficient training in standardized scoring
§ source of error: scoring criteria
§ coefficient of inter-scorer reliability
INTER-SCORER RELIABILITY ESTIMATE
Using and Interpreting a Reliability Coefficient
When purchasing tests:
✓ Never buy any form of assessment/measurement where there is no
reliability coefficient or where it is below 0.7
Using and Interpreting a Reliability Coefficient
When purchasing tests:
✓ Personality and similar measures: ___________ is often
recommended as minimum
0.6 to 0.8, although above 0.7
Using and Interpreting a Reliability Coefficient
When purchasing tests:
✓ Ability, aptitude, IQ, and other forms of reasoning tests should have coefficients
___________has been recommended as an excellent value. Where the
intention is to compare people’s scores, such as when selecting people for a job,
values ______ should be the aim.
above 0.8. Above 0.85
above 0.85
Using and Interpreting a Reliability Coefficient
When purchasing tests:
✓ The sample size used for calculation of reliability should never be _____
below 100