Reliability, Validity Flashcards
(33 cards)
assumes that each person has a true score that would be obtained if there were no errors in measurement
classical test score theory
assumes that the items that have been selected for any one test are just a sample of items from an infinite domain of potential items
domain sampling theory
process of choosing test items that are appropriate to the content domain of the test
domain sampling
another central concept in classical test theory; it considers the problems created by using a limited number of items to represent a larger and more complicated construct
domain sampling model
using ____, the computer is used to focus on the range of item difficulty that helps assess an individual’s ability level
item response theory
degree to which scores from a test are stable and results are consistent
reliability
ratio of the variance of the true scores on a test to the variance of the observed scores
reliability coefficient
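
The ratio on this card can be illustrated with a small sketch. The scores below are hypothetical values chosen so that the errors are uncorrelated with the true scores; in practice true scores are never observed directly, so this only shows what the definition means.

```python
from statistics import pvariance

# Hypothetical true scores and measurement errors (illustrative values only).
true_scores = [10, 12, 14, 16]
errors = [1, -1, -1, 1]  # chosen to be uncorrelated with the true scores

# Classical test score theory: observed score = true score + error.
observed = [t + e for t, e in zip(true_scores, errors)]

def reliability_coefficient(true_scores, observed_scores):
    """Ratio of true-score variance to observed-score variance."""
    return pvariance(true_scores) / pvariance(observed_scores)

r = reliability_coefficient(true_scores, observed)
print(round(r, 3))  # true variance 5 over observed variance 6
```

With no measurement error the ratio would be 1; more error variance pushes it toward 0.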
test reliability is usually estimated in one of three ways:
- test-retest method
- parallel forms method
- internal consistency method
consistency of the test results is considered when the test is administered on different occasions
test-retest method
scores across different forms of the test are evaluated
parallel forms method
performance of people on similar subsets of items selected from the same form of measure is examined
internal consistency
occurs when the first testing session influences scores from the second session
carryover effect
compares two equivalent forms of a test that measure the same attribute
parallel forms / equivalent forms reliability
determined by dividing the total set of items relating to a construct of interest into halves and comparing the results obtained from the two subsets of items thus created
split-half reliability
measure of internal consistency; considered to be a measure of scale reliability
coefficient alpha or cronbach’s alpha
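
A minimal sketch of coefficient alpha, assuming the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the response matrix are hypothetical, and sample variances are used throughout as one common convention.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items, same order)."""
    k = len(rows[0])                        # number of items
    items = list(zip(*rows))                # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses: 5 respondents, 3 items on a 1-5 scale.
scores = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 5],
    [1, 2, 1],
    [5, 4, 4],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 2))
```

When every item ranks respondents identically, the item variances account for as little of the total variance as possible and alpha reaches 1.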
used to estimate the reliability (internal consistency) of tests whose items are scored as binary, such as right or wrong
kuder-richardson formula 20 (KR-20)
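
A sketch of KR-20 under the usual formula KR-20 = k/(k-1) * (1 - sum of p*q / variance of total scores), where p is the proportion answering an item correctly and q = 1 - p. The function name `kr20` and the data are hypothetical, and the population variance is used for the total scores so that it matches the p*q item variances.

```python
from statistics import pvariance

def kr20(rows):
    """rows: one list of 0/1 item scores per respondent."""
    n = len(rows)
    k = len(rows[0])
    # p*q for each item: proportion correct times proportion incorrect.
    pq_sum = 0.0
    for item in zip(*rows):
        p = sum(item) / n
        pq_sum += p * (1 - p)
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - pq_sum / total_var)

# Hypothetical right/wrong answers: 5 respondents, 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
result = kr20(responses)
print(round(result, 2))  # 4/3 * (1 - 0.8/2)
```

The sketch does not guard against a total-score variance of zero, which would occur if every respondent earned the same total.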
takes into account chance agreement
kappa statistic
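
To show what "taking chance agreement into account" means, here is a sketch of Cohen's kappa for two raters, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. Restricting the example to two raters is an assumption made for brevity (extensions such as Fleiss' kappa cover more than two), and the ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same cases."""
    n = len(rater_a)
    # Observed agreement: share of cases where the raters give the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: from each rater's marginal label proportions.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical yes/no judgments on 10 cases.
rater_a = ["y", "y", "y", "y", "y", "n", "n", "n", "n", "n"]
rater_b = ["y", "y", "y", "y", "n", "n", "n", "n", "n", "y"]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))  # observed 0.8, chance 0.5
```

Here the raters agree on 80% of cases, but half of that agreement is expected by chance, so kappa is well below 0.8.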
allows you to estimate what the correlation between the two halves would have been if each half had been the length of the whole test
spearman-brown formula
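
The correction on this card can be sketched as follows, assuming the standard form r_corrected = n*r / (1 + (n - 1)*r), where n is the factor by which the test is lengthened (n = 2 for the split-half case). The function name `spearman_brown` and the example correlation are hypothetical.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Estimate reliability for a test length_factor times as long."""
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)

# Hypothetical correlation of 0.60 between the two halves of a test.
corrected = spearman_brown(0.60)
print(round(corrected, 2))  # 1.2 / 1.6
```

The corrected value is higher than the half-test correlation because, other things being equal, longer tests are more reliable.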
best method for assessing the level of agreement among several observers
kappa statistic
agreement between a test score or measure and the quality it is believed to measure
validity
3 types of evidence:
- construct-related
- criterion-related
- content-related
mere appearance that a measure has validity
face validity
logical rather than statistical
content validity
describes the failure to capture important components of a construct
construct underrepresentation