Flashcards in Study Guide 12: Criterion Related Validity Deck (17):
Area under the Curve (AUC)
The area under the ROC curve, i.e. the graph of true positive rate (sensitivity) vs. false positive rate (1 - specificity). Interpreted as an estimate of the probability that a randomly chosen depressed person will have a higher depression score than a randomly chosen non-depressed person. An AUC of 0.5 corresponds to the line of no information (chance-level discrimination); recommended: AUC ≥ 0.8.
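The probability interpretation of AUC can be checked directly by comparing every depressed/non-depressed pair of scores. The following is a minimal sketch, not a standard library routine: the function name `auc_pairwise` and the scores are hypothetical, invented for illustration, and ties are assumed to count as half a win.

```python
from itertools import product


def auc_pairwise(positive_scores, negative_scores):
    """Estimate AUC as the share of (positive, negative) pairs in which
    the positive case has the higher score; ties count as 0.5."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for pos, neg in product(positive_scores, negative_scores):
        if pos > neg:
            wins += 1.0
        elif pos == neg:
            wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical depression-scale scores (made up for this example).
depressed = [18, 22, 25, 30]
non_depressed = [10, 12, 20, 24]
print(auc_pairwise(depressed, non_depressed))  # 13 of 16 pairs -> 0.8125
```

With these made-up scores the scale just clears the AUC ≥ 0.8 recommendation.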
Base rate
the proportion of current employees who are successful but were not selected using any test.
Binomial effect size display (BESD)
a tool for reporting the magnitude of an effect size by recasting a correlation as the difference in success rates between two groups.
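As an illustration, the usual BESD convention converts a correlation r into two success rates, 0.5 + r/2 and 0.5 - r/2. The sketch below assumes that convention; the function name `besd` and the example value r = 0.30 are hypothetical and not taken from the deck.

```python
def besd(r):
    """Return the (higher, lower) success rates that the binomial
    effect size display assigns to a correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return 0.5 + r / 2, 0.5 - r / 2


# Hypothetical validity coefficient of 0.30.
high_group, low_group = besd(0.30)
print(f"{high_group:.2f} vs. {low_group:.2f}")  # prints "0.65 vs. 0.35"
```

Read this way, a validity coefficient of 0.30 corresponds to a 65% vs. 35% success rate between the two groups, which is often easier to communicate than r itself.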
Concurrent evidence: validity evidence demonstrating the degree to which criterion and test that are measured at the same time correlate.
Criterion (or criterion variable):
a measure of some attribute or outcome that is of primary interest.
Criterion-related validity evidence
validity evidence demonstrating the correlation between performance on the test and performance on a relevant criterion.
Negative predictive value (NPV):
TN/(TN + FN) - % of people who are truly not depressed (according to the criterion) out of those that the scale identified as non-depressed
Positive predictive value (PPV)
TP/(FP + TP) - % of individuals who are truly depressed (according to the criterion) of those that the scale identified as depressed
- influenced by prevalence of X in population
- As prevalence increases, PPV increases
- small differences in specificity levels can strongly influence PPVs
- changes in sensitivity have little impact on PPV
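The points above about prevalence, specificity, and sensitivity can be checked numerically. The sketch below derives PPV and NPV from sensitivity, specificity, and prevalence using Bayes' rule; the function name `predictive_values` and all input values are hypothetical, chosen only for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be proportions between 0 and 1")
    # Expected proportions of the population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("PPV or NPV is undefined for these inputs")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical scenarios: (sensitivity, specificity, prevalence).
scenarios = [
    (0.90, 0.90, 0.10),  # baseline, low prevalence
    (0.90, 0.90, 0.50),  # same test, higher prevalence
    (0.90, 0.99, 0.10),  # better specificity
    (0.99, 0.90, 0.10),  # better sensitivity
]
for sens, spec, prev in scenarios:
    ppv, npv = predictive_values(sens, spec, prev)
    print(f"sens={sens:.2f} spec={spec:.2f} prev={prev:.2f} "
          f"PPV={ppv:.2f} NPV={npv:.2f}")
```

In these made-up scenarios the baseline PPV is 0.50; raising prevalence to 0.50 lifts it to 0.90, raising specificity to 0.99 lifts it to about 0.91, and raising sensitivity to 0.99 moves it only to about 0.52.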
Predictive evidence
validity evidence demonstrating the degree to which criterion and test that are measured at different times correlate.
Prevalence
the proportion of the population or sample found to have the condition.
Receiver operating characteristic (ROC) curve
a graph of sensitivity and specificity for all of the possible scores of a scale, made by plotting true positives (sensitivity) versus false positives (1 - specificity).
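To make the construction concrete, the sketch below treats each observed score as a possible cutoff and records the resulting (false positive rate, true positive rate) pair. It is a minimal illustration under stated assumptions: the function name `roc_points` and the data are hypothetical, and a case is assumed to screen positive when its score is at or above the cutoff.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per
    cutoff, from the strictest cutoff to the most lenient.
    labels: 1 = has the condition per the criterion, 0 = does not."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    points = [(0.0, 0.0)]  # cutoff above every score: nobody screens positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical scale scores and criterion diagnoses (made up for this example).
scores = [18, 22, 25, 30, 10, 12, 20, 24]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
for fpr, tpr in roc_points(scores, labels):
    print(f"1 - specificity = {fpr:.2f}, sensitivity = {tpr:.2f}")
```

Plotting these pairs with sensitivity on the vertical axis gives the ROC curve; lowering the cutoff moves along the curve toward the upper right corner.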
Selection ratio
# of applicants hired / total # of applicants. All else being equal, the smaller the selection ratio, the more useful the test.
Sensitivity
TP/(FN + TP). The proportion of actual positives that are correctly identified as such. How sensitive a measure is at detecting the condition.
Specificity
TN/(TN + FP). The proportion of actual negatives that are correctly identified as such. How specific a measure is at ruling out those without the condition.
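The two ratios TP/(FN + TP) and TN/(TN + FP) can be computed straight from the cells of a 2x2 classification table. A minimal sketch follows; the function names and the counts are hypothetical, invented for illustration.

```python
def sensitivity(tp, fn):
    """Proportion of actual positives that the measure flags: TP / (FN + TP)."""
    if tp + fn == 0:
        raise ValueError("no actual positives in the sample")
    return tp / (fn + tp)


def specificity(tn, fp):
    """Proportion of actual negatives that the measure clears: TN / (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("no actual negatives in the sample")
    return tn / (tn + fp)


# Hypothetical counts: 50 people with the condition, 100 without.
tp, fn = 40, 10
tn, fp = 90, 10
print(sensitivity(tp, fn))  # 40 / 50
print(specificity(tn, fp))  # 90 / 100
```

Both values depend only on their own row of the table (actual positives or actual negatives), which is why, unlike PPV and NPV, they do not shift with prevalence.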
Standard error of estimate
an estimate of the amount of error to be expected in the predicted criterion score. It speaks to the validity of the measure, whereas the standard error of measurement (SEM) speaks to the reliability of the measure.
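The card does not give a formula, but the standard error of estimate is commonly computed as the criterion's standard deviation times the square root of (1 - r²), where r is the test-criterion correlation. The sketch below assumes that textbook formula; the function name and the numbers are hypothetical.

```python
import math


def standard_error_of_estimate(sd_criterion, validity_r):
    """Expected spread of actual criterion scores around predicted ones:
    SD of the criterion * sqrt(1 - r^2)."""
    if sd_criterion < 0:
        raise ValueError("standard deviation cannot be negative")
    if not -1.0 <= validity_r <= 1.0:
        raise ValueError("validity coefficient must lie between -1 and 1")
    return sd_criterion * math.sqrt(1 - validity_r ** 2)


# Hypothetical criterion SD of 10 points and validity coefficient of 0.60.
print(round(standard_error_of_estimate(10, 0.60), 2))  # prints 8.0
```

Under this formula, a test with zero validity leaves the error as large as the criterion's own standard deviation, and a perfectly valid test reduces it to zero.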
Taylor-Russell tables
tables demonstrating the connection between the validity of a test and the likelihood that using the test results in the successful selection of a candidate.
Utility analysis
frames validity in terms of a cost versus benefit analysis of test use. Is a test worth using? Do its benefits outweigh its costs?