Evaluating Selection Techniques and Decisions Flashcards
_____ is the extent to which a score from a selection measure is stable and free from error. If a score from a measure is not stable or error-free, it is not useful.
Reliability
_____ is determined in four ways: test-retest reliability, alternate-forms reliability, internal reliability, and scorer reliability.
Test reliability
_____: The test scores are stable across time and not highly susceptible to such random daily conditions as illness, fatigue, stress, or uncomfortable testing conditions.
temporal stability
Test-Retest Reliability
Typical time intervals between test administrations range from _____. Usually, the longer the time interval, the lower the reliability coefficient.
The time interval should be long enough so that the specific test answers have not been memorized, but short enough so that the person has not changed significantly.
3 days to 3 months
The typical test-retest reliability coefficient for tests used in industry is _____
.86
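In practice, a test-retest coefficient is simply the Pearson correlation between the two administrations. A minimal Python sketch, using made-up scores for five applicants tested twice:

```python
from statistics import mean, stdev

def pearson(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical scores for five applicants tested twice, three weeks apart
time1 = [22, 30, 25, 28, 35]
time2 = [24, 29, 26, 27, 36]
print(round(pearson(time1, time2), 2))
```

A coefficient near 1.0 indicates high temporal stability; daily conditions such as fatigue or stress would push the two administrations apart and lower it.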
Alternate-Forms Reliability
This _____ of test-taking order is designed to eliminate any effects that taking one form of the test first may have on scores on the second form.
counterbalancing
The average correlation between alternate forms of tests used in industry is _____.
In addition to being correlated, two forms of a test should also have the same mean and standard deviation.
.89
A third way to determine the reliability of a test or inventory is to look at the consistency with which an applicant responds to items measuring a similar dimension or construct (e.g., personality trait, ability, area of knowledge). The extent to which similar items are answered in similar ways is referred to as internal consistency and measures item stability.
Internal Reliability
Another factor that can affect the internal reliability of a test is item homogeneity. That is, do all of the items measure the same thing, or do they measure different constructs? The more homogeneous the items, the higher the _____.
internal consistency
Internal Reliability
The _____ method is the easiest to use, as items on a test are split into two groups. Usually, all of the odd-numbered items are in one group and all the even-numbered items are in the other group.
split-half
The scores on the two groups of items are then correlated. Because the number of items in the test has been reduced, researchers have to use a formula called _____ prophecy to adjust the correlation.
Spearman-Brown
_____ is used for tests containing dichotomous items (e.g., yes/no, true/false), whereas coefficient alpha can be used not only for dichotomous items but also for tests containing interval and ratio items such as five-point rating scales.
K-R 20
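K-R 20 is computed as (k / (k − 1)) × (1 − Σpq / σ²), where k is the number of items, p and q are the proportions passing and failing each item, and σ² is the variance of total scores. A sketch with the same kind of invented dichotomous data:

```python
from statistics import pvariance

def kr20(responses):
    """Kuder-Richardson 20 for dichotomous (1 = right, 0 = wrong) items."""
    k = len(responses[0])                 # number of items
    n = len(responses)                    # number of test takers
    totals = [sum(row) for row in responses]
    # For each item: p = proportion passing, q = 1 - p
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))

# Hypothetical right/wrong (1/0) responses of six applicants to six items
responses = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 1],
    [1, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
]
print(round(kr20(responses), 2))
```

Unlike a single split-half coefficient, K-R 20 is equivalent to the average of all possible split-half correlations, so it does not depend on one arbitrary odd/even split.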
_____ is an issue in projective or subjective tests in which there is no one correct answer, but even tests scored with the use of keys suffer from scorer mistakes.
Scorer Reliability
_____ is the degree to which inferences from scores on tests or assessments are justified by the evidence.
Validity
One way to determine a test’s validity is to look at its degree of _____—the extent to which test items sample the _____ that they are supposed to measure.
In industry, the appropriate content for a test or test battery is determined by the job analysis.
Content Validity
Another measure of validity is _____, which refers to the extent to which a test score is related to some measure of job performance called a _____.
Commonly used criteria include supervisor ratings of performance, actual measures of performance (e.g., sales, number of complaints, number of arrests made), attendance (tardiness, absenteeism), tenure, training performance (e.g., police academy grades), and discipline problems.
criterion validity
With a _____ design, a test is given to a group of employees who are already on the job. The scores on the test are then correlated with a measure of the employees’ current performance.
concurrent validity
With a _____ design, the test is administered to a group of job applicants who are going to be hired. The test scores are then compared with a future measure of job performance.
predictive validity
Why is a concurrent design _____ than a predictive design? The answer lies in the homogeneity of performance scores. In a given employment situation, very few employees are at the extremes of a performance scale. Employees who would be at the bottom of the performance scale either were never hired or have since been terminated. Employees who would be at the upper end of the performance scale often get promoted. Thus, the restricted range of performance scores makes obtaining a significant validity coefficient more difficult.
weaker