CHAPTER 7: UTILITY Flashcards
(29 cards)
It refers to the usefulness or practical value of testing to improve efficiency.
Utility (or Test Utility)
It refers to anything from a single test to a large-scale testing program that employs a battery of tests.
Testing
It refers to the reliability (consistency) and validity (accuracy) of a test. A test is considered psychometrically sound when it produces reliable and valid results for a particular purpose. It is one of the factors of a test’s utility and is often described as its core foundation: it ensures that the test measures consistently (reliability) and accurately assesses the intended construct (validity). While reliability and validity describe the technical quality of the test, utility goes a step further and reflects the practical value of using the test to make effective, cost-efficient decisions. Even a psychometrically sound test may lack real-world utility if it is not feasible or appropriate in practice because of misuse, impractical administration, or external factors (like tampering, in the case of the cocaine sweat patch). A test must be valid to be useful, but a valid test is not always useful: utility depends not just on psychometric soundness but also on how well the test performs in actual settings.
Psychometric Soundness
It refers to the disadvantages, losses, or expenses, both economic and non-economic, associated with using (or not using) a test. It is one of the factors of a test’s utility and a crucial consideration in evaluating it. While many costs are financial (e.g., purchasing tests, staff salaries, scoring services), others are non-economic and potentially more serious, such as the loss of life, reputation, or public trust due to poor or absent testing. Failing to use a valid and reliable test (e.g., for pilot hiring or child abuse detection) might save money initially but can lead to far greater long-term losses, including safety risks, legal liability, or missed diagnoses. On the other hand, using a more expensive but high-utility test may be worth the investment if it leads to better outcomes.
_____ is not just about money. In test utility, it includes potential harms, risks, and missed opportunities that result from poor testing decisions or inadequate tools.
Costs
It refers to the gains, advantages, or profits—either economic (e.g., financial returns, increased productivity) or non-economic (e.g., improved morale, safety, or reputation)—that result from administering and using a test. It is not just profit. In test utility, it includes the full range of positive outcomes—economic or not—that justify the cost and effort of testing.
Benefits
Benefits are what you gain by using a test—these can be tangible (like profit increases) or intangible (like improved public safety or institutional reputation). While some benefits are easy to measure financially, many important ones are non-economic, such as:
Increased worker or student performance
Better training efficiency
Reduced accidents or turnover
Improved institutional credibility or safety
Better societal outcomes (e.g., proper psychiatric hospitalization)
It may be broadly defined as a family of techniques that entail a cost–benefit analysis designed to yield information relevant to a decision about the usefulness and/or practical value of a tool of assessment. It is an umbrella term covering various possible methods, each requiring various kinds of data to be inputted and yielding various kinds of output.
Utility Analysis
It involves creating expectancy tables or charts that illustrate the likelihood of a test-taker achieving a certain outcome on a criterion measure based on their test scores. These tables are derived from scatterplots of test data and provide decision-makers with probabilities that individuals scoring within specific ranges on a test will perform at designated levels (e.g., “passing,” “acceptable,” or “failing”) on a related criterion.
Expectancy data
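As a rough illustration of how an expectancy table could be built, the sketch below groups test scores into bands and reports the proportion of test-takers in each band who reached a "passing" level on the criterion. The data, score bands, and passing threshold are all hypothetical and chosen only for the example.

```python
from collections import defaultdict

def expectancy_table(pairs, bands, passing_criterion):
    """Return {band_label: proportion meeting the criterion} per score band.

    pairs: iterable of (test_score, criterion_score)
    bands: list of (label, low, high), inclusive on both ends
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [n_success, n_total]
    for score, criterion in pairs:
        for label, low, high in bands:
            if low <= score <= high:
                counts[label][1] += 1
                if criterion >= passing_criterion:
                    counts[label][0] += 1
                break
    # Skip empty bands to avoid dividing by zero.
    return {label: counts[label][0] / counts[label][1]
            for label, _, _ in bands if counts[label][1] > 0}

# Hypothetical data: (test score, later supervisor rating on a 1-5 scale)
data = [(35, 2), (42, 3), (48, 2), (55, 3), (58, 4), (61, 3),
        (67, 4), (72, 4), (75, 3), (81, 5), (88, 4), (93, 5)]
bands = [("30-49", 30, 49), ("50-69", 50, 69), ("70-100", 70, 100)]

table = expectancy_table(data, bands, passing_criterion=4)
for label, proportion in table.items():
    print(f"{label}: {proportion:.0%} rated 4 or higher")
```

Each row reads as "of the people who scored in this band, this share later performed at the passing level," which is the probability a decision-maker would look up.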
These tables estimate the improvement in selection decisions when a test is added to the selection process. They consider factors like the test’s validity, the selection ratio (the proportion of applicants selected), and the base rate (the success rate without the test).
Taylor-Russell Tables
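The published tables are simply looked up, but the idea behind them can be sketched numerically. The example below assumes test scores and the criterion follow a bivariate normal distribution (the model the tables are built on) and estimates the proportion of selected applicants expected to succeed. The function name, the integration settings, and the input values are illustrative assumptions, not part of the tables themselves.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def proportion_successful(validity, selection_ratio, base_rate, steps=4000):
    """Estimate P(success | selected) under a bivariate normal model.

    validity must satisfy -1 < validity < 1.
    """
    x_cut = STD.inv_cdf(1 - selection_ratio)  # test cut score in z units
    y_cut = STD.inv_cdf(1 - base_rate)        # criterion cut score in z units
    resid_sd = sqrt(1 - validity ** 2)
    upper = 8.0                               # tail beyond z = 8 is negligible
    width = (upper - x_cut) / steps
    joint = 0.0
    for i in range(steps):
        x = x_cut + (i + 0.5) * width         # midpoint rule
        p_success_given_x = 1 - STD.cdf((y_cut - validity * x) / resid_sd)
        joint += STD.pdf(x) * p_success_given_x * width
    return joint / selection_ratio

no_test = proportion_successful(0.0, 0.5, 0.5)    # a test with zero validity
with_test = proportion_successful(0.5, 0.5, 0.5)  # a test with validity .50
print(round(no_test, 2), round(with_test, 2))
```

With zero validity the result equals the base rate, so the difference between the two values is the improvement attributable to the test.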
These provide an index of the added value of a test by comparing the average performance of selected individuals against those not selected, helping to determine the test’s effectiveness in enhancing performance outcomes.
Naylor-Shine Tables
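The comparison of selected and unselected groups can be sketched under the same normal-theory assumptions the tables rely on: test and criterion scores are standardized and linearly related. The function name and input values below are hypothetical, and the formula shown is the standard truncated-normal result rather than a reproduction of the published tables.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def mean_criterion_z(validity, selection_ratio):
    """Return (mean criterion z of selected, mean criterion z of unselected)."""
    x_cut = STD.inv_cdf(1 - selection_ratio)  # test cut score in z units
    ordinate = STD.pdf(x_cut)                 # height of the normal curve at the cut
    selected = validity * ordinate / selection_ratio
    unselected = -validity * ordinate / (1 - selection_ratio)
    return selected, unselected

selected, unselected = mean_criterion_z(validity=0.5, selection_ratio=0.5)
difference = selected - unselected
print(round(selected, 3), round(unselected, 3), round(difference, 3))
```

A larger difference between the two group means indicates that the test adds more to the selection procedure.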
This formula calculates the monetary gain from using a particular selection method, factoring in elements such as the number of applicants selected, the test’s validity coefficient, the standard deviation of job performance (in dollars), the average tenure, the mean standardized test score of those selected, and the cost of testing.
Brogden-Cronbach-Gleser Formula
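As the formula is commonly presented, utility gain = (N)(T)(r_xy)(SD_y)(Z_m) - (N)(C), where the first N is the number of applicants selected and the second N is the number of applicants tested. The sketch below applies that form; every input value is hypothetical and chosen only to make the arithmetic easy to follow.

```python
def bcg_utility_gain(n_selected, tenure_years, validity, sd_y,
                     mean_z_selected, n_tested, cost_per_applicant):
    """Estimated dollar gain from using the test, minus the cost of testing."""
    benefit = n_selected * tenure_years * validity * sd_y * mean_z_selected
    cost = n_tested * cost_per_applicant
    return benefit - cost

gain = bcg_utility_gain(
    n_selected=10,           # applicants hired
    tenure_years=2,          # average time in the position
    validity=0.4,            # validity coefficient of the test
    sd_y=10_000,             # SD of job performance in dollars
    mean_z_selected=1.0,     # mean standardized test score of those hired
    n_tested=50,             # applicants who took the test
    cost_per_applicant=100,  # cost of testing one applicant
)
print(gain)  # benefit of 80,000 minus testing cost of 5,000
```

A positive result suggests the test pays for itself under these assumptions; a negative result suggests testing costs exceed the estimated benefit.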
It refers to an estimate of the benefit (monetary or otherwise) of using a particular test or selection method.
Utility Gain
It refers to an estimated increase in work output.
Productivity Gain
It helps determine if a test is worth using by analyzing its utility in decision-making, like whether it improves job selection outcomes over non-testing methods.
Decision Theory
Utility estimates often assume a limitless supply of applicants, but in reality, some jobs may have a small pool due to unique skills or sacrifices required.
Economic conditions impact the number of applicants, with high unemployment increasing the pool.
Many models assume all selected applicants will accept job offers, which can overestimate the utility of the test, as top candidates may be in demand elsewhere. Adjustments to estimates may be needed.
The Pool of Job Applicants
Utility analysis approaches are applied across different job complexities, but the effectiveness of these methods may vary depending on the job’s complexity.
More complex jobs show a greater range in how well individuals perform, leading to debates on whether utility models apply equally across job types.
The Complexity of the Job
It is a (usually numerical) reference point derived as a result of a judgment and used to divide a set of data into two or more classifications, with some action to be taken or some inference to be made on the basis of these classifications. These are reference points used to classify test scores for decisions.
Cut Scores
It is also referred to as a norm-referenced cut score. It may be defined as a reference point in a distribution of test scores, used to divide a set of data into two or more classifications, that is set based on norm-related considerations rather than on the relationship of test scores to a criterion.
Relative Cut Scores
It is a reference point in a distribution of test scores, used to divide a set of data into two or more classifications, that is typically set with reference to a judgment concerning the minimum level of proficiency required to be included in a particular classification. It may also be referred to as an absolute cut score (e.g., the minimum score needed to pass a driver’s test).
Fixed Cut Scores
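The contrast between relative and fixed cut scores can be illustrated with a small sketch. Here the relative cut score is whatever score admits the top 20% of a group, while the fixed cut score is a constant. The groups, the 20% rule, and the value 70 are hypothetical, and ties at the cut are not handled specially.

```python
import math

FIXED_CUT = 70  # hypothetical minimum-proficiency score

def relative_cut_score(scores, top_fraction):
    """Lowest score that still falls within the top fraction of this group."""
    ranked = sorted(scores, reverse=True)
    n_pass = max(1, math.floor(len(ranked) * top_fraction))
    return ranked[n_pass - 1]

group_a = [52, 61, 67, 70, 74, 78, 81, 85, 90, 96]
group_b = [40, 45, 50, 55, 58, 60, 63, 66, 69, 72]

cut_a = relative_cut_score(group_a, 0.2)
cut_b = relative_cut_score(group_b, 0.2)
passed_fixed_b = [s for s in group_b if s >= FIXED_CUT]

print(cut_a, cut_b)    # the relative cut moves with the group
print(passed_fixed_b)  # the fixed cut does not
```

The relative cut score changes from group to group because it depends on how everyone else performed; the fixed cut score stays the same no matter who is tested.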
It refers to the use of two or more cut scores with reference to one predictor for the purpose of categorizing test takers. In other words, more than one cutoff on the same predictor sorts test takers into several categories (e.g., letter grades).
Multiple Cut Scores
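A minimal sketch of multiple cut scores on a single predictor, using letter grades as the categories. The specific cut values and labels are hypothetical.

```python
from bisect import bisect_right

CUTS = [60, 70, 80, 90]              # hypothetical cut scores, ascending
LABELS = ["F", "D", "C", "B", "A"]   # one more label than there are cuts

def categorize(score):
    """Place a score into the category defined by the cut scores it meets."""
    return LABELS[bisect_right(CUTS, score)]

for score in (59, 60, 85, 90):
    print(score, categorize(score))
```

A score equal to a cut value is counted as meeting that cut, so 60 falls in the "D" category and 90 in the "A" category.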
It involves using several stages in selection, where candidates must meet a minimum standard at each stage to advance to the next.
Multiple Hurdles
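The stage-by-stage logic can be sketched as a loop that stops at the first stage an applicant fails. The stage names, score keys, and minimum scores are hypothetical.

```python
def run_hurdles(applicant, hurdles):
    """Return (passed_all, name_of_failed_stage_or_None)."""
    for stage_name, key, minimum in hurdles:
        if applicant[key] < minimum:
            return False, stage_name  # eliminated here; later stages are skipped
    return True, None

# Hypothetical selection stages: (stage name, score key, minimum score)
HURDLES = [
    ("written test", "written", 70),
    ("work sample", "work_sample", 60),
    ("interview", "interview", 3),
]

ana = {"written": 82, "work_sample": 75, "interview": 4}
ben = {"written": 95, "work_sample": 40, "interview": 5}

print(run_hurdles(ana, HURDLES))
print(run_hurdles(ben, HURDLES))
```

Note that the second applicant is eliminated at the work sample despite a very high written score: in a multiple-hurdle process, strength at one stage cannot offset a failure at another.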
It is a model based on the assumption that high scores on one attribute can, in fact, “balance out” or compensate for low scores on another attribute.
Compensatory Model of Selection
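One simple way to picture a compensatory model is a weighted composite score compared against a single cutoff. The weights, attribute names, and cutoff below are hypothetical; in practice the weights might be estimated statistically, for example with multiple regression.

```python
WEIGHTS = {"knowledge": 0.5, "interview": 0.5}  # hypothetical weights
COMPOSITE_CUT = 65                              # hypothetical cutoff

def composite_score(applicant, weights):
    """Weighted sum of the applicant's scores across all attributes."""
    return sum(weights[name] * applicant[name] for name in weights)

def is_selected(applicant):
    return composite_score(applicant, WEIGHTS) >= COMPOSITE_CUT

uneven = {"knowledge": 90, "interview": 50}  # high on one, low on the other
flat = {"knowledge": 60, "interview": 60}    # moderate on both

print(composite_score(uneven, WEIGHTS), is_selected(uneven))
print(composite_score(flat, WEIGHTS), is_selected(flat))
```

The first applicant is selected even though the interview score is low, because the high knowledge score compensates for it in the composite.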
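This method for setting fixed cut scores can be applied to personnel selection tasks as well as to questions regarding the presence or absence of a particular trait, attribute, or ability. In it, a panel of experts judges how test takers who minimally possess the trait, attribute, or ability would respond to the test items, and the judgments are averaged to yield the cut score.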
Angoff Method
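A common way of carrying out this method, assumed here for illustration, has each expert estimate the probability that a minimally competent test taker would answer each item correctly. Each expert's estimates are summed across items, and the sums are averaged across experts to give the cut score. The judges and ratings below are hypothetical.

```python
from statistics import mean

# Hypothetical ratings: for each judge, the estimated probability that a
# minimally competent test taker answers each of four items correctly.
judgments = {
    "judge_1": [0.50, 0.75, 0.25, 0.50],
    "judge_2": [0.75, 0.75, 0.50, 0.50],
    "judge_3": [0.50, 0.50, 0.25, 0.25],
}

def angoff_cut_score(judgments):
    """Average, across judges, of each judge's expected raw score."""
    expected_scores = [sum(item_probs) for item_probs in judgments.values()]
    return mean(expected_scores)

cut = angoff_cut_score(judgments)
print(cut)  # expected number of items correct for a borderline test taker
```

Large disagreement among the judges' sums would be a warning sign that the resulting cut score rests on shaky ground.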
It entails the collection of data on the predictor of interest from groups known to possess, and not to possess, a trait, attribute, or ability of interest. Based on an analysis of this data, a cut score is set on the test that best discriminates between the two groups’ test performance.
Known Groups Method
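The idea of choosing the score that "best discriminates" between the two groups can be sketched by trying each observed score as a candidate cut and keeping the one that classifies the most people correctly. The scores are hypothetical, the sketch assumes higher scores indicate the trait, and ties are resolved in favor of the lowest candidate; other criteria for "best" are possible.

```python
def known_groups_cut(with_trait, without_trait):
    """Return (cut_score, proportion_correctly_classified)."""
    candidates = sorted(set(with_trait) | set(without_trait))
    best_cut, best_correct = None, -1
    for cut in candidates:
        # Correct = trait group at or above the cut + non-trait group below it.
        correct = (sum(s >= cut for s in with_trait)
                   + sum(s < cut for s in without_trait))
        if correct > best_correct:
            best_cut, best_correct = cut, correct
    total = len(with_trait) + len(without_trait)
    return best_cut, best_correct / total

# Hypothetical scores from a group known to have the trait and one known not to
with_trait = [62, 70, 74, 78, 85, 91]
without_trait = [40, 48, 55, 58, 66, 71]

cut, accuracy = known_groups_cut(with_trait, without_trait)
print(cut, round(accuracy, 2))
```

Because the two groups overlap, no cut score classifies everyone correctly; the chosen score is only the best compromise for these particular groups.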