Test Development Flashcards
(14 cards)
Biased test item
An item that favors one particular group of examinees in relation to another when differences in group ability are controlled
Computerized adaptive testing (CAT)
An interactive, computer-administered test-taking process wherein items presented to the testtaker are based in part on the testtaker’s performance on previous items
Guttman scale
Named for its developer, a scale wherein items range sequentially from weaker to stronger expressions of the attitude or belief being measured
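Illustration (a minimal sketch, not part of the card's definition): Guttman scales are usually described as cumulative, meaning that a respondent who endorses a stronger statement is expected to endorse all weaker ones as well. The helper name `is_cumulative_pattern` is hypothetical, and the items are assumed to be already ordered from weakest to strongest.

```python
def is_cumulative_pattern(endorsed):
    """Return True if the responses fit a perfect cumulative (Guttman-type) pattern.

    `endorsed` is a list of booleans, one per item, ordered from the weakest
    to the strongest statement (an assumption of this sketch).
    """
    seen_rejection = False
    for agrees in endorsed:
        if not agrees:
            seen_rejection = True
        elif seen_rejection:
            # A stronger item was endorsed after a weaker one was rejected.
            return False
    return True


consistent = is_cumulative_pattern([True, True, False, False])
inconsistent = is_cumulative_pattern([True, False, True, False])
```

Here `consistent` is True and `inconsistent` is False, because the second respondent rejects a weaker item but endorses a stronger one.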
Ipsative scoring
An approach to test scoring and interpretation wherein the testtaker’s responses and the presumed strength of a measured trait are interpreted relative to the measured strength of other traits for that testtaker
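Illustration (a minimal sketch under stated assumptions): one simple way to express a trait score relative to the same testtaker's other traits is to subtract that person's own mean across traits. The card does not prescribe this method; the function name `ipsatize` and the trait names are hypothetical.

```python
def ipsatize(trait_scores):
    """Express each trait score as a deviation from the testtaker's own mean.

    `trait_scores` maps trait name -> raw score for ONE testtaker.
    The result says which traits are relatively strong or weak within
    that person; it is not a comparison with other testtakers.
    """
    if not trait_scores:
        raise ValueError("at least one trait score is required")
    own_mean = sum(trait_scores.values()) / len(trait_scores)
    return {trait: score - own_mean for trait, score in trait_scores.items()}


profile = ipsatize({"dominance": 14, "affiliation": 9, "achievement": 10})
# Own mean is 11, so dominance is +3, affiliation -2, achievement -1.
```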
Item analysis
A general term to describe various procedures, usually statistical, designed to explore how individual test items work as compared to other items in the test and in the context of the whole test (e.g., to explore the level of difficulty of individual items on an achievement test or the reliability of a personality test)
Item characteristic curve (ICC)
A graphic representation of the probabilistic relationship between a person’s level on a trait (or ability or other characteristic being measured) and the probability of responding to an item in a predicted way; also known as a category response curve, an item response curve, or an item trace line
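Illustration (a minimal sketch): the card does not name a particular model, so this example assumes a two-parameter logistic (2PL) curve, one common choice. The parameter names `a` (discrimination) and `b` (difficulty) and the function name `icc_2pl` are assumptions of the sketch.

```python
import math


def icc_2pl(theta, a=1.0, b=0.0):
    """Probability of the keyed response at trait level `theta` (assumed 2PL model).

    a: how steeply the probability rises around the item's difficulty
    b: the trait level at which the probability equals 0.5
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Points along one item's curve, from low to high trait level.
curve = [(theta, icc_2pl(theta, a=1.2, b=0.5)) for theta in (-2, -1, 0, 1, 2)]
```

Plotting the pairs in `curve` would trace the S-shaped line the card describes: the probability rises steadily with trait level and passes 0.5 where `theta` equals `b`.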
Item difficulty index
In achievement or ability testing and other contexts in which responses are keyed correct, a statistic indicating the proportion of testtakers who responded correctly to an item; in contexts where the nature of the test is such that responses are not keyed correct, this same statistic may be referred to as an item-endorsement index
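Illustration (a minimal sketch): the index is conventionally reported as a proportion, often written p, of testtakers who gave the keyed response. The function name `item_difficulty` and the 0/1 response coding are assumptions of this example.

```python
def item_difficulty(responses):
    """Proportion of testtakers who gave the keyed (correct) response.

    `responses` holds one entry per testtaker: 1 = correct, 0 = incorrect.
    Values near 1 indicate an easy item; values near 0 indicate a hard one.
    """
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


item_scores = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
p = item_difficulty(item_scores)  # 7 of 10 correct, so p = 0.7
```

The same calculation applied to agree/disagree items would give the item-endorsement index.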
Item discrimination index
A statistic designed to indicate how adequately a test item discriminates between high and low scorers
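Illustration (a minimal sketch): the card gives no formula, so this example assumes one widely used version, d = (U - L) / n, where U and L are the numbers of correct responses in the upper and lower scoring groups and n is the size of each group. The 27% group size is a common convention, and the function name `item_discrimination` is hypothetical.

```python
def item_discrimination(item_correct, total_scores, fraction=0.27):
    """Upper-lower discrimination index d = (U - L) / n (assumed formula).

    item_correct: 1/0 per testtaker for the item being analyzed
    total_scores: total test score per testtaker, in the same order
    fraction:     share of testtakers placed in each extreme group
    """
    if len(item_correct) != len(total_scores):
        raise ValueError("inputs must have one entry per testtaker")
    n_people = len(total_scores)
    if n_people < 2:
        raise ValueError("at least two testtakers are required")
    # Group size, capped so the upper and lower groups never overlap.
    k = min(max(1, round(n_people * fraction)), n_people // 2)
    order = sorted(range(n_people), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    u = sum(item_correct[i] for i in upper)
    l = sum(item_correct[i] for i in lower)
    return (u - l) / k


totals = [12, 25, 18, 30, 9, 22, 27, 15, 20, 28]
item = [0, 1, 0, 1, 0, 1, 0, 0, 1, 1]
d = item_discrimination(item, totals)  # groups of 3: (2 - 0) / 3
```

Under this formula d ranges from -1 to +1; a negative value would mean low scorers outperformed high scorers on the item.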
Item reliability index
A statistic designed to provide an indication of a test’s internal consistency; the higher the item reliability index, the greater the test’s internal consistency
Item validity index
A statistic indicating the degree to which a test measures what it purports to measure
Likert scale
Named for its developer, a summative rating scale, typically with five alternative responses ranging on a continuum from, for example, “strongly agree” to “strongly disagree”
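Illustration (a minimal sketch): "summative" means the ratings are added up to give a total score. The 1-to-5 coding, the middle label, and the handling of reverse-keyed items below are assumptions of this example, not part of the card; `likert_total` is a hypothetical helper name.

```python
# Assumed numeric coding for five response alternatives.
LIKERT_VALUES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "agree": 4,
    "strongly agree": 5,
}


def likert_total(responses, reverse_keyed=frozenset()):
    """Sum the ratings across items to get a summative scale score.

    responses:     one response label per item, in item order
    reverse_keyed: 0-based positions of items worded in the opposite
                   direction, which are flipped (1 <-> 5, 2 <-> 4)
    """
    total = 0
    for position, label in enumerate(responses):
        value = LIKERT_VALUES[label.lower()]
        if position in reverse_keyed:
            value = 6 - value
        total += value
    return total


answers = ["agree", "strongly agree", "disagree", "neither agree nor disagree"]
score = likert_total(answers, reverse_keyed={2})  # 4 + 5 + 4 + 3 = 16
```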
Pilot work
The preliminary research surrounding the creation of a prototype test; a general objective of pilot work is to determine how best to measure, gauge, assess, or evaluate the targeted construct(s)
Test construction
A stage in the process of test development that entails writing test items (or rewriting or otherwise revising existing items), as well as formatting items, setting scoring rules, and otherwise designing and building a test
Test development
An umbrella term for all that goes into the process of creating a test