Test Development Flashcards by Bue Tu

is an umbrella term for all that goes into the process of creating a test

Test Development

The thought that “there ought to be a test for…” is the impetus for
developing a new test.

The stimulus could be knowledge of psychometric problems with other
tests, a new social phenomenon, or any number of things.

Test Conceptualization

The process of setting rules for
assigning numbers in measurement.

Scaling

are instruments
to measure some trait, state, or ability and may be categorized in many ways

Scales

was very influential in the
development of sound scaling methods

L. L. Thurstone

grouping of words, statements, or symbols on which
judgments of the strength of a particular trait, attitude, or emotion are
indicated by the test taker.

Rating Scales

Developed to be “a practical means of assessing
what people believe, the strength of their convictions, as well as individual differences in moral
tolerance” (p

Morally Debatable Behaviors Scale-Revised (MDBS-R)

Each item presents the test taker with five alternative responses (sometimes seven), usually on an agree–disagree or
approve–disapprove continuum.

Likert Scale
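
A minimal sketch of how responses on such a scale might be summed into a total score. It assumes ratings are coded 1 to 5 (or 1 to 7) and that some items are reverse-keyed; the function name, the coding, and the sample data are hypothetical and not taken from the card.

```python
def score_likert(responses, reverse_keyed=(), points=5):
    """Sum item ratings, flipping reverse-keyed items (assumed 1..points coding)."""
    total = 0
    for index, rating in enumerate(responses):
        if not 1 <= rating <= points:
            raise ValueError(f"rating {rating} is outside 1..{points}")
        # A reverse-keyed item maps 1 -> points, 2 -> points - 1, and so on.
        total += (points + 1 - rating) if index in reverse_keyed else rating
    return total


# Hypothetical respondent: four items, the third (index 2) is reverse-keyed.
print(score_likert([4, 5, 2, 3], reverse_keyed={2}))  # 16
```

Reverse-keying is one way to handle a scale that mixes positively and negatively worded items.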

Offers a continuum of responses that allow for measurements of
attitudes on various topics

Likert Scale

Test takers must choose between two alternatives according to some rule.

Method of Paired Comparisons

For each pair of options, test takers receive a higher score for
selecting the option deemed more justifiable by the majority of a group
of judges.

Method of Paired Comparisons
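
The scoring rule on this card can be sketched as follows, assuming one point per pair when the test taker's choice matches the option favored by the majority of judges. The function name and the sample key are hypothetical.

```python
def score_paired_comparisons(choices, judge_key):
    """Count the pairs where the test taker picked the option the judges favored."""
    if len(choices) != len(judge_key):
        raise ValueError("choices and judge_key must cover the same pairs")
    return sum(1 for chosen, keyed in zip(choices, judge_key) if chosen == keyed)


# Hypothetical data: four pairs, each with options "a" and "b".
judge_key = ["a", "b", "b", "a"]   # option the majority of judges deemed more justifiable
choices = ["a", "b", "a", "a"]     # the test taker's selections
print(score_paired_comparisons(choices, judge_key))  # 3
```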

Entails judgments of a stimulus in
comparison with every other stimulus on the scale

Comparative Scaling (Sorting Task)

Stimuli are placed into one of two or more
alternative categories that differ quantitatively with respect to some
continuum.

Categorical Scaling

Items range sequentially from weaker to stronger
expressions of the attitude, belief, or feeling being measured.

Guttman Scale
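
Because the items are ordered from weaker to stronger, a respondent who endorses a stronger item is expected to endorse every weaker one as well. The check below is a small sketch of that property, assuming responses are listed weakest item first; the function name is hypothetical.

```python
def is_guttman_consistent(endorsed):
    """Return True if no endorsed item follows a rejected one (weakest item first)."""
    seen_rejection = False
    for agrees in endorsed:
        if not agrees:
            seen_rejection = True
        elif seen_rejection:
            # Agreeing with a stronger item after rejecting a weaker one breaks the pattern.
            return False
    return True


print(is_guttman_consistent([True, True, False, False]))  # True
print(is_guttman_consistent([True, False, True, False]))  # False
```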

provide a list of terms from which the individual selects those most characteristic of herself or himself.

Adjective checklist

provide a list of adjectives that must be sorted into nine piles of increasing similarity to the target person.

Q-Sorts

Guide for Item Writing

1. Define clearly what you wish to measure
2. Generate a pool of items
3. Avoid items that are exceptionally long
4. Be aware of the reading level of those taking the scale and the reading level of the items
5. Avoid items that convey two or more ideas at the same time
6. Consider using questions that mix positive and negative wording

The reservoir or well from which items will or will not be
drawn for the final version of the test.

Item Pool

Includes variables such as the form, plan, structure, arrangement, and layout of individual test items.

Item Format

Items require test takers to select a
response from a set of alternative responses

Selected-response format

Items require test takers to supply or to create the correct answer, not merely to select it.

Constructed-response format

Multiple-choice format has three elements:

(1) a stem, (2) a correct
alternative or option, and (3) several incorrect alternatives or options
variously referred to as distractors or foils.

Distractors

b: standardized behavioral
samples; c: reliable assessment instruments; and d: theory-linked measures

A relatively large and easily accessible collection of test questions.

Item Bank

An interactive, computer-administered test-taking process wherein items presented to the test taker are based in part on the test taker’s performance on previous items.

Computerized Adaptive Testing
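
A deliberately simplified sketch of the adaptive idea: the next item depends on how the previous one was answered. It assumes a bank of items indexed from easiest (0) to hardest and a one-step up/down rule. Operational adaptive tests typically select items with more sophisticated statistical models, so the rule and the function names here are illustrative assumptions only.

```python
def next_item_index(current, was_correct, n_items):
    """Step to a harder item after a correct answer, an easier one after an error."""
    step = 1 if was_correct else -1
    # Clamp so the index stays inside the bank (0 = easiest, n_items - 1 = hardest).
    return max(0, min(n_items - 1, current + step))


def administer(answers_correct, start, n_items):
    """Return the item indices presented for a scripted pattern of right/wrong answers."""
    presented = [start]
    for was_correct in answers_correct:
        presented.append(next_item_index(presented[-1], was_correct, n_items))
    return presented


# Hypothetical session: start mid-bank, answer right, right, then wrong.
print(administer([True, True, False], start=2, n_items=5))  # [2, 3, 4, 3]
```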

A discrepancy between the scoring in an anchor protocol and the scoring of another protocol is referred to as

Scoring Drift

refers to the revalidation of a test on a sample of test takers other than those on whom test performance was originally found to be a valid predictor of some criterion.

Cross-Validation

test validation process conducted on two or more tests using the same sample of test takers.

Co-validation

Allows test developers to evaluate the validity of items in relation to a criterion measure.

Item-Validity Index
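
A sketch of one common textbook way to compute this index, assuming it is defined as the item-score standard deviation multiplied by the correlation between the item score and the criterion score, with items scored 0 or 1. The definition, the function names, and the sample data are assumptions added for illustration and do not appear on the card.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists (both must vary)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def item_validity_index(item_scores, criterion_scores):
    """Item-score SD times the item-criterion correlation, for 0/1 item scores."""
    p = sum(item_scores) / len(item_scores)   # proportion answering correctly
    item_sd = sqrt(p * (1 - p))               # SD of a dichotomous item
    return item_sd * pearson_r(item_scores, criterion_scores)


# Hypothetical data: four test takers, one item, one criterion measure.
print(round(item_validity_index([1, 1, 0, 0], [4, 3, 2, 1]), 4))  # 0.4472
```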

Indicates how adequately an item separates or discriminates between high scorers and low scorers on an entire test.

Item-Discrimination Index
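
A minimal sketch of the usual calculation, assuming the formula d = (U - L) / n, where U and L are the numbers of test takers in the upper-scoring and lower-scoring groups who answered the item correctly and n is the size of each group. The formula, the function name, and the figures are assumptions for illustration.

```python
def item_discrimination_index(upper_correct, lower_correct, group_size):
    """Compute d = (U - L) / n for equal-sized upper and lower scoring groups."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (upper_correct - lower_correct) / group_size


# Hypothetical item: 20 of 32 high scorers and 16 of 32 low scorers answered correctly.
print(item_discrimination_index(20, 16, 32))  # 0.125
```

Under this formula d ranges from -1 to +1; a negative value means more low scorers than high scorers answered the item correctly.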

The quality of each alternative within a multiple-choice item can be readily assessed with reference to the comparative performance of upper and lower scorers.

Analysis of item alternatives
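
One way to make that comparison concrete is to tally, for each alternative, how many upper scorers and how many lower scorers chose it. The sketch below assumes the choices are available as simple lists of option labels; the function name and the sample data are hypothetical.

```python
from collections import Counter


def alternative_counts(upper_choices, lower_choices, alternatives):
    """Map each alternative to (times chosen by upper group, times chosen by lower group)."""
    upper = Counter(upper_choices)
    lower = Counter(lower_choices)
    # Counter returns 0 for alternatives that nobody in a group selected.
    return {alt: (upper[alt], lower[alt]) for alt in alternatives}


# Hypothetical item with alternatives a, b, c, where "a" is the keyed answer.
table = alternative_counts(["a", "a", "b"], ["b", "c", "c"], ["a", "b", "c"])
print(table)  # {'a': (2, 0), 'b': (1, 1), 'c': (0, 2)}
```

In this made-up table the keyed answer draws only upper scorers, while alternative c draws only lower scorers.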

is an item that favors one particular group of examinees in relation to another when differences in group ability are controlled.

Biased Test Item

Test Development Flashcards

(33 cards)