Test Development Flashcards

1
Q

It is the product of the thoughtful and sound application of established principles of test construction

A

Test development

2
Q

1st step of Test development

A

Test conceptualization (what, how, who, when, should?)

3
Q

Preliminary research surrounding the creation of a prototype of the test

A

Pilot study/research

4
Q

2nd step of test development

A

Test construction

5
Q

Process of setting rules for assigning numbers in measurement

A

scaling

6
Q

Credited for being at the forefront of efforts to develop methodologically sound scaling methods

A

L. L. Thurstone

7
Q

Type of scale that consists of a grouping of words, statements, or symbols on which judgments of the strength of a particular trait, attitude, or emotion are indicated by the testtaker

A

Rating scale

8
Q

A scale where the final score is obtained by summing the ratings across all items (e.g. Likert Scale)

A

Summative scale
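A minimal sketch of summative scoring, assuming a 1-5 rating scale. The function name and the optional reverse-keying step are illustrative assumptions, not part of the card.

```python
def summative_score(ratings, reverse_keyed=(), scale_min=1, scale_max=5):
    """Sum item ratings into one total score.

    reverse_keyed holds the positions of items worded in the opposite
    direction (an assumption added for illustration); their ratings are
    flipped before summing.
    """
    total = 0
    for index, rating in enumerate(ratings):
        if not scale_min <= rating <= scale_max:
            raise ValueError(f"rating {rating} is outside {scale_min}-{scale_max}")
        if index in reverse_keyed:
            rating = scale_min + scale_max - rating
        total += rating
    return total


responses = [4, 5, 2, 3]
print(summative_score(responses))                     # plain sum of the ratings
print(summative_score(responses, reverse_keyed={2}))  # third item flipped first
```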

9
Q

A scale where test takers are presented with pairs of stimuli which they are asked to compare

A

Method of paired comparison

10
Q

Entails sorting tasks and judgments of a stimulus in comparison with every other stimulus on the scale (e.g. sort items from most justifiable to least justifiable)

A

Comparative scaling (ordinal)

11
Q

Stimuli placed into one of two or more alternative categories that differ quantitatively with respect to some continuum

A

categorical scaling

12
Q

Respondents who agree with stronger statements of the attitude will also agree with the milder statements

A

Guttman scale (ordinal)
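The property on this card can be checked mechanically. The sketch below assumes the statements are already ordered from mildest to strongest and that each response is a simple agree/disagree; the function name is hypothetical.

```python
def is_guttman_consistent(responses):
    """Return True if no agreement follows a disagreement.

    responses: booleans ordered from the mildest to the strongest statement.
    In a consistent pattern, agreeing with a stronger statement implies
    agreement with every milder one.
    """
    seen_disagreement = False
    for agreed in responses:
        if agreed and seen_disagreement:
            return False
        if not agreed:
            seen_disagreement = True
    return True


print(is_guttman_consistent([True, True, False, False]))  # fits the pattern
print(is_guttman_consistent([True, False, True]))         # breaks the pattern
```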

13
Q

Item analysis procedure and approach to test development that involves a graphic mapping of a testtaker’s responses

A

Scalogram analysis

14
Q

Scaling method used to obtain data that are presumed to be interval in nature

A

Equal-appearing intervals (Thurstone)

15
Q

Reservoir or well from which items will or will not be drawn for the final version of the test

A

Item pool

16
Q

Parts of a multiple-choice item format question

A

stem (sentence)
correct option
distractors/foils

17
Q

Also called a short-answer item

A

Completion item

18
Q

Limitations of essay items

A

Focus on a limited area; subjectivity in scoring

19
Q

Relatively large and easily accessible collection of test questions

A

item bank

20
Q

Interactive, computer-administered test-taking process wherein the items presented to the testtaker are based in part on the testtaker’s performance on previous items

A

Computerized-adaptive testing (CAT)

21
Q

Ability of the computer to tailor the content and order of the presentation of test items on the basis of responses to previous items

A

Item branching
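A deliberately simplified sketch of item branching: a one-step staircase rule that moves to a harder difficulty level after a correct response and to an easier one after an incorrect response. This rule, the function names, and the idea of numbered difficulty levels are assumptions for illustration; operational CAT systems typically use more elaborate selection rules.

```python
def next_difficulty_level(current_level, was_correct, n_levels):
    """Pick the difficulty level for the next item.

    Levels run from 0 (easiest) to n_levels - 1 (hardest). The next item
    would be drawn from the unused items at the returned level.
    """
    step = 1 if was_correct else -1
    return max(0, min(n_levels - 1, current_level + step))


def branching_path(responses_correct, n_levels, start_level):
    """Return the sequence of difficulty levels visited during a session."""
    path = [start_level]
    for was_correct in responses_correct:
        path.append(next_difficulty_level(path[-1], was_correct, n_levels))
    return path


# Two correct answers push the level up; three misses bring it back down.
print(branching_path([True, True, False, False, False], n_levels=5, start_level=2))
```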

22
Q

Most commonly used scoring model

A

Cumulative scoring

23
Q

A type of scoring used by some diagnostic systems wherein individuals must exhibit a certain number of symptoms to qualify for a specific diagnosis

A

Class/categorical scoring
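A minimal sketch of class scoring as a symptom-count threshold. The symptom labels and the required count below are placeholders, not taken from any real diagnostic system.

```python
def meets_class_criterion(symptoms_present, required_count):
    """Return True when enough symptoms are present to place the person in the class.

    symptoms_present maps a symptom label to True/False.
    """
    observed = sum(1 for present in symptoms_present.values() if present)
    return observed >= required_count


# Hypothetical checklist: labels and threshold are illustrative only.
checklist = {"symptom_a": True, "symptom_b": False, "symptom_c": True, "symptom_d": True}
print(meets_class_criterion(checklist, required_count=3))
```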

24
Q

Compare testtaker’s score on one scale within a test to another scale within that same test

A

Ipsative scoring

25
3rd step in test development
test tryout
26
4th step in test development
Item analysis
27
Items that spur motivation and a positive test-taking attitude and lessen anxiety
Giveaway items
28
Percentage of people who said yes to, agreed with, or endorsed the item (rather than the percentage who passed it)
Item endorsement index
29
Range of the optimal item difficulty
0.3 (hard) to 0.8 (easy)
30
Formula for OID
(chance performance + 1.00) / 2
31
OID for true-false item
0.75 (chance=0.5)
32
OID for multiple choice item 4 options
0.63 (chance=0.25)
33
OID for multiple choice item 5 options
0.60 (chance=0.2)
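The three OID values above follow from taking the midpoint between chance performance and 1.00. A short sketch, assuming chance performance is 1 divided by the number of options (the function name is hypothetical):

```python
def optimal_item_difficulty(n_options):
    """Midpoint between chance success and a perfect score (1.00)."""
    if n_options < 1:
        raise ValueError("an item needs at least one option")
    chance = 1 / n_options
    return (chance + 1.0) / 2


for n_options in (2, 4, 5):
    print(n_options, optimal_item_difficulty(n_options))
```

For two, four, and five options this gives 0.75, 0.625, and 0.60; the card for four options reports 0.625 rounded to 0.63.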
34
Equal to the product of the item-score standard deviation and the correlation between the item score and the total test score
Item reliability index
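A sketch of the item reliability index as the product described above. Using the population standard deviation (`pstdev`) is an assumption; the card does not say which form of the SD is meant. The helper names are hypothetical.

```python
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mean_x, mean_y = mean(x), mean(y)
    covariance_sum = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariance_sum / (ss_x * ss_y) ** 0.5


def item_reliability_index(item_scores, total_scores):
    """Item-score SD multiplied by the item-total correlation."""
    return pstdev(item_scores) * pearson_r(item_scores, total_scores)


item = [1, 1, 0, 0]      # right/wrong on one item for four testtakers
totals = [10, 8, 4, 2]   # their total test scores
print(item_reliability_index(item, totals))
```

Passing criterion scores in place of total test scores would give the corresponding product for the item validity index.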
35
Item analysis techniques for questions with right/wrong answers
Item difficulty; item discrimination; distractor analysis
36
Item Analysis Techniques for either right/wrong answers or self-report scales
Item reliability index; Cronbach's alpha
37
Equal to the product of the item-score SD and the correlation between the item score and the criterion score
Item validity index
38
How adequately an item separates or discriminates between high scorers and low scorers on the entire test
Item discrimination index
39
What are the key properties of the Item-discrimination index?
- Symbolised by d
- Compares performance on a particular item by the high-ability group and the low-ability group (i.e., the top 27% and the bottom 27%)
- Items that discriminate well have a high positive value (to a maximum of 1)
- A negative d value is a red flag: it means low scorers are doing better on that item than high scorers
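A sketch of d under the usual upper-lower group definition, d = (U - L) / n, where U and L are the numbers of upper-group and lower-group testtakers answering the item correctly and n is the size of each group. The formula is not spelled out on the card, so treat it, the function name, and the tie handling (ties keep their original order) as assumptions.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Compute d from the top and bottom `fraction` of total scorers.

    item_correct: 1/0 (or True/False) per testtaker for the item.
    total_scores: total test score per testtaker, same order.
    """
    n_group = max(1, round(len(total_scores) * fraction))
    # Indices of testtakers ordered from lowest to highest total score.
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    lower, upper = order[:n_group], order[-n_group:]
    n_upper_correct = sum(item_correct[i] for i in upper)
    n_lower_correct = sum(item_correct[i] for i in lower)
    return (n_upper_correct - n_lower_correct) / n_group


totals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
item = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]   # only the higher scorers pass this item
print(discrimination_index(item, totals))
```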
40
The quality of each alternative within a multiple-choice item can be readily assessed with reference to the comparative performance of upper and lower scorers
Analysis of item alternatives (the test developer can get an idea of the effectiveness of a distractor by means of a simple eyeball test)
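The eyeball test can be supported by a simple tally of which option each group chose. The sketch below assumes the same top/bottom 27% split used for the discrimination index; the names and the sample data are hypothetical.

```python
from collections import Counter


def option_counts_by_group(choices, total_scores, fraction=0.27):
    """Tally chosen options separately for upper and lower scorers.

    choices: the option label each testtaker selected on one item.
    total_scores: total test score per testtaker, same order.
    Returns (upper_counts, lower_counts) as Counter objects.
    """
    n_group = max(1, round(len(total_scores) * fraction))
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    lower, upper = order[:n_group], order[-n_group:]
    return Counter(choices[i] for i in upper), Counter(choices[i] for i in lower)


totals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
choices = ["b", "b", "c", "a", "a", "a", "b", "a", "a", "a"]  # "a" is the keyed answer
upper_counts, lower_counts = option_counts_by_group(choices, totals)
print("upper:", dict(upper_counts))
print("lower:", dict(lower_counts))
```

A distractor that attracts mostly lower scorers is doing its job; one chosen mainly by upper scorers deserves a closer look.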
41
Graphic representation of item difficulty and item discrimination
Item characteristic curve (the steeper the slope, the greater the item discrimination)
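To see why a steeper slope means greater discrimination, here is a sketch that assumes a two-parameter logistic curve, with a as the discrimination parameter and b as the difficulty parameter. The card does not name a specific model, so this choice is an assumption.

```python
import math


def icc_probability(theta, a, b):
    """Probability of a correct response at ability level theta.

    a: discrimination (controls how steep the curve is around b).
    b: difficulty (the ability level where the probability is 0.5).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Same difficulty, different discrimination: compare just below and above b.
for a in (0.5, 2.0):
    below = icc_probability(-0.5, a, b=0.0)
    above = icc_probability(0.5, a, b=0.0)
    print(f"a={a}: rise across b = {above - below:.3f}")
```

The item with the larger a rises more over the same ability interval, which is what lets it separate testtakers just below b from those just above it.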
42
Test developer addresses the problem of guessing by including in the test manual...
- Explicit instructions regarding this point for the examiner to convey to the examinees (e.g., instruct them to answer only if certain)
- Specific instructions for scoring and interpreting omitted items
43
Can be used to identify biased items
item characteristic curves
44
Different shapes of item-characteristic curves for different groups when 2 groups do not differ in total test score
Differential item functioning
45
Rely primarily on verbal rather than mathematical procedures to explore how individual test items work
Qualitative item analysis (through group discussions, interviews)
46
Approach to cognitive assessment that entails having respondents verbalize thoughts as they occur
think aloud test administration (one-on-one basis)
47
Conducted during the test development process in which items are examined for fairness to all prospective testtakers and for the presence of offensive language, stereotypes or situations
Sensitivity review
48
last step in test development
test revision
49
Test revision in the life cycle of an existing test
APA suggests that an existing test be kept in its present form as long as it remains useful, but that it should be revised when significant changes in the domain represented, or new conditions of test use and interpretation, make the test inappropriate for its intended use
50
Revalidation of a test on a sample of testtakers other than those on whom test performance was originally found to be a valid predictor of some criterion
cross validation (key step in test development)
51
Decrease in item validities that inevitably occurs after cross-validation of findings
Validity shrinkage (is expected and integral to test development process)
52
Test validation conducted on two or more tests using the same sample of testtakers
co-validation (referred to as co-norming when norms are created or revised at the same time)
53
Examiners undergo training in test administration using the test manual
Quality assurance
54
A test protocol scored by a highly authoritative scorer that is designed as a model for scoring and a mechanism for resolving scoring discrepancies; ensures consistency in scoring
anchor protocol
55
A discrepancy between scoring in an anchor protocol and the scoring of another protocol
scoring drift
56
Evaluate how well an individual item is working to measure different levels of the underlying construct
IRT information curves
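A sketch of an item information curve, assuming the two-parameter logistic model, under which information at ability theta is a² · P(theta) · (1 - P(theta)). The model choice and the function names are assumptions; the card does not specify them.

```python
import math


def item_information(theta, a, b):
    """Item information at ability level theta under a 2PL model.

    a: discrimination parameter, b: difficulty parameter.
    """
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a ** 2 * p * (1.0 - p)


# Information is highest near the item's difficulty and falls off on either side.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(item_information(theta, a=1.5, b=0.0), 3))
```

Reading the curve this way shows at which trait levels an item measures well and where it contributes little.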
57
An item functions differently in one group of testtakers as compared to another group of testtakers known to have the same level of the underlying trait (groups defined by, e.g., culture, gender, age)
Differential item functioning (DIF)
58
Test developers scrutinize group-by-group item response curves looking for DIF items
DIF analysis
59
Items that respondents from different groups at the same level of the underlying trait have different probabilities of endorsing as a function of their group membership
DIF items
60
An advantage of the response format of the test
Great breadth (cover many topics)