writing and evaluating test items Flashcards
(37 cards)
How do you choose the format of items?
The choice of format follows from the objectives and purpose of the test
Item writing guidelines
- define clearly what you want to measure
- generate an item pool
- avoid long items (tedious to read)
- keep reading difficulty appropriate (education level)
- use clear and concise wording (avoid double-barreled items and double negatives)
- mix positively and negatively worded items in the same test
- make sure items are as culturally neutral as possible
- make content relevant to the purpose
How to write MCQ items
vary position of correct answer
all distractors must be plausible
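One way to vary the position of the correct answer is to shuffle the options each time an item is assembled. The sketch below is only an illustration: the function name `shuffle_options` and the sample item are hypothetical, not part of the original notes.

```python
import random

def shuffle_options(correct, distractors, rng):
    """Return the options in random order plus the index of the correct one."""
    options = [correct] + list(distractors)
    rng.shuffle(options)  # in-place shuffle, so the key does not always sit in slot A
    return options, options.index(correct)

# Hypothetical item: one correct answer and three plausible distractors.
rng = random.Random()  # pass random.Random(seed) for a reproducible order
options, key_index = shuffle_options("Paris", ["Lyon", "Nice", "Lille"], rng)
for letter, option in zip("ABCD", options):
    print(letter, option)
print("Correct answer is option", "ABCD"[key_index])
```

Shuffling handles position only; whether each distractor is plausible still has to be judged by the item writer.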
True/false Qs
True and false statements should be about the same length
Include roughly equal numbers of true and false statements
5 types of item format
- Dichotomous format
- Polytomous Format
- Likert format
- category format
- checklists and q sorts
Dichotomous format
- 2 alternatives
- True/False
- Yes/No
Dichotomous format advantages
- ease of administration
- quick scoring
- requires absolute judgement
Dichotomous format disadvantages
- less reliable (50% chance of getting an item correct by guessing; narrower range of scores for analysis)
- encourages memorisation
- often truth is not black/white
Polytomous format
more than 2 alternatives
e.g., multiple-choice questions (MCQs)
Polytomous format- distractors
- incorrect alternatives
- ideal to have 3-4 distractors to retain psychometric properties
- must be as plausible as the correct answer
- no cute distractors
- make the test more reliable
- but difficult to find good distractors
Polytomous format- advantages
- easy to administer and score
- requires absolute judgement
- less likely to guess correctly than a dichotomous test
Correction for guessing
Corrected score = R - W / (n - 1)
Number of right answers, minus the number of wrong answers divided by (the number of choices per item minus 1)
R = number of correct answers
W = number of wrong answers
n = number of alternatives per item
Omitted answers are excluded in this calculation
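The correction-for-guessing formula can be sketched as a small function. The function name and the example numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def corrected_score(right: int, wrong: int, n_alternatives: int) -> float:
    """Correction for guessing: R - W / (n - 1). Omitted items are not counted."""
    if n_alternatives < 2:
        raise ValueError("an item needs at least 2 alternatives")
    return right - wrong / (n_alternatives - 1)

# Hypothetical 40-item test with 4 alternatives per item:
# 30 right, 6 wrong, 4 omitted (the omissions are simply left out).
print(corrected_score(right=30, wrong=6, n_alternatives=4))  # 28.0
```

With 2 alternatives the penalty is one full point per wrong answer, so a test taker who guesses on every true/false item ends up near zero on average.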
Likert Format
Named after Likert, who first used it for attitude scales
- indicates degree of agreement
- 6-point scale (or even number of options) used to avoid the neutral response
- Reverse score negatively-worded items
- use statements
- popular for attitude and personality scales
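Reverse scoring of negatively worded items can be sketched as follows, assuming a 6-point scale coded 1 to 6. The function names, the item positions and the responses are hypothetical.

```python
def reverse_score(response: int, scale_min: int = 1, scale_max: int = 6) -> int:
    """Flip a response so that 1 becomes 6, 2 becomes 5, and so on."""
    if not scale_min <= response <= scale_max:
        raise ValueError("response is outside the scale")
    return scale_min + scale_max - response

def total_score(responses, reversed_items, scale_min=1, scale_max=6):
    """Sum the responses, reverse scoring the items whose index is in reversed_items."""
    total = 0
    for index, response in enumerate(responses):
        if index in reversed_items:
            response = reverse_score(response, scale_min, scale_max)
        total += response
    return total

# Hypothetical 4-item scale; the items at index 1 and 3 are negatively worded.
print(total_score([6, 2, 5, 1], reversed_items={1, 3}))  # 22
```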
Category Format
On a scale of 1 to 10…
Research suggests 7 categories work best
Category Format- disadvantages
- Tendency to spread responses across all categories
- Susceptible to the groupings of things being rated (context)
- Element of randomness
When is Category Format used?
- People are highly involved with a subject
E.g., asking people in townships to rate service delivery
More motivated to make a finer discrimination
- Want to measure the amount of something
E.g., road rage experienced in a given situation
- Make sure your endpoints are clearly defined
Visual analogue scale
Checklists
- Common in personality measures
- A list of adjectives, check which ones describe you best
Q-sorts
- place statements into piles
- piles indicate degree to which you think a statement describes a person/yourself
- category format implicit here
Item analysis
Item analysis is a general term used to describe a set of methods used to evaluate test items. Item difficulty and item discriminability are the most basic of these methods.
Item Difficulty
- the proportion of people who get a particular item correct
- the higher the value, the easier the item
- p = number who answered the item correctly / number who took the measure
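Item difficulty is a simple proportion, which can be sketched as below. The function name and the response data are hypothetical.

```python
from typing import Sequence

def item_difficulty(item_correct: Sequence[bool]) -> float:
    """Proportion of test takers who answered the item correctly (p)."""
    if not item_correct:
        raise ValueError("no responses for this item")
    return sum(item_correct) / len(item_correct)

# Hypothetical data: 10 test takers, 7 of whom got the item right.
responses = [True, True, False, True, True, True, False, True, False, True]
print(item_difficulty(responses))  # 0.7
```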
Optimum difficulty level (ODL)
-between 0.30 and 0.70
Example: MCQ test with 4 alternatives
-4 answer options, therefore chance = 0.25
-Half the distance between 100% and chance: (1.00 - 0.25)/2 = 0.375
-Add chance: 0.375 + 0.25 = 0.625
(Add chance because we require a difficulty level of at least chance)
-ODL = 0.625
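The worked example generalises to any number of alternatives. A minimal sketch, with a hypothetical function name:

```python
def optimum_difficulty(n_alternatives: int) -> float:
    """Midpoint between chance performance and 100% correct."""
    if n_alternatives < 2:
        raise ValueError("an item needs at least 2 alternatives")
    chance = 1 / n_alternatives
    return chance + (1.0 - chance) / 2

print(optimum_difficulty(4))  # 0.625 (four-alternative MCQ, as in the example)
print(optimum_difficulty(2))  # 0.75  (true/false item)
```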
exceptions to optimum difficulty level
- At times we need more difficult items e.g., selection process
- At times we need easier items e.g., special education
- At times we need to consider other factors e.g., boost morale
Item discriminability
Have those who did well on particular items also done well on the overall test?
Good item discriminability when:
People who do well on test overall get the item correct (and vice versa)
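The notes do not say how discriminability is calculated. One common approach, assumed here, is the extreme group method: compare the proportion who got the item correct in the top-scoring group with the proportion in the bottom-scoring group. The function name, the 27% group size and the data below are all assumptions for illustration.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Extreme group method: p(upper group) - p(lower group) for one item.

    fraction is the share of test takers placed in each extreme group
    (0.27 is an assumed convention, not taken from the notes).
    """
    n = len(total_scores)
    if n == 0 or n != len(item_correct):
        raise ValueError("need one item response per total score")
    k = max(1, round(n * fraction))  # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])  # lowest total first
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower

# Hypothetical data for 10 test takers: overall test score and whether
# each person got this particular item right (1) or wrong (0).
total_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
item_correct = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
print(round(discrimination_index(item_correct, total_scores), 2))  # 0.67
```

A value near 1 means high scorers pass the item and low scorers fail it; a value near 0 or below means the item does not separate the two groups.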