applied scale construction Flashcards
what does construct validity mean?
Does a test measure the construct that it claims to measure
Unscientific to presume validity - face validity can deceive
Cannot rely on authority - must prove case using standard rigorous steps
what is face validity?
when you assume a test is valid because it looks valid
may hide underlying invalid constructs
desirable, but not sufficient, and not even necessary
what are the preconditions that must be met before you statistically measure construct validity?
Preconditions for validity must be present first - these involve discrimination, reliability, and structure
Only then can validity itself be tested - this involves patterns of links to other constructs
what are the steps taken to statistically measure construct validity?
- Measure a construct statistically by giving it a score called the observed score
- If you measured it perfectly you would get a hypothetical score called the true score - no construct can ever be measured perfectly, so there is always some error/invalidity
- If the observed score and the true score were perfectly linked, you would have a perfect measurement (validity)
- If the link is mismatched/not perfect, you have an imperfect measurement (error/invalidity)
- Find the correlation between the observed score and the true score on a scale from 0 to 1
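A minimal sketch (not from the flashcards, with made-up numbers) of the observed-score/true-score idea: simulate true scores, add random error to get observed scores, then correlate the two to estimate validity.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 1000
true_score = rng.normal(50, 10, n)   # hypothetical true construct levels
error = rng.normal(0, 5, n)          # measurement error/invalidity
observed = true_score + error        # what the test actually records

# Validity coefficient: correlation of observed with true, on a 0 -> 1 scale
r = np.corrcoef(observed, true_score)[0, 1]
print(f"r = {r:.2f}")  # below 1 because of the error term
```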
what are the two sources of invalidity?
Systematic error - bias in a particular direction - caused by a particular thing you can identify
Random error - bias in no particular direction - caused by many different things you cannot identify
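To make the contrast concrete, a small sketch (assumed numbers): systematic error shifts everyone's score the same way, while random error scatters scores with no consistent direction.

```python
import numpy as np

rng = np.random.default_rng(1)
true_score = rng.normal(50, 10, 1000)

# Systematic error: an identifiable bias in one direction (e.g., leading
# wording inflating every score by 5 points)
systematic = true_score + 5

# Random error: unidentifiable noise with no consistent direction
random_err = true_score + rng.normal(0, 5, 1000)

print(f"{(systematic - true_score).mean():.1f}")  # ~5.0: consistent shift
print(f"{(random_err - true_score).mean():.1f}")  # ~0.0: no overall bias, just noise
```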
what are the 5 standard steps to achieve construct validity in questionnaires?
- Item design - coming up with the questions themselves - can be open-ended or closed-ended
- Item analysis - to achieve discrimination
- Reliability analysis - to achieve reliability
- Factor analysis - to achieve structure
- Scale validation - involves convergent and discriminant validation
item design: what types of items are included in a questionnaire?
Closed-ended - quantitative information, more common, more convenient, efficient, more top-down
Open-ended - qualitatively rich, respondents generate their own thoughts in response, must be coded to turn into numbers, labour-intensive, more bottom-up
item design: how to write good items?
- Avoid vague, complex, obscure expressions
- Avoid items that pull for biased responses
- Bear in mind what people can and can’t know
- Try to cover fully the construct of interest
item design: bad examples of items
The Oxford Capacity Analysis, distributed by the Church of Scientology
item design: what is scaling?
applying a particular type of number to a response on a questionnaire
used for closed-ended items
item design: how is an item scaled?
- Convert psychological content to a number
- Various methods of varying sophistication
- Interval-level measurement assumed
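For illustration (hypothetical labels and coding), a 5-point Likert scaling converts verbal responses into numbers that are then treated as interval-level:

```python
# Hypothetical 5-point Likert scaling: psychological content -> numbers
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "agree": 4,
    "strongly agree": 5,
}

responses = ["agree", "strongly agree", "neither agree nor disagree"]
scores = [LIKERT[r] for r in responses]

# Interval-level measurement assumed: the steps are treated as equally
# spaced, which is what makes averaging the scores meaningful
print(sum(scores) / len(scores))  # 4.0
```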
item design: what are the response options?
Decide how many or how few options to offer
More options capture more information - studies show that validity approaches its maximum at around 5-7 response options
item design: what is the impact of labelling on responses to questionnaires?
Label every response option to reduce ambiguity about what the options mean - standardising interpretation reduces unwanted variation
item design: what is the impact of neutral/uncertain response options on responses to questionnaires?
Can increase information capture, and therefore validity and accuracy
OR
Can increase laziness, and therefore decrease information capture, accuracy, and validity
SO: no overall benefit
item design: what is the impact of forward-scored and reverse-scored items on responses to questionnaires?
Forward- and reverse-scored items should be roughly equal in number
This reduces acquiescence bias
It also permits a test of whether the scale was completed seriously, by screening out inconsistent respondents
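A minimal sketch (hypothetical items and numbers) of reverse-scoring on a 5-point scale: a reverse-scored item is worded against the construct, so its raw response is flipped before summing.

```python
def reverse_score(raw: int, low: int = 1, high: int = 5) -> int:
    """Flip a Likert response so all items point the same way."""
    return (high + low) - raw  # on a 1-5 scale: 6 - raw

raw_responses = [5, 4, 1, 2]
print([reverse_score(r) for r in raw_responses])  # [1, 2, 5, 4]

# Screening idea: a respondent who gives 5 to "I am outgoing" and also 5 to
# "I am quiet" (reverse-scored) is answering inconsistently - acquiescence
```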
item analysis: discrimination in questionnaires and items
- An item should discriminate between different types of people, otherwise no individual differences are assessed
- Items eliciting the same response from everyone are useless
item analysis: how is variation measured statistically?
Desirable statistical features:
○ More dispersion –> higher SD - the SD of scores on a scale is a direct measure of the dispersion, variability, or variation in scores
○ Central average –> middling M - item scores tend to clump into a bundle around the mean
○ Symmetric distribution –> lower SKEW - skew is a direct measure of the asymmetry or imbalance in a distribution of scores
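A quick sketch (assumed data) of the three item-analysis statistics for a single 5-point item, using NumPy and SciPy:

```python
import numpy as np
from scipy.stats import skew

item_scores = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 5])  # hypothetical responses

print(f"M    = {item_scores.mean():.2f}")       # want: close to the midpoint (3)
print(f"SD   = {item_scores.std(ddof=1):.2f}")  # want: high (more dispersion)
print(f"skew = {skew(item_scores):.2f}")        # want: near 0 (symmetric)
```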
item analysis: what are the levels of discrimination?
○ Good distribution - broader spread of scores, mean is close to the scale midpoint, distribution is balanced
○ Bad distribution - narrower spread of scores, mean is lower than scale midpoint, distribution is positively skewed
item analysis: what is the ideal criteria when using a 5-point scale?
M between 2 and 4
SD greater than 1
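A quick sketch (hypothetical data and a hypothetical helper) of screening items against these criteria:

```python
import numpy as np

def item_passes(scores: np.ndarray) -> bool:
    """Keep 5-point items with a middling mean and enough spread."""
    m, sd = scores.mean(), scores.std(ddof=1)
    return 2 <= m <= 4 and sd > 1

good = np.array([1, 2, 3, 3, 4, 5, 5, 2, 4, 1])  # broad, balanced spread
bad = np.array([1, 1, 1, 2, 1, 1, 2, 1, 1, 1])   # floor effect: low M, low SD, skewed
print(item_passes(good), item_passes(bad))  # True False
```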
what is internal consistency?
consistency between items
what is test-retest reliability?
consistency over time
what is inter-rater reliability?
consistency between raters/scorers
what is scale reliability?
how consistently a scale produces similar results when measuring the same construct multiple times
reliability analysis: why does scale reliability matter?
Scale reliability matters because it is a precondition for validity
- A scale must be reliable/consistent to be valid, but a reliable scale can still be invalid due to systematic error
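Internal consistency is usually quantified with Cronbach's alpha; here is a minimal sketch (simulated respondents, not from the flashcards), where each item equals the underlying trait plus its own random error:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array, rows = respondents, columns = items."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(2)
trait = rng.normal(0, 1, 200)
# Four hypothetical items, each = trait + its own random error
items = np.column_stack([trait + rng.normal(0, 0.5, 200) for _ in range(4)])
print(f"alpha = {cronbach_alpha(items):.2f}")  # high: items hang together
```

Note that this only demonstrates consistency: a high alpha shows the items hang together, not that they measure the right construct, which is the reliable-but-invalid point above.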