Testing Flashcards

1
Q

Reliability vs Validity

A

Consistency of measurement of a given score; Estimate of the degree to which a test is free from measurement error

Degree to which a test actually measures what it is intended to measure

2
Q

Reliability evidence:
- Test-Retest
- Internal consistency
- Alternate form
- Interrater

A

Extent to which a measure produces consistent scores across time; Looks at the correlation between two sets of scores from the same sample across time
- Pearson’s r: Correlation coefficient ranging from -1 to 1; 0.8+ is good
- Affected by participant characteristics, practice effects, and the time interval
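
As a worked illustration of the test-retest idea above, here is a minimal Python sketch that computes Pearson's r between two administrations of the same test. The function name `pearson_r` and the scores are hypothetical, made up for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people at two time points.
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 13, 14, 19, 21]

r = pearson_r(time_1, time_2)
print(round(r, 2))  # about 0.98, above the 0.8 rule of thumb
```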

Extent to which individual items within a test measure the same domain or construct
- Split-half reliability coefficient: Split test items into 2 halves and correlate scores between the halves
- Cronbach’s alpha: Estimate based on all possible ways of splitting test items
- Affected by the # of domains (more = lower) and the # of items (more = higher)
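
The two internal-consistency estimates above can be sketched in Python as follows. This is an illustration under stated assumptions, not a standard implementation: the odd/even item split, the function names, and the response data are all hypothetical choices, and alpha is computed with the usual variance formula k/(k-1) × (1 − sum of item variances / variance of total scores).

```python
from math import sqrt
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half(scores):
    """Correlate totals on odd-position items with totals on even-position items.

    scores: one row per respondent, one column per item.
    """
    odd_half = [sum(row[0::2]) for row in scores]
    even_half = [sum(row[1::2]) for row in scores]
    return pearson_r(odd_half, even_half)


def cronbach_alpha(scores):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(scores[0])  # number of items
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 4 items.
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 3],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 3, 4],
]

half_r = split_half(scores)
alpha = cronbach_alpha(scores)
print(round(half_r, 2), round(alpha, 2))
```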

Consistency of test results between 2 different forms of a test
- Administer 2 versions of the test and calculate the correlation between scores

Degree of consensus between different raters in scoring test items
- Percent agreement: For nominal data (classifications/ratings); Percentage of items raters agree on (75%+ is good)
- Cohen’s kappa: Percentage of items raters agree on, corrected for agreement that occurs by chance
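
To make the difference between the two interrater statistics concrete, here is a small Python sketch for two raters. The function names and ratings are hypothetical; kappa is computed as (observed agreement − chance agreement) / (1 − chance agreement), where chance agreement comes from each rater's category proportions.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters give the same rating."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each rated independently at their own base rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    # Note: undefined (division by zero) if expected agreement is exactly 1.
    return (observed - expected) / (1 - expected)


# Hypothetical yes/no classifications of the same 8 items by two raters.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(agreement, kappa)  # 0.75 0.5
```

In this made-up example the raters agree on 75% of items, but half of that agreement is expected by chance alone, so kappa is noticeably lower than the raw percentage.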

3
Q

What are the interpretations for a test w/
High internal, test-retest, interrater
Low internal, high test-retest + interrater
Low test-retest, high internal + interrater

A

An ideal test for most purposes

Scores reflect a test w/ heterogeneous item content; BUT scores are based on items that are measuring something other than the construct the test is designed to measure

Scores reflect a test measuring fluctuating ability; BUT scores are too vulnerable to the effects of normal variability and time

4
Q

Tripartite model of validity:
- Content-related evidence
- Construct-related evidence (Convergent vs divergent)
- Criterion-related evidence (Concurrent vs predictive)

A

Extent to which a test covers full range of construct
- Subject matter experts review items’ relevance w/ content validity ratio (-1 to +1)
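
The card does not state a formula for the content validity ratio. Assuming it refers to Lawshe's CVR, which has the same -1 to +1 range, the ratio for one item is (n_e − N/2) / (N/2), where n_e is the number of experts rating the item essential and N is the total number of experts. A minimal Python sketch with a hypothetical function name:

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR for one item (assumed to be the ratio the card means).

    n_essential: number of experts rating the item "essential"
    n_experts:   total number of experts on the panel
    """
    half = n_experts / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 subject matter experts.
print(content_validity_ratio(10, 10))  # 1.0  (everyone says essential)
print(content_validity_ratio(5, 10))   # 0.0  (exactly half say essential)
print(content_validity_ratio(0, 10))   # -1.0 (no one says essential)
```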

Extent to which a test measures theoretical construct
- Convergent: Extent to which scores positively correlate w/ existing measures of the SAME construct
- Divergent: Extent to which scores do not correlate w/ measures of DISSIMILAR constructs (correlation should not be higher than 0.7)

Extent to which a test accurately predicts/correlates w/ specific criterion/outcome
- Concurrent: Extent to which a test correlates w/ a criterion that is measured at THE SAME TIME
- Predictive: Extent to which a test correlates w/ a criterion that is measured at SOME POINT IN THE FUTURE
