Week 2 - Score Normalisation and Reliability Flashcards
(17 cards)
‘What’ and ‘Why’ of psychological measurement
What - quantifying behaviour, attitudes, feelings to make inferences about constructs (unobservable attribute)
Why - assessment must be objective, thus tests use standardisation to avoid bias
Score normalisation
Raw score - not meaningful without a comparison point (e.g. criterion-referenced (pass mark), norm-referenced (relative to others))
Derived score - transforming raw score to find someone’s relative position to normative sample (percentiles and z-scores)
Percentiles
Percentage of people in the normative sample who fall below a particular raw score
(data points below / total values) x 100
P50 (median), P25 (first quartile), P75 (third quartile)
Advantages - easy comparison, easy to understand, universal
Limitation - inequality of units
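A minimal sketch of the percentile-rank formula above; the normative sample values are made up for illustration:

```python
def percentile_rank(scores, raw):
    # (data points below / total values) x 100
    below = sum(1 for s in scores if s < raw)
    return below / len(scores) * 100

norm_sample = [10, 12, 15, 15, 18, 20, 22, 25, 30, 35]  # hypothetical norms
print(percentile_rank(norm_sample, 20))  # 50.0 -> P50, the median
```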
Standard scores (z)
Measure of how extreme a score is relative to normative sample (in SD)
z = (X - M)/SD
Universally applicable - can be calculated for anything if M and SD are known
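A quick sketch of the z formula; the M = 100, SD = 15 norms are just illustrative values:

```python
def z_score(x, m, sd):
    # z = (X - M) / SD: distance from the normative mean in SD units
    return (x - m) / sd

print(z_score(130, 100, 15))  # 2.0 -> two SDs above the mean
```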
Transformations of z-scores
T-score (0-100) = (z x 10) + 50
Sten score (1-10) = (z x 2) + 5.5
Deviation IQ (25-175) = (z x 15) + 100
Stanine (1-9) = normal distribution divided into nine bands with fixed category percentages (4, 7, 12, 17, 20, 17, 12, 7, 4%)
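The same transformations as code, using the corrected deviation IQ formula (z x 15) + 100; the stanine conversion uses the standard round(2z + 5) rule, clipped to the 1-9 band:

```python
def t_score(z):        # T-score band 0-100
    return z * 10 + 50

def sten(z):           # sten band 1-10
    return z * 2 + 5.5

def deviation_iq(z):   # deviation IQ band 25-175
    return z * 15 + 100

def stanine(z):        # nine bands: round(2z + 5), clipped to 1-9
    return max(1, min(9, round(z * 2 + 5)))

print(t_score(1.0), sten(1.0), deviation_iq(1.0), stanine(1.0))  # 60.0 7.5 115.0 7
```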
Normalised standard scores
Scores are only comparable if they come from similarly shaped distributions.
A distribution can be normalised to force comparability
Raw score → percentile → normal-curve frequency table → normalised z-score
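One way to sketch that pipeline, substituting Python's built-in inverse normal CDF for a printed frequency table; the midpoint correction is one common convention to avoid percentiles of exactly 0 or 100:

```python
from statistics import NormalDist

def normalised_z(raw, norm_sample):
    below = sum(1 for s in norm_sample if s < raw)
    pct = (below + 0.5) / len(norm_sample)   # midpoint-corrected percentile
    return NormalDist().inv_cdf(pct)         # z under a forced normal curve

norm_sample = [10, 12, 15, 15, 18, 20, 22, 25, 30, 35]  # hypothetical norms
print(round(normalised_z(20, norm_sample), 2))
```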
Specificity of Norms
Norms (M, SD) specific to population they are derived from
Problems - WEIRD samples (Western, Educated, Industrialised, Rich, Democratic), lack of Australian norms
Test reliability
A good test is reliable (reproducible) and valid (measuring what is intended)
Reliability - rxx (correlation between scores on two administrations of test)
True score theory
People have a ‘true score’, but we can never measure it directly because of measurement error
If we administered an infinite number of tests - the mean of the distribution is the ‘true score’, the SD of the distribution is the standard error of measurement (SEm)
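A small simulation of this idea, with a hypothetical true score of 50 and SEm of 4: the mean of many administrations converges on the true score, and their SD on SEm:

```python
import random
from statistics import mean, stdev

true_score, sem = 50, 4                      # hypothetical values
observed = [true_score + random.gauss(0, sem) for _ in range(100_000)]
print(round(mean(observed), 1))              # ~50: mean recovers the true score
print(round(stdev(observed), 1))             # ~4: SD of the distribution is SEm
```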
Reliability and true scores
Individual (observed score (x) = true score (t) + error (e))
Sample (observed score variance (s2x) = true score variance (s2t) + error variance (s2e))
Reliability (rxx) = s2t/s2x
Error variance proportion (s2e/s2x) = 1 - rxx
Thus, reliability is the proportion of observed score variance that is due to true score variance
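A worked sketch with made-up variance components, showing rxx as the true-score share of observed variance:

```python
s2_true, s2_error = 80.0, 20.0      # hypothetical variance components
s2_obs = s2_true + s2_error         # s2x = s2t + s2e
rxx = s2_true / s2_obs              # reliability: true-score share of variance
error_prop = 1 - rxx                # proportion of variance that is error
print(round(rxx, 2), round(error_prop, 2))  # 0.8 0.2
```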
Test-retest
Exact same test done on two occasions
Error variance - changes in conditions and test-takers between administrations (environmental threats, time-related factors, order effects)
Alternate forms
Two versions of test administered to same people (immediate or delayed)
Error variance - differences in content, time sampling (delay)
Internal consistency
Split-half reliability
- One test administered but split into two halves
- Error variance - content sampling
- Problems - deciding on the split, timed tests, halving the test already reduces reliability (see the Spearman-Brown sketch below)
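A sketch of an odd-even split-half estimate; the final line applies the Spearman-Brown step-up formula, the standard correction for the halving problem noted above. The data layout (one row per person, one column per item) is an assumption for illustration:

```python
from statistics import correlation  # Python 3.10+

def split_half_reliability(data):
    # data: one list of item scores per person
    odd = [sum(person[0::2]) for person in data]    # odd-numbered items
    even = [sum(person[1::2]) for person in data]   # even-numbered items
    r_half = correlation(odd, even)                 # reliability of half a test
    return 2 * r_half / (1 + r_half)  # Spearman-Brown step-up to full length
```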
Cronbach’s Alpha and Kuder-Richardson
- One test administered; every item is correlated with every other item and the mean of these correlations taken (equivalent to the mean of all possible split-half reliabilities)
- Error variance - content sampling, heterogeneity of behaviour domain
CA - for items scored on a scale with more than two options
KR-20 - for dichotomous (e.g. true/false) items
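A minimal alpha sketch from the standard variance form, alpha = k/(k-1) x (1 - sum of item variances / total variance); with 0/1 items and population variances this reduces to KR-20. Data layout assumed as above:

```python
from statistics import pvariance

def cronbach_alpha(data):
    # data: one list of item scores per person
    k = len(data[0])                                       # number of items
    item_vars = sum(pvariance(col) for col in zip(*data))  # per-item variances
    total_var = pvariance([sum(person) for person in data])
    return (k / (k - 1)) * (1 - item_vars / total_var)
```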
Inter-rater reliability
Two raters give scores for an individual on a subjectively scored test
Error variance - differences between raters
Additive sources of error variance
Error variances from different reliability estimates can be added together to estimate total error variance (as long as each estimate reflects a different source of error)
E.g. delayed alternate form (time + content) + inter-rater
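The worked arithmetic for that example, with hypothetical reliability coefficients:

```python
r_delayed_alt = 0.85   # delayed alternate forms: time + content error
r_interrater = 0.90    # inter-rater: scorer error

total_error = (1 - r_delayed_alt) + (1 - r_interrater)  # 0.15 + 0.10 = 0.25
print(round(1 - total_error, 2))  # 0.75 of observed variance left as true score
```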
Sample impact on reliability
rxx is affected by the spread of individual differences in a group
rxx decreases as homogeneity increases (scores are similar, so true score variance shrinks and a larger share of observed variance is error)
How high does rxx need to be?
Should be > .8
Nunnally’s heuristic:
- 0.5 for test development
- 0.7 for test in research
- 0.9 for individual assessment