Chapters 2, 3, 4, Test Development and Psychometrics Flashcards

1
Q

What are the four basic types of measurement scales?

A

nominal, ordinal, interval, and ratio

2
Q

the most elementary of the measurement scales, involves classifying by name based on characteristics of the person or object being measured.

A

nominal scale

3
Q

provides a measure of magnitude and, thus, it often provides more information than does a nominal scale.

A

ordinal scale

4
Q

units are in equal intervals; thus, a difference of 5 points between 45 and 50 represents the same amount of change as a difference of 5 points between 85 and 90.

A

Interval scale

5
Q

Has the same properties as an interval scale, with the addition of a meaningful (true) zero.

A

ratio scale

6
Q

an individual’s score is compared with scores of other individuals who have taken the same test

A

norm-referenced instrument

7
Q

an individual’s score is compared with an established standard or criterion

A

criterion-referenced instrument

8
Q

predetermined cutoff score indicates whether the person has attained an established level

A

Mastery component

9
Q

converting scores into a _________ helps you understand how your score compares with the scores of others who also took the test

A

Frequency distribution
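
A frequency distribution can be tallied in a few lines. The sketch below is only an illustration: the function name `frequency_distribution` and the list of raw scores are hypothetical and do not come from the flashcards.

```python
from collections import Counter

def frequency_distribution(scores):
    """Return (score, frequency) pairs sorted from lowest to highest score."""
    counts = Counter(scores)
    return sorted(counts.items())

# Hypothetical raw scores for eight test takers.
scores = [70, 85, 85, 90, 70, 85, 100, 90]
table = frequency_distribution(scores)

for score, freq in table:
    print(score, freq)
```

Reading down the table shows at a glance where a given score sits relative to everyone else's.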

10
Q

often used in assessment because this graphic representation makes the data easier to understand (x-axis horizontal, y-axis vertical)

A

Frequency polygon

11
Q

What are the three measures of central tendency that are useful in interpreting an individual’s results on an instrument?

A

Mode
Median
Mean
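
As a quick illustration, Python's standard `statistics` module computes all three measures. The score list is made up for this sketch.

```python
import statistics

# Hypothetical raw scores; not from the flashcards.
scores = [2, 3, 3, 5, 12]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # middle score once the list is sorted
mean = statistics.mean(scores)      # arithmetic average

print(mode, median, mean)
```

Here the mode and median are both 3 while the mean is 5, because the single high score of 12 pulls the average upward.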

12
Q

is the most frequent score in a distribution - highest frequency of any of the scores

A

Mode

13
Q

the score at which 50% of the people had a score below it and 50% of the people had a score above it.

A

Median

14
Q

The arithmetic average of the scores

A

Mean

15
Q

Provides a measure of the spread of scores and indicates the variability between the highest and the lowest scores. _________ is calculated by simply subtracting the lowest score from the highest score.

A

Range
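
The subtraction described on this card, as a minimal sketch with made-up scores (the helper name `score_range` is hypothetical):

```python
def score_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)

# Hypothetical scores.
scores = [55, 62, 70, 88, 93]
print(score_range(scores))  # 93 - 55 = 38
```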

16
Q

To avoid the problem of the deviations from the mean summing to zero, square the deviations, add them together, and divide by the number of scores

A

Variance or mean square deviation
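
The steps on this card can be sketched as follows. The data and the function name `population_variance` are hypothetical. Dividing by the number of scores, as the card describes, gives the population variance; many textbooks divide by n - 1 instead when estimating from a sample.

```python
def population_variance(scores):
    """Mean of the squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    deviations = [x - mean for x in scores]
    # The raw deviations always cancel out, which is why they are squared.
    assert abs(sum(deviations)) < 1e-9
    squared = [d ** 2 for d in deviations]
    return sum(squared) / len(scores)

# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(scores))  # 32 / 8 = 4.0
```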

17
Q

The square root of the variance is the ___________

A

standard deviation
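
A short check of this relationship, using the standard `statistics` module and made-up scores:

```python
import math
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(scores)  # population variance
sd = math.sqrt(variance)                 # standard deviation

print(variance, sd)
```

The result matches `statistics.pstdev(scores)`, which computes the population standard deviation directly.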

18
Q

scores on an instrument often fall into a ______

A

normal distribution or normal curve (bell-shaped)

19
Q

A ________ is one in which the majority of scores are at the lower end of the distribution.

A

positively skewed distribution

20
Q

Where the majority of scores are on the higher end of the distribution

A

negatively skewed distribution
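
One rough way to see the direction of skew is to compare the mean with the median: a long tail of high scores pulls the mean above the median (positive skew), and a long tail of low scores pulls it below (negative skew). The sketch below uses that rule of thumb, which is not a formal skewness statistic; the function name and both data sets are hypothetical.

```python
import statistics

def skew_direction(scores):
    """Rule-of-thumb skew check: compare the mean with the median."""
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    if mean > median:
        return "positive"
    if mean < median:
        return "negative"
    return "none detected"

# Most scores low, one very high score (hypothetical).
low_heavy = [50, 52, 53, 55, 56, 58, 95]
# Most scores high, one very low score (hypothetical).
high_heavy = [15, 82, 84, 85, 87, 88, 90]

print(skew_direction(low_heavy))
print(skew_direction(high_heavy))
```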

21
Q

What are three examples of standard scores?

A

z scores
T scores
Stanines
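
The three standard scores are linked by simple conversions. The sketch below assumes the conventional definitions: a z score has mean 0 and standard deviation 1, a T score has mean 50 and standard deviation 10, and stanines run from 1 to 9 with mean 5 and standard deviation of about 2. The function names, the example numbers, and the handling of scores that fall exactly on a stanine boundary are assumptions of this sketch.

```python
import math

def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale z to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z

def stanine(z):
    """Half-SD-wide bands centered on 5, clipped to the range 1..9."""
    band = math.floor(2 * z + 5.5)
    return max(1, min(9, band))

# Hypothetical test: mean 50, standard deviation 10, raw score 65.
z = z_score(65, mean=50, sd=10)
print(z, t_score(z), stanine(z))
```

A raw score 1.5 standard deviations above the mean corresponds to z = 1.5, T = 65, and a stanine of 8.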

22
Q

What are the three major theories related to reliability?

A

classical test theory
generalizability theory
item response theory

23
Q

Based on the degree to which there is error within the instrument. _________ suggests that every score has two hypothetical components: a true score and an error component

A

Classical test theory
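
The true-score model can be illustrated with a toy data set. In practice true scores and errors are never observed directly, so the numbers below are purely hypothetical and chosen so that the errors are uncorrelated with the true scores. Under that assumption the observed variance is the sum of the true and error variances, and reliability is the share of observed variance that is true variance.

```python
import statistics

# Hypothetical true scores and errors for four people.
true_scores = [40, 40, 60, 60]
errors = [-2, 2, -2, 2]

# Observed score = true score + error.
observed = [t + e for t, e in zip(true_scores, errors)]

var_true = statistics.pvariance(true_scores)
var_error = statistics.pvariance(errors)
var_observed = statistics.pvariance(observed)

# Reliability as the ratio of true-score variance to observed variance.
reliability = var_true / var_observed
print(var_true, var_error, var_observed, round(reliability, 3))
```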

24
Q

_______ means there is a system: methods are planned, orderly, and methodical

A

Systematic

25
Q

Means lack of a system: occurrences are presumed to be unsystematic (e.g., not consistent, such as a coffee spill on just one person's test)

A

Random error

26
Q

Estimates the range of scores a person would obtain if they took the test over and over again.

A

Standard error of measurement

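
A common textbook formula computes the standard error of measurement (SEM) from the test's standard deviation and its reliability coefficient: SEM = SD × √(1 − r). The formula is not stated on the card, and the numbers below are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test with SD 15 and reliability .91.
sem = standard_error_of_measurement(sd=15, reliability=0.91)

# Band of one SEM on either side of an observed score of 100.
observed = 100
band = (observed - sem, observed + sem)
print(round(sem, 2), [round(b, 2) for b in band])
```

With these values the SEM is about 4.5, so the band runs from roughly 95.5 to 104.5. A band of one SEM either side is usually read as covering about 68% of the scores the person would obtain on repeated testing.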
27
Q

When there is a long period between the administrations of the instrument

A

Coefficient of stability

28
Q

Measures how consistently the items within the test measure a particular construct

A

Internal consistency

29
Q

Helps measure internal consistency by splitting the test in half and comparing scores on the two halves, which should be consistent

A

Split-half reliability

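
One common way to carry out a split-half analysis is to total the odd-numbered and even-numbered items separately for each person and correlate the two sets of totals. Because each half is only half as long as the full test, the Spearman-Brown formula is often applied afterward to estimate full-length reliability; that correction is an addition here and is not mentioned on the card. The response matrix and function name are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical right/wrong (1/0) answers: one row per person, eight items.
responses = [
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 0],
]

odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, 7
even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, 8

r_half = pearson_r(odd_totals, even_totals)
# Spearman-Brown estimate of reliability for the full-length test.
r_full = 2 * r_half / (1 + r_half)
print(round(r_half, 2), round(r_full, 2))
```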
30
Q

Measurement of consistency between test scores assigned by different test scorers.

A

Interscorer (interrater) reliability

31
Q

Two forms of the test are given to the same people, and the two sets of scores are compared to estimate reliability.

A

Parallel forms

32
Q

Norming groups. Created through:

A

Simple random sample: every person in the population has the same chance of being sampled.
Stratified sample: used in assessments where the test developer matches the percentages of the population in terms of ethnic group, geographic location, and socioeconomic status.
Cluster sample: the sample is created from groups, e.g., a sample for a child achievement test obtained through random selection of clusters such as schools.

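
The three sampling approaches can be sketched with Python's `random` module. Everything named here (the functions, the population of 100 people tagged by region, the three schools) is hypothetical, and the stratified version uses simple proportional allocation, which is only one of several ways to build a stratified norming group.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has the same chance of being chosen."""
    return rng.sample(population, n)

def stratified_sample(population, stratum_of, n, rng):
    """Sample each stratum in proportion to its share of the population."""
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from n
        # when the proportions do not divide evenly.
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60 urban, 30 suburban, 10 rural.
population = (
    [(i, "urban") for i in range(60)]
    + [(i, "suburban") for i in range(60, 90)]
    + [(i, "rural") for i in range(90, 100)]
)
simple = simple_random_sample(population, 20, rng)
stratified = stratified_sample(population, lambda p: p[1], 20, rng)

# Hypothetical schools acting as clusters.
schools = {
    "School A": ["a1", "a2", "a3"],
    "School B": ["b1", "b2", "b3"],
    "School C": ["c1", "c2", "c3"],
}
clustered = cluster_sample(schools, 2, rng)
```

With a sample of 20, the stratified draw always contains 12 urban, 6 suburban, and 2 rural members, mirroring the 60/30/10 split of the population, whereas the simple random draw only matches those proportions on average.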
33
Q

Consistency of a measure.

A

Reliability

34
Q

Typically Pearson r. Estimates a test's reliability. Ranges from -1 to 1, where 1 or -1 is a perfect relationship between the variables with no measurement error. The closer to 1 or -1, the better; 0 indicates there is no association between the variables.
Small association: .1 to .3 or -.1 to -.3
Medium association: .3 to .5 or -.3 to -.5
Strong association: .5 to 1.0 or -.5 to -1.0
Correlation is NOT causation!

A

Reliability coefficient

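
A minimal sketch of computing Pearson r for two administrations of the same test and labeling its strength with the cutoffs from this card. The scores and function names are hypothetical. The card's bands share their endpoints (.3 and .5), so this sketch assigns a boundary value to the higher band and calls anything below .1 "negligible"; both choices are assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label the size of r using the card's cutoffs (sign is ignored)."""
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"

# Hypothetical scores for five people tested twice.
first = [10, 12, 14, 16, 18]
second = [11, 13, 14, 17, 19]

r = pearson_r(first, second)
print(round(r, 2), strength(r))
```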
35
Q

The extent to which a test measures what it’s meant to measure.

A

Validity

36
Q

How well the test or assessment evaluates all aspects of the topic, construct or behaviour

A

Content-related validity

37
Q

Extent to which a test or instrument is related to or predicts an outcome (e.g., the SAT as a predictor of college performance)

A

Criterion-related validity

38
Q

Extent to which a test or instrument measures or reflects the intended construct (such as intelligence).

A

Construct validity

39
Q

Allows us to predict values for a response.

A

Regression (regression analysis is used to estimate the slope)

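
To make the idea concrete, here is a minimal least-squares sketch for a single predictor: the slope is the sum of cross-products of deviations divided by the sum of squared deviations of the predictor, and the intercept places the line through the two means. The data and function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def predict(x_new, slope, intercept):
    """Predicted response for a new predictor value."""
    return intercept + slope * x_new

# Hypothetical predictor (x) and response (y) values.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
print(round(slope, 2), round(intercept, 2), round(predict(6, slope, intercept), 2))
```

For these values the fitted line has a slope of about 0.6 and an intercept of about 2.2, so the predicted response at x = 6 is about 5.8.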