# Test Construction Flashcards

1
Q

When using % agreement for inter-rater reliability, one problem could be that

A

it overestimates reliability due to chance agreement
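
To illustrate the point, here is a minimal sketch comparing percent agreement with Cohen's kappa, which corrects for chance agreement. The function names and the ratings are hypothetical and chosen only for illustration.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which both raters assign the same category."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when expected chance agreement p_e equals 1.
    """
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Expected chance agreement: product of the two raters' marginal
    # proportions for each category, summed over categories.
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of ten cases by two raters
rater_a = ["yes", "yes", "yes", "yes", "no", "no", "yes", "yes", "yes", "no"]
rater_b = ["yes", "yes", "yes", "no", "no", "yes", "yes", "yes", "yes", "no"]

print(round(percent_agreement(rater_a, rater_b), 2))
print(round(cohens_kappa(rater_a, rater_b), 2))
```

With these made-up ratings the raters agree on 8 of 10 cases, but because both say "yes" most of the time, much of that agreement is expected by chance, so kappa comes out noticeably lower than the raw percentage.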

2
Q

When a test is cross-validated by administering it to a second sample, ____ may occur: the validity coefficient is likely to be smaller with the second sample than with the first.

A

shrinkage

3
Q

The multitrait-multimethod matrix is important for evaluating a test’s

A

construct validity

4
Q

The Kuder-Richardson Formula 20 (KR-20) can be used to estimate a test’s ____________ reliability when test items are scored dichotomously.

A

internal consistency
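
As a rough sketch of how KR-20 could be computed, the snippet below uses the usual form k/(k-1) × (1 − Σpq / variance of total scores). The function name and the response matrix are hypothetical, and the population variance (dividing by N) is an assumption; some texts use the sample variance instead.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for dichotomous items.

    responses: list of examinee rows, each a list of 0/1 item scores.
    """
    n = len(responses)
    k = len(responses[0])
    # p = proportion of examinees answering each item correctly; q = 1 - p
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Variance of examinees' total scores (population form, an assumption)
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_pq / total_variance)


# Hypothetical data: 4 examinees, 3 dichotomously scored items
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(data))
```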

5
Q

A test’s reliability coefficient can range from

A

0 to 1

.90 = required for many high-stakes tests
.70 = required for many other tests

6
Q

Formula for standard error of measurement

A

SEM = SD × √(1 − reliability coefficient)
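
A worked example may help. The symbols are assumptions of this sketch: $SD$ is the standard deviation of the test scores and $r_{xx}$ is the reliability coefficient, with hypothetical values $SD = 15$ and $r_{xx} = .91$.

$$
SEM = SD\sqrt{1 - r_{xx}} = 15\sqrt{1 - .91} = 15\sqrt{.09} = 15(.30) = 4.5
$$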

7
Q

You would use which of the following to construct a confidence interval around an examinee’s predicted criterion score?

A

Standard error of estimate
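
A minimal sketch of that use, assuming the common formula SD of the criterion × √(1 − validity coefficient²) and a normal-curve multiplier z. The function names, parameter names, and numbers are hypothetical.

```python
import math


def standard_error_of_estimate(sd_criterion, validity):
    """SD of the criterion times sqrt(1 - r_xy squared)."""
    return sd_criterion * math.sqrt(1 - validity ** 2)


def predicted_score_interval(predicted, sd_criterion, validity, z=1.96):
    """Confidence interval around a predicted criterion score.

    z = 1.96 gives an approximate 95% interval; z = 1.0 gives about 68%.
    """
    see = standard_error_of_estimate(sd_criterion, validity)
    return predicted - z * see, predicted + z * see


# Hypothetical values: criterion SD of 10, validity coefficient of .60
see = standard_error_of_estimate(10, 0.60)
low, high = predicted_score_interval(50, 10, 0.60)
print(round(see, 2), round(low, 2), round(high, 2))
```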

8
Q

Murray’s theory of personality, which describes personality as being the result of internal and external forces - resulted in the development of which personality test

A

Thematic Apperception Test (TAT)

9
Q

construct versus content validity

A

construct: does the test measure the theoretical construct (trait) it is intended to measure
content: does the test adequately sample the full content domain of the ability it covers

10
Q

the proportion of variability in obtained test scores that’s due to true score variability

A

reliability coefficient

11
Q

Spearman-Brown prophecy formula is used to

A

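correct the split-half reliability coefficient: splitting the test yields two half-length tests, so the uncorrected coefficient underestimates the reliability of the full-length test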
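
The correction can be sketched as follows, using the general Spearman-Brown form n·r / (1 + (n − 1)·r), where n is the factor by which the test is lengthened (n = 2 for a split-half correction). The function and parameter names are hypothetical.

```python
def spearman_brown(reliability, length_factor=2):
    """Estimate reliability after changing test length by length_factor.

    With length_factor=2 this steps a split-half coefficient up to an
    estimate for the full-length test.
    """
    n = length_factor
    return (n * reliability) / (1 + (n - 1) * reliability)


# Hypothetical split-half correlation of .60 between the two halves
corrected = spearman_brown(0.60)
print(round(corrected, 2))
```

In this example a split-half correlation of .60 is stepped up to roughly .75 for the full-length test.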

12
Q

Cronbach’s alpha and KR-20 measure:

Whereas Cohen’s kappa and Kendall’s coefficient of concordance measure:

A

Cronbach’s alpha and KR-20 measure: internal consistency

Cohen’s kappa and Kendall’s coefficient of concordance measure: inter-rater reliability

13
Q

Percentage scores (40/80 correct answers = 50%) and expectancy tables are examples of

A

criterion-referenced scores

14
Q

Another name for % variability accounted for

A

coefficient of determination
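
As a short worked example, the coefficient of determination is the squared correlation. Assuming a hypothetical validity coefficient of $r_{xy} = .60$:

$$
r_{xy}^2 = (.60)^2 = .36
$$

so about 36% of the variability in the criterion is accounted for by the predictor.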

15
Q

a method for developing personality inventories in which the items (presumed to measure one or more traits) are created and then administered to a criterion group of people known to possess a certain characteristic (e.g., antisocial behavior, significant anxiety, exaggerated concern about physical health) and to a control group of people without the characteristic. Only those items that demonstrate an ability to distinguish between the two groups are chosen for inclusion in the final inventory.

A

empirical criterion keying

16
Q

The standard error of measurement is used to:

A

calculate a confidence interval around an examinee’s obtained score
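
A minimal sketch of that calculation, assuming the interval is built as obtained score ± z × SEM with SEM = SD × √(1 − reliability). The function names and the example values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)


def obtained_score_interval(score, sd, reliability, z=1.96):
    """Confidence interval around an obtained score.

    z = 1.0 gives roughly a 68% interval, z = 1.96 roughly 95%.
    """
    sem = standard_error_of_measurement(sd, reliability)
    return score - z * sem, score + z * sem


# Hypothetical values: obtained score 100, SD 15, reliability .91
low68, high68 = obtained_score_interval(100, 15, 0.91, z=1.0)
low95, high95 = obtained_score_interval(100, 15, 0.91)
print(round(low68, 2), round(high68, 2))
print(round(low95, 2), round(high95, 2))
```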