Test Construction Flashcards

1
Q

Reliability

A

Consistency or repeatability of scores, expressed as the reliability coefficient rxx (range 0 to 1.0). An rxx of .85 means 85% of score variance is reliable (true-score) variance and 15% is error. 0.8 is the minimum for acceptability.
o Test-retest: correlate first and second scores (r)
o Alternate forms: correlate Form A and Form B
o Internal consistency
§ Split-half (correlate odd and even items, then correct with Spearman-Brown)
§ KR-20/21 (use for dichotomously scored items); coefficient alpha (Cronbach’s alpha, use for multiple-scored items), which equals the average of all possible split-half coefficients
o Inter-rater reliability: kappa
§ Tries to reduce subjectivity
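
As an illustration of the internal-consistency idea above, here is a minimal Python sketch of coefficient alpha. The function name `cronbach_alpha` and the data layout (one row per examinee, one column per item) are assumptions made for this example, not part of the original card. With dichotomous (0/1) items and population variances, the same computation gives KR-20.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Coefficient alpha for a score matrix.

    scores: list of rows, one row per examinee, one item score per column
    (hypothetical layout chosen for this sketch).
    """
    k = len(scores[0])                    # number of items
    items = list(zip(*scores))            # transpose: one tuple per item
    sum_item_vars = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)

# Two items that rank examinees identically are perfectly consistent.
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
```

The function assumes at least two items and some variance in total scores; it does no input checking.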

2
Q

Factors affecting the reliability coefficient

A

§ Number of items: more items, more reliable
§ More homogeneous items, more reliable (similar content)
§ Unrestricted range of scores gives more reliability; achieved by having diverse (heterogeneous) subjects (high, medium, and low scorers) and mid-range difficulty items
§ Lower reliability if items are easy to guess (e.g., true/false)
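
The effect of test length on reliability can be estimated with the Spearman-Brown prophecy formula. The sketch below is illustrative only: the function name `spearman_brown` is an assumption, and the formula presumes the added items are comparable in content and quality to the existing ones.

```python
def spearman_brown(rxx, length_factor):
    """Predicted reliability when a test is lengthened (or shortened).

    rxx: reliability of the current test
    length_factor: new length divided by current length (2 = doubled)
    """
    return (length_factor * rxx) / (1 + (length_factor - 1) * rxx)

# Doubling a test with rxx = 0.6 (this is also the split-half correction).
print(round(spearman_brown(0.6, 2), 2))  # 0.75
```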

3
Q

Cohen’s d effect size

A

0.2 is small, 0.5 is medium, 0.8 is large
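
For reference, a minimal sketch of how d is commonly computed for two independent groups, using the pooled standard deviation. The function names and the label for values under 0.2 are assumptions for this example; the card itself gives only the three benchmarks.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled sample SD."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) +
                  (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)

def effect_size_label(d):
    """Map |d| onto the benchmarks from the card."""
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "below small"  # label chosen for this sketch

d = cohens_d([2, 4, 6], [1, 3, 5])   # means 4 and 3, pooled SD 2
print(d, effect_size_label(d))
```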

4
Q

Standard error of measurement

A

Deals with reliability: an index of the amount of error in a score that is due to the unreliability of the test. It helps establish the CI range around a measured score.

SEM = SD × √(1 - rxx)

The lower the SD and the higher the rxx, the lower the SEM.

When rxx is 1, SEM = 0; when rxx is 0, SEM = SD.

68% CI: score ± 1 SEM
95% CI: score ± 2 SEM (approximately)
99% CI: score ± 3 SEM (approximately)
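
A short worked sketch of the SEM and the confidence interval built from it. The function names and the example values (SD = 15, rxx = 0.84, observed score 100) are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import sqrt

def sem(sd, rxx):
    """Standard error of measurement: SD * sqrt(1 - rxx)."""
    return sd * sqrt(1 - rxx)

def confidence_interval(score, sd, rxx, n_sem=2):
    """Band of n_sem SEMs around an observed score (2 is roughly 95%)."""
    margin = n_sem * sem(sd, rxx)
    return (score - margin, score + margin)

print(round(sem(15, 0.84), 2))                       # 6.0
low, high = confidence_interval(100, 15, 0.84)
print(round(low, 1), round(high, 1))                 # 88.0 112.0
```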

5
Q

Content Validity

A

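Used when a test measures familiarity with a particular content or behavior (bx) domain; asks whether the items adequately sample that domain.

Established with expert opinion, e.g., the content validity ratio (CVR).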
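
The card does not give a formula for the content validity ratio. A common one is Lawshe's, sketched below on the assumption that this is the version intended: each expert rates an item as essential or not, and the ratio compares the number of "essential" ratings with half the panel size. The function name is hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to +1."""
    half = n_experts / 2
    return (n_essential - half) / half

# 8 of 10 experts rate the item as essential.
print(content_validity_ratio(8, 10))  # 0.6
```

A value of 0 means exactly half the panel rated the item essential; 1.0 means the whole panel did.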

6
Q

Construct Validity

A

Used when a test measures a hypothetical trait (construct).

Evidence comes from:
convergent and discriminant validity
factor analysis
multitrait-multimethod matrix

7
Q

Criterion-Related Validity

A

Used to estimate or predict standing or performance on a criterion.

Two types: concurrent validity (predictor and criterion measured at about the same time; estimates current status) and predictive validity (criterion measured later; makes predictions).

Interpreting criterion-related validity coefficients:

Shared variability is found by squaring the coefficient.

rxy = 0.6 becomes 0.36: 36% of the variability in criterion scores is accounted for by the predictor.

8
Q

Standard Error of the Estimate (SEE)

A

Used to construct a CI around a predicted (estimated) criterion score. Tied to criterion-related validity: the higher the validity coefficient, the smaller the SEE.
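
The SEE follows the same pattern as the SEM, with the criterion's SD and the squared validity coefficient in place of the test's SD and rxx. A minimal sketch, with hypothetical function names and example values:

```python
from math import sqrt

def see(sd_y, rxy):
    """Standard error of the estimate: SD_y * sqrt(1 - rxy**2)."""
    return sd_y * sqrt(1 - rxy ** 2)

def prediction_interval(predicted, sd_y, rxy, n_see=2):
    """Band of n_see SEEs around a predicted criterion score."""
    margin = n_see * see(sd_y, rxy)
    return (predicted - margin, predicted + margin)

# Criterion SD of 10 and a validity coefficient of 0.6.
print(round(see(10, 0.6), 2))  # 8.0
```

With rxy = 0 the SEE equals the criterion's SD, and with rxy = 1 (or -1) it is 0.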

9
Q

Taylor-Russell Tables

A

Tables used to find the probability of selecting a successful employee with a given measure (true positives, good hires, or satisfactory performers). Example: 80% of new hires selected with the new measure will be satisfactory performers.
You will need the following:
o Criterion-related validity coefficient: correlation between scores on the measure and job performance
o Base rate: rate of success before the predictor is introduced (0 to 1)
o Selection ratio: number of openings in the company divided by the number of applicants
Incremental validity is the improvement in the success rate that results from adding the predictor test.
o Moderate base rates optimize incremental validity
o A low selection ratio optimizes incremental validity (1 opening, many applicants)
o Incremental validity is optimized when the base rate is 0.5 and the selection ratio is 0.1

10
Q

Correction for attenuation

A

Correction for attenuation adjusts for the effect of unreliability on validity. Reliability is never perfect, but the correction estimates how much higher the validity coefficient would be if the predictor and/or criterion were perfectly reliable.
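
A minimal sketch of the usual correction formula, which divides the observed validity coefficient by the square root of the product of the two reliabilities. The function name and the default of 1.0 for the criterion's reliability (that is, correcting for the predictor only) are assumptions for this example.

```python
from math import sqrt

def correct_for_attenuation(rxy, rxx, ryy=1.0):
    """Estimated validity if predictor (rxx) and criterion (ryy) were perfectly reliable.

    Leave ryy at 1.0 to correct for unreliability in the predictor only.
    """
    return rxy / sqrt(rxx * ryy)

# Observed validity 0.4, predictor reliability 0.64.
print(round(correct_for_attenuation(0.4, 0.64), 2))  # 0.5
```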

11
Q

ipsative measures

A

ipsative measures (/ˈɪpsətɪv/; from Latin: ipse, ‘of the self’) are those where respondents compare two or more desirable options and pick the one they prefer most.

12
Q

Ceiling and Floor Effects

A

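Occur when a measure doesn’t include an adequate range of items at the extremes. E.g., a floor effect occurs when a test is unusually difficult and many test-takers score at or near the bottom of the scale; a ceiling effect occurs when a test is too easy and many score at or near the top.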

13
Q

Cross-Validation

A

A test is often revalidated with a sample of individuals different from the original validation sample. “Shrinkage” refers to the reduction in a criterion-related validity coefficient upon cross-validation. Shrinkage is greatest if the original validation sample is small, the original item pool is large, the number of items retained is small relative to the number of items in the item pool, and/or items are not chosen on the basis of a previously formulated hypothesis.
