Prelim 2 prep Flashcards

1
Q

What are the differences between True Score Theory, Generalizability Theory, and Item Response Theory?

A
2
Q

What is the standard error of measurement?

A
3
Q

What are confidence intervals and what do they tell us?

A
4
Q

When a confidence interval increases in percentage (e.g., 90% vs. 95%), what happens to the range of scores it comprises?

A
5
Q

What does it mean if a test is valid?

A
6
Q

What are the three main categories of validity?

A
7
Q

What are the following: content validity, criterion-related validity, construct validity, ecological validity, external validity, and face validity?

A
8
Q

What is the content validity ratio and how is it used to determine content validity of test items?

A

If more than half of a panel of expert judges rate an item as essential, the item has at least some content validity. The CVR is 0 when exactly half the panel rates the item essential, positive when more than half do, and negative when fewer than half do.
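
In symbols (Lawshe's content validity ratio), where N is the total number of panelists and n_e is the number who rate the item "essential":

CVR = (n_e - N/2) / (N/2)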

9
Q

Name the three characteristics of a criterion

A

Relevant, valid, uncontaminated

10
Q

What does it mean for a criterion to be uncontaminated?

A

Independent of the test- e.g., an independent group of raters decides who is good and who isn't, and those ratings are then correlated with the test scores. Contamination occurs when the criterion is itself based, in whole or in part, on the predictor (test) scores

11
Q

Define concurrent validity and predictive validity

A

Concurrent- degree to which a test score is related to some criterion measure obtained at the same time
Predictive- degree to which a test score predicts some criterion measure obtained at a future time

12
Q

What are false negatives, false positives, specificity, and sensitivity?

A

False negative- the test indicates someone doesn't possess a trait when they actually do
False positive- the test indicates someone has a trait when they actually don't
Specificity- the ability to avoid false positives; a perfectly specific test would never mistakenly identify someone as having a trait they don't have
Sensitivity- the ability to avoid false negatives; a perfectly sensitive test would identify all the people who actually have the trait
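
In terms of the counts from a 2x2 decision table (TP = true positives, FP = false positives, TN = true negatives, FN = false negatives):

Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)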

13
Q

What is incremental validity and what would be proof of its existence?

A

The extent to which adding a second or third predictor gives more information about a criterion beyond what the existing predictor(s) already provide
Proof- a gain in predictive accuracy (e.g., additional criterion variance explained) when the new predictor is added

14
Q

What is construct validity?

A

Extent to which a test measures the construct it is intended to measure

15
Q

Name and describe several of the ways in which you can find evidence for construct validity.

A
  • The test is homogeneous- its items hang together to measure a single construct
  • Evidence that scores change with age (when the construct is expected to develop with age)
  • Test scores change with experience or experimental manipulation
  • Distinct groups score differently
  • Convergent evidence- scores correlate with another test measuring the same construct
16
Q

What is the difference between convergent and concurrent validity?

A

Convergent- evidence for construct validity; scores correlate with another test measuring the same (or a related) construct
Concurrent- a form of criterion-related validity; scores relate to a criterion measure obtained at the same time
17
Q

What is a factor analysis and how does an exploratory factor analysis differ from a confirmatory
one?

A

Factor analysis- a class of statistical techniques used to identify the underlying factors (latent variables) that account for the correlations among test items or scores
Exploratory- estimating or extracting the factors, deciding how many to retain, and rotating them to an interpretable orientation, with no model specified in advance
Confirmatory- testing the degree to which a pre-specified, hypothesized factor model fits the data
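
A minimal sketch of the exploratory side in Python, assuming scikit-learn is installed (the data here are just random numbers for illustration; real use would start from actual item responses):

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))  # fake responses: 200 test-takers x 6 items

# Extract two factors and rotate to an interpretable orientation.
fa = FactorAnalysis(n_components=2, rotation="varimax")
fa.fit(X)
print(fa.components_)  # loadings: how strongly each item loads on each factor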

18
Q

Name and define the different types of rating error that can occur

A

Leniency error- arises from a tendency on the part of the rater to rate too leniently
Severity error- the opposite; a tendency to rate too harshly
Central tendency error- the rater avoids the extreme ends of the scale, clustering ratings around the middle
Halo effect- the rater gives a higher rating than warranted because of a generally favorable impression of the ratee

19
Q

What is test utility?

A

The usefulness or practical value of testing- how much using a test in a particular situation improves efficiency and helps us make better decisions

20
Q

What are some of the costs of administering a test, and what are some costs of NOT
administering one?

A

Administering:
- purchasing the test
- a supply of blank test protocols
- a computer program to score the test
- paying scorers to score the test
- hiring people to administer the test
- general costs of doing business

Not administering:
- loss of public confidence in the company as an ultimate cost
- missing a child abuser in a screening context
- failing to diagnose a condition because the person underreports symptoms in an interview

21
Q

Keep in mind the real-life example I discussed about how to think about the cost of testing when
doing evaluations.

A

????

22
Q

What are the Taylor-Russell tables used for, and what three variables are considered when using them to decide if giving a test is "worth it"?

A
Used to estimate the extent to which adding a particular test to the selection process will improve selection- that is, the proportion of people hired who will be successful. The three variables are the test's criterion-related validity coefficient, the selection ratio (the proportion of applicants who will be hired), and the base rate (the proportion of people who would be judged successful without use of the test).
23
Q

Be able to name a few other tables (e.g., Naylor-Shine) and have a basic sense of how they work (they could be a multiple-choice option, for instance)

A

Naylor-Shine tables- entered with the test's validity coefficient and the selection ratio to estimate the increase in the average criterion score of the people selected when the test is used. The Brogden-Cronbach-Gleser formula goes a step further and expresses the utility of testing in dollar (cost-benefit) terms.

24
Q

Name some different ways cut scores are determined.

A

The Angoff method (experts judge the probability that a minimally competent person would answer each item correctly), the method of contrasting (known) groups (comparing the score distributions of groups known to differ on the trait), IRT-based methods, and discriminant analysis.

25
Q

What’s the difference between a fixed and relative cut score?

A

Relative (norm-referenced)- the actual score you need to meet a criterion changes with the performance of the group being tested
Fixed (absolute)- the cut score is always the same

26
Q

What is pilot work and why is it used?

A

Preliminary research surrounding the creation of a prototype of the test- test items are tried out, evaluated, and revised

27
Q

Name some ways scales are graded.

A

Age-based
Grade-based
Unidimensional vs. multidimensional
Categorical vs. dimensional

28
Q

What are some scaling methods? Remember, they can overlap- a categorical scale can also be graded "summatively," etc.

A

Rating scales
Summative (e.g., Likert) scales
Method of paired comparisons
Sorting tasks
Categorical scaling
Guttman scales

29
Q

What is the empirical vs analytical way of writing test items?

A

Analytical- write test questions that you think, on theoretical grounds, will measure the qualities you want to measure
Empirical- find people with the problem (and people without it), ask many different types of questions, and keep the items the groups answer differently

30
Q

Why would we want to find seemingly arbitrary items for distinguishing one group from another (in other words, non-face-valid ones)?

A

Empirically keyed items that are not face-valid are hard to fake- because the purpose of the item is not obvious, test-takers cannot easily slant their answers, yet the items still discriminate between the groups.

31
Q

Name some different ways items can be formatted.

A

Selected-response formats (e.g., multiple choice, matching, true/false)
Constructed-response formats (e.g., completion, short answer, essay)
Computerized adaptive testing

32
Q

What is computerized adaptive testing, and how does item branching work?

A

Computerized adaptive testing- the computer selects each successive item interactively, based on the test-taker's performance on the items already answered
Item branching- after a correct response the test branches to a harder item, and after an incorrect response to an easier one, individualizing the test's difficulty
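
A toy sketch of the branching rule in Python (the function name and step size are invented for illustration):

def next_item_difficulty(current, answered_correctly, step=1):
    # Branch up (a harder item) after a correct answer,
    # branch down (an easier item) after an incorrect one.
    return current + step if answered_correctly else current - step

print(next_item_difficulty(5, True))   # 6: harder item next
print(next_item_difficulty(5, False))  # 4: easier item next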

33
Q

Name and describe a few different ways in which items can be scored

A

Cumulative scoring- the higher the total score, the higher the test-taker stands on the trait being measured
Class/category scoring- the pattern of responses places the test-taker into a particular class or category
Ipsative scoring- comparing a test-taker's score on one scale to their own score on another scale within the same test

34
Q

What are the following: item-difficulty index, item-endorsement index, item-reliability index, and item-discrimination index?

A

Item-difficulty index- the proportion of test-takers who answer the item correctly; a higher value means an easier item
Item-endorsement index- the analogue for non-ability tests; the proportion of test-takers who endorse the item
Item-reliability index- an indication of an item's internal consistency; the item's standard deviation multiplied by its item-total correlation
Item-discrimination index- how well the item separates high scorers from low scorers on the test as a whole
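
In symbols (U and L are the numbers of test-takers in the upper- and lower-scoring groups who answered the item correctly, and n is the number of test-takers in each group, assuming equal group sizes):

item difficulty:      p = (number answering correctly) / (total number of test-takers)
item discrimination:  d = (U - L) / n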

35
Q

How are item characteristic curves useful?

A

A graph plotting the probability of answering an item correctly (or endorsing it) against the test-taker's level on the underlying trait. The curve's location along the trait axis shows the item's difficulty, and its slope shows how well the item discriminates- steeper is better.
