Chapter 5 Flashcards

(44 cards)

1
Q

Reliability (def)

A

consistency in measurement

2
Q

Reliability coefficient

A

A statistic, ranging from 0 to 1, that quantifies a test's reliability.

3
Q

4 types of reliability coefficients

A

1) test-retest reliability

2) alternate-forms reliability

3) split-half reliability

4) inter-scorer reliability

4
Q

Measurement error (textbook def)

A

Inherent uncertainty with any measurement, even after minimizing preventable mistakes

5
Q

2 influences that interfere with repeated measurement (in psych)

A

1) changes in the object being measured (e.g., a constant flux of mood, alertness, motivation)

2) the act of measurement itself (i.e., carryover effects such as fatigue and practice)

6
Q

“True Score”

A

Not actually "true" in the sense of the underlying concept: a true score is tied to the specific measurement instrument used.

7
Q

What ‘score’ measures the truth independent of measurement?

A

Construct score.

The underlying level of some construct (e.g., depression), independent of any particular measurement instrument.

8
Q

Total variance is made up of what two subtypes of variance?

A

True variance (variance from actual differences between test takers) + error variance (random variance irrelevant to what the test measures)

9
Q

Define reliability in terms of variance

A

Proportion of total variance attributed to true variance
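A minimal formula sketch of this definition (notation assumed, not from the deck: r_xx for the reliability coefficient, σ² for variance):

```latex
r_{xx} = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{total}}}
       = 1 - \frac{\sigma^2_{\text{error}}}{\sigma^2_{\text{total}}}
```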

10
Q

Random vs. Systematic Error

A

Random: unpredictable, inconsistent, without pattern

Systematic: predictable, constant, can be adjusted for

11
Q

Bias (in error)

A

The degree of systematic error that influences measurement

12
Q

How does item/content sampling contribute to error variance?

A

The specific items sampled for a test can affect the results (e.g., a test taker thinking "I hope they ask this question and not that one").

13
Q

What test administration effects contribute to error variance?

A

Environmental variables: external events (e.g., war), heat, distractions such as gum chewing or a broken pencil, etc.

Test-taker variables: lack of sleep, emotional state, drugs, etc.

Examiner-related variables: physical appearance, presence or absence during testing, etc.

14
Q

How does test scoring and interpretation contribute to error variance?

A

Some subjectivity in certain tests (eg. essays, creativity, etc.) can influence measurement.

15
Q

test-retest reliability coefficient is also called what?

A

Coefficient of stability

16
Q

What might affect test-retest reliability estimates?

A

Experience, practice, memory, fatigue, etc. may intervene.

17
Q

What is the name of the coefficient for alternate-forms/parallel-forms reliability estimates?

A

Coefficient of equivalence

18
Q

Parallel vs. Alternate forms reliability

A

Parallel forms: the means and variances of observed test scores are equal across forms

Alternate forms: different versions of the same test designed to be equivalent, but that do not meet the strict requirements of parallel forms

19
Q

2 similarities between parallel/alternate and test-retest reliability

A

1) two test administrations with the same group

2) test scores can be affected by factors like fatigue, practice, learning, etc.

20
Q

What additional source of error variance is present in alternate/parallel-forms reliability?

A

Item/Content sampling

21
Q

Split-half reliability

A

Correlating two pairs of scores obtained from equivalent halves of a single test administered once.

Compute the Pearson r between one half of the test and the other half, then adjust it with the Spearman-Brown formula.
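A minimal Python sketch of the procedure, assuming an items-by-persons score matrix; the function and variable names are illustrative, not from the textbook:

```python
import numpy as np

def split_half_reliability(scores):
    """Estimate split-half reliability from a (n_persons, n_items) score matrix.

    Splits the test into odd- vs. even-numbered items, correlates the two
    half-test totals (Pearson r), then applies the Spearman-Brown correction.
    """
    scores = np.asarray(scores, dtype=float)
    odd_half = scores[:, 0::2].sum(axis=1)   # items 1, 3, 5, ...
    even_half = scores[:, 1::2].sum(axis=1)  # items 2, 4, 6, ...
    r_half = np.corrcoef(odd_half, even_half)[0, 1]  # half-test correlation
    return (2 * r_half) / (1 + r_half)               # Spearman-Brown adjusted

# Example: 5 test takers, 6 dichotomously scored items (illustrative data)
scores = [[1, 1, 0, 1, 1, 0],
          [1, 0, 0, 1, 0, 0],
          [1, 1, 1, 1, 1, 1],
          [0, 0, 0, 1, 0, 0],
          [1, 1, 1, 0, 1, 1]]
print(split_half_reliability(scores))
```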

22
Q

Odd-even reliability

A

Split-half reliability computed by splitting the test into odd-numbered vs. even-numbered items.

23
Q

How does the number of items affect the reliability coefficient? What method can estimate how many items are needed?

A

The Spearman-Brown formula.

All else being equal, more items means higher reliability.
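A sketch of the general Spearman-Brown formula (notation assumed: r_xx is the reliability of the existing test, n is the factor by which test length changes, r_SB is the predicted reliability):

```latex
r_{SB} = \frac{n \, r_{xx}}{1 + (n - 1) \, r_{xx}}
```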

24
Q

Which coefficient is used for inter-item consistency?

A

Coefficient alpha
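A minimal Python sketch for computing coefficient alpha from an items-by-persons score matrix (the function name and data are illustrative, not from the textbook):

```python
import numpy as np

def coefficient_alpha(scores):
    """Cronbach's coefficient alpha from a (n_persons, n_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                              # number of items
    item_variances = scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Example: 4 test takers rating 3 items on a 1-5 scale (illustrative data)
print(coefficient_alpha([[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3]]))
```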

25

Q

Inter-scorer reliability. Which coefficient?

A

The degree of consistency between two or more scorers. Its coefficient is the coefficient of inter-scorer reliability.

26

Q

DSM-5 inter-rater reliability

A

Kappa = 0.44 (a fair level of agreement, moderately greater than chance)
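A minimal sketch of computing kappa for two raters, using scikit-learn (the library choice and the data are assumptions for illustration; any kappa implementation works):

```python
from sklearn.metrics import cohen_kappa_score

# Diagnoses assigned by two raters to the same five cases (illustrative data)
rater_1 = ["MDD", "GAD", "MDD", "PTSD", "GAD"]
rater_2 = ["MDD", "MDD", "MDD", "PTSD", "GAD"]

print(cohen_kappa_score(rater_1, rater_2))  # chance-corrected agreement, -1 to 1
```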
27

Q

Transient error

A

Error due to changes in the test taker's feelings, moods, or mental state over time

28

Q

Homogeneity vs. heterogeneity of test items

A

Homogeneous: functionally uniform items that measure a single factor (e.g., one ability or trait); high internal consistency is expected.

Heterogeneous: the test measures more than one factor.

29

Q

Does high internal consistency mean homogeneity of items?

A

Not necessarily. A larger number of items will produce a high internal-consistency coefficient as long as the items are positively correlated.

30

Q

Dynamic vs. static characteristics

A

Dynamic: presumed to be relatively situational and changing

Static: presumed to be relatively unchanging

31

Q

Restriction/inflation of range

A

When the variance of a sample (e.g., a particular subgroup) is restricted or inflated relative to the population, the resulting correlation coefficient is correspondingly lowered or raised.

32

Q

Power test

A

Enough time is allowed to attempt all items, but the items are so difficult that no one obtains a perfect score.

33

Q

Speed test

A

Items are of uniform (low) difficulty, so test takers could answer everything correctly given unlimited time; the time limit means only some complete the whole test.

34

Q

How do the assumptions of CTT and IRT differ (broadly)?

A

CTT assumptions are weak and easily met; IRT assumptions are rigorous.

35

Q

Domain sampling theory

A

Reliability is based on how well a score on a sample of items estimates the domain from which the sample was drawn.

36

Q

What is the universe score in generalizability theory?

A

The analogue of the true score: given the same conditions (the same universe), the same score would be obtained.

37

Q

Generalizability study / coefficient of generalizability

A

Examines how generalizable scores from a particular test are when the test is administered in different situations; the coefficient of generalizability expresses this.

38

Q

Decision study

A

Examines the usefulness of test scores in helping the test user make decisions; it follows a generalizability study.

39

Q

Another way to say item response theory

A

Latent-trait theory

40

Q

Within CTT, what weight is assigned to each item on a test?

A

Equal weight. In IRT, items are weighted differentially.

41

Q

Dichotomous test items

A

Items that can be answered with only one of two responses

42

Q

Polytomous test items

A

Items with three or more alternative responses

43

Q

Rasch model

A

A type of IRT model with specific assumptions about the underlying distribution.
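A sketch of the standard one-parameter logistic (Rasch) form, with notation assumed rather than taken from the deck (θ is the test taker's ability, b is the item's difficulty):

```latex
P(X = 1 \mid \theta, b) = \frac{e^{\theta - b}}{1 + e^{\theta - b}}
```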
44

Q

Which measure is used to compare differences between scores?

A

Standard error of the difference
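A sketch of the usual formula, assuming both scores come from tests with the same standard deviation σ and with reliability coefficients r_1 and r_2:

```latex
\sigma_{\text{diff}} = \sqrt{\sigma^2_{\text{meas}_1} + \sigma^2_{\text{meas}_2}}
                     = \sigma \sqrt{2 - r_1 - r_2}
```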