Weeks 11 and 12: Reliability and Validity Flashcards

1
Q

Define reliability

A

Consistency in measurement

2
Q

List 3 ways consistency of scores can be assessed when re-examining the same people

A
  • the same test on different occasions
  • a different set of items measuring the same thing
  • different conditions of testing
3
Q

What is standard error of measurement?

A

An estimate of the amount of error usually attached to an examinee’s obtained score
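
The standard formula, not stated on the card, is SEM = SD × √(1 − r), where r is the test's reliability coefficient. A minimal sketch with made-up numbers:

```python
import math

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SEM = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical IQ-style scale: SD = 15, reliability r = .90
print(round(sem(15, 0.90), 2))  # -> 4.74
```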

4
Q

What is a confidence interval

A

A range of values within which you have a stated level of confidence that the quantity being estimated (e.g. the population mean, or an examinee's true score) lies
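
In the testing context this is usually a band around an obtained score, built from the SEM. A minimal sketch, assuming the hypothetical SEM from the previous card:

```python
def confidence_interval(score: float, sem: float, z: float = 1.96):
    """95% confidence interval around an obtained score: score +/- z * SEM."""
    return score - z * sem, score + z * sem

# Hypothetical obtained score of 100 with SEM = 4.74
print(confidence_interval(100, 4.74))  # -> roughly (90.7, 109.3)
```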

5
Q

What are some sources of random error?

A
  • test construction
  • test administration
  • test scoring and interpretation
6
Q

List the ways of testing reliability

A
  • Cronbach's alpha
  • test-retest
  • split-half
  • item-total correlations
7
Q

How big should a reliability coefficient be?

A

Above .8, preferably .9

8
Q

What does Cronbach's alpha measure?

A

The average of all possible split-half correlations between test items; an index of internal consistency
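
The usual computational form is α = k/(k−1) × (1 − Σ item variances / total-score variance). A minimal sketch, assuming a respondents-by-items score matrix:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha; items is a 2D array (rows = people, cols = items)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```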

9
Q

What is split-half reliability?

A

Splitting the items in half and correlating scores on one half with scores on the other half (see the sketch below)
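
A common recipe is an odd-even split; the half-test correlation is then conventionally stepped up with the Spearman-Brown formula (not covered on the card). A sketch under those assumptions:

```python
import numpy as np
from scipy.stats import pearsonr

def split_half(items):
    """Correlate totals on odd-numbered items with totals on even-numbered items."""
    items = np.asarray(items, dtype=float)
    odd, even = items[:, 0::2].sum(axis=1), items[:, 1::2].sum(axis=1)
    r, _ = pearsonr(odd, even)
    return (2 * r) / (1 + r)  # Spearman-Brown step-up to full-test length
```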

10
Q

What are item-total correlations?

A

Correlating each item with the total score of the remaining items on the scale (see the sketch below)
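
A minimal sketch of the corrected version, where each item is correlated with the sum of all the other items:

```python
import numpy as np

def corrected_item_total(items):
    """Correlation of each item with the total of the remaining items."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return [np.corrcoef(items[:, j], total - items[:, j])[0, 1]
            for j in range(items.shape[1])]
```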

11
Q

What is test-retest reliability

A
  • the correlation between scores from two testing occasions
  • indexes stability over time
  • typically uses Pearson's r (see the sketch below)
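
A minimal sketch using Pearson's r, with hypothetical scores for the same five people on two occasions:

```python
from scipy.stats import pearsonr

time1 = [12, 18, 25, 9, 30]   # hypothetical scores at occasion 1
time2 = [14, 17, 27, 10, 28]  # same people, re-tested later
r, p = pearsonr(time1, time2)
print(f"test-retest r = {r:.2f}")
```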
12
Q

What are some problems with test-retest reliability

A
  • affected by factors associated with how the test is administered on each occasion
  • carryover effects: remembering answers, practice effects
  • should only be used for constructs expected to be stable over time
13
Q

Internal consistency

A

The correlations between different items on the same test, or with the entire test

14
Q

Kuder-Richardson reliability and coefficient alpha

A
  • based on the intercorrelations among all comparable parts of the test
15
Q

Kuder-Richardson Formula 20 (KR-20)

A
  • calculated from the proportion of people who pass and fail each item and the variance of the total test scores (see the sketch below)
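
A minimal sketch, assuming a 0/1 pass-fail item matrix:

```python
import numpy as np

def kr20(items):
    """KR-20 for dichotomous items (rows = examinees, 1 = pass, 0 = fail)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    p = items.mean(axis=0)                     # proportion passing each item
    q = 1 - p                                  # proportion failing each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)
```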
16
Q

Inter-rater reliability

A
  • agreement between multiple raters
  • measured using a kappa statistic

17
Q

Kappa statistic

A

Measures inter-rater agreement for qualitative (categorical) items
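
Cohen's kappa is the standard two-rater version; a minimal sketch with made-up ratings, using scikit-learn:

```python
from sklearn.metrics import cohen_kappa_score

rater_a = ["yes", "no", "yes", "yes", "no", "yes"]  # hypothetical ratings
rater_b = ["yes", "no", "no", "yes", "no", "yes"]
print(cohen_kappa_score(rater_a, rater_b))  # agreement corrected for chance
```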

18
Q

Parallel-forms reliability

A

Equivalent forms of the same test are administered to the same group

19
Q

Types of reliability

A
  • inter-rater
  • test-retest
  • split-half
  • parallel forms
20
Q

Validity

A

The extent to which a test measures what it is supposed to measure

21
Q

What are the three types of validity

A
  • content
  • criterion-related
  • construct
22
Q

Content validity

A

Degree to which the content (items) represents the behaviours/characteristics associated with that trait

23
Q

What are the two types of criterion validity

A

Predictive and concurrent

24
Q

What is criterion validity

A

The relationship between test scores and some type of criterion or outcome, such as ratings, classifications or other test scores

25
Q

Concurrent validity

A

Refers to whether the test scores are related to some CURRENTLY AVAILABLE criterion measure

26
Q

Predictive validity

A

The correlation between a test and criterion obtained at a FUTURE time e.g. ATAR scores predicting success at uni

27
Q

Validity coefficient

A

Correlation between test scores and some criterion

28
Q

What are the two types of construct validity?

A

Convergent and discriminant

29
Q

Construct validity

A

The extent to which a test measures a psychological construct or trait

30
Q

Convergent validity

A

Convergent validity takes two measures that are supposed to be measuring the same construct and shows that they are related.

31
Q

Discriminant validity

A

Discriminant validity shows that two measures that are not supposed to be related are, in fact, unrelated.
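
A toy check covering both this card and the previous one, with hypothetical scale totals for the same six people:

```python
from scipy.stats import pearsonr

depression_a = [10, 14, 8, 20, 16, 12]  # depression scale A
depression_b = [11, 15, 9, 18, 17, 13]  # depression scale B (same construct)
shoe_size    = [10, 10, 8, 8, 10, 8]    # unrelated construct

print(pearsonr(depression_a, depression_b)[0])  # high r: convergent evidence
print(pearsonr(depression_a, shoe_size)[0])     # near-zero r: discriminant evidence
```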

32
Q

List the types of reliability

A
  • test-retest
  • internal consistency
  • inter-rater
33
Q

In test-retest reliability, what are some sources that might affect a result?

A
  • time
  • place
  • mood
  • temperature
  • noise
34
Q

What are some core issues with content validity?

A
  • the appropriateness of the questions and domain relevance
  • comprehensiveness
  • level of mastery assessed
35
Q

What are some procedures to ensure content validity?

A
  • specialist panels to map content domain
  • accurate test specifications
  • communication of validation procedures in test manual
36
Q

What are some applications of content validity?

A
  • achievement and occupational tests
  • usually not appropriate for personality or aptitude tests

37
Q

What is standard error

A

The standard deviation of a statistic's sampling distribution (e.g. of sample means across repeated samples from the population)

38
Q

Do we want small or large SEM

A

Small, because a larger SEM reflects lower reliability and produces wider confidence intervals

39
Q

Which confidence level is most common?

A

z = 1.96 (95%)
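
The 1.96 comes from the normal distribution; a one-liner to confirm it:

```python
from scipy.stats import norm

print(norm.ppf(0.975))  # two-tailed 95% critical value -> ~1.96
```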

40
Q

Why are confidence intervals better than p-values

A
  • the conventional p < .05 cutoff is an arbitrary threshold
  • p-values are sensitive to sample size: very large samples can make trivial effects significant
  • p-values can miss small effects that recur consistently
41
Q

When is a confidence interval result significant

A

When the confidence interval does not include 0

42
Q

Why are effect sizes beneficial?

A

They flag statistically significant effects that may not mean much in real life, e.g. does someone scoring .5 higher on a depression scale really have a worse time?

43
Q

How can test scoring and interpretation be a source of random error (reliability)?

A

Because projective tests (e.g. the TAT and Rorschach) are answered differently by everyone, there is a large role for inter-rater disagreement

44
Q

What is the domain sampling model?

A

Test items represent a sample of all possible items

45
Q

What is the reliability ratio

A

The variance of the true score divided by the variance of the observed score (see the formula below)
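
In classical test theory notation, assuming observed variance decomposes into true-score plus error variance:

$$ r_{XX} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2} $$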

46
Q

How many items should you have for optimal reliability

A

10

47
Q

List some examples of concurrent validity

A
  • depression scale and clinical interview
  • 2 measures at a similar time
  • IQ and exam scores
48
Q

To be concurrently valid, what kind of assessment should the measure be correlated with?

A

The gold standard

49
Q

What kind of test do you use for predictive validity

A

Multivariate ANOVA

50
Q

What test do you use for convergent validity

A

Factor analysis

51
Q

The lower the reliability, the…

A

Higher the error in a test

52
Q

The larger the standard error of measurement, the…

A

The less precise the measurements and the wider the confidence intervals