Validity and Reliability Flashcards

1
Q

What are the three aspects of reliability?

A

stability, internal consistency and interrater agreement

2
Q

What is stability?

A

Test-retest reliability: the extent to which the same results are obtained on repeated applications

3
Q

What is internal consistency?

A

the extent to which all of the items of the measure address the same underlying concept. Items that purport to measure the same general construct should produce similar scores.

4
Q

What is interrater agreement?

A

The extent to which the results are in agreement when different individuals administer the same instrument to the same individuals/groups

5
Q

What are the two types of interrater reliability?

A
  • intra-rater: indicates how consistently a rater administers and scores a measure
  • inter-rater: how well two or more raters agree in the way they administer and score the measure
6
Q

What is validity?

A

The degree to which an instrument measures what it intends to measure

7
Q

What are the types of validity?

A

face, content, consensual, criterion, construct and predictive

8
Q

What is face validity?

A

The relevance of the measurement: do the questions yield information relevant to the topic being investigated? The perceived relevance to the test taker.

9
Q

What may indicate poor face validity?

A

many ‘don’t know’ answers in a questionnaire

10
Q

What is content validity?

A

The extent to which a measure represents all facets of a given phenomenon or covers the whole concept

11
Q

What is consensual validity?

A

a number of experts agree that the measure is valid

12
Q

What is criterion validity?

A

The extent to which the test agrees with a gold-standard test known to represent the phenomenon accurately; it comprises concurrent and predictive validity.

13
Q

What is predictive validity?

A

The extent to which the measurement can predict what may occur in the future

14
Q

What is construct validity?

A

The extent to which an assessment measures a theoretical construct

15
Q

What are the types of construct validity?

A
  • convergent: scale should be related to variables and other measures of the same construct
  • discriminative: demonstrates that it discriminates between groups and individuals
  • factorial: items go together to create factors (correlation between test and major factors)
  • discriminant: new test should not correlate with dissimilar, unrelated constructs
16
Q

What is external validity?

A

The tool’s generalisability to other settings

17
Q

What research evidence is needed for construct validity?

A
  • form a hypothesis about the relationship between variables
  • select test items or behaviors that represent the construct
  • collect data to test the hypothesis
  • determine whether the data support the hypothesis
18
Q

What is a Rasch analysis used for?

A

determining the unidimensionality of test items

19
Q

What are the two types of criterion validity?

A
  • concurrent: with an established measure
  • predictive: with a future outcome

20
Q

What research evidence is needed for predictive validity?

A
  • identify criterion behavior and population sample
  • administer the test and retain the scores until criterion data are available
  • obtain measure of performance on each criterion
  • determine strength of relationship
21
Q

When is a measure reliable?

A

If it is stable over time, across different examiners and across different forms of the measure

22
Q

What are the different types of reliability?

A
  • test-retest (stability)
  • internal consistency (homogeneity)
  • inter-rater (agreement)
  • intra-rater (agreement)
  • parallel form (agreement)
23
Q

What are correlation statistics?

A

descriptive measures that show direction (positive and negative) and degree (how strong) of the relationship between two variables
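
A minimal sketch of how direction and degree can be read off a correlation coefficient, assuming NumPy is available and using made-up paired scores:

    import numpy as np

    # hypothetical paired scores from two administrations of a measure
    x = np.array([10, 12, 15, 18, 20])
    y = np.array([11, 13, 14, 19, 21])

    r = np.corrcoef(x, y)[0, 1]   # sign gives direction, magnitude gives strength
    print(round(r, 2))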

24
Q

What does an r value between 0 and +1 mean?

A

positive relationship

25
Q

What does an r value between -1 and 0 mean?

A

negative relationship

26
Q

Outline the strength of a relationship when the value lies between 0.00 and 0.25

A

little or no relationship

27
Q

Outline the strength of a relationship when the value lies between 0.25 and 0.50

A

fair degree of relationship

28
Q

Outline the strength of a relationship when the value lies between 0.50 and 0.75

A

moderate to good relationship

29
Q

Outline the strength of a relationship when the value is greater than 0.75

A

good to excellent relationship

30
Q

What statistical test should be used for categorical data?

A
  • kappa statistic
  • weighted kappa
  • Spearman's rho correlation
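
A minimal sketch of the statistics listed above for two raters' categorical scores, assuming scikit-learn and SciPy are available; the ratings are made up for illustration:

    from sklearn.metrics import cohen_kappa_score
    from scipy.stats import spearmanr

    # hypothetical ordinal ratings of the same clients by two raters
    rater_a = [1, 2, 2, 3, 3, 1, 2]
    rater_b = [1, 2, 3, 3, 2, 1, 2]

    kappa = cohen_kappa_score(rater_a, rater_b)                      # kappa statistic
    w_kappa = cohen_kappa_score(rater_a, rater_b, weights="linear")  # weighted kappa
    rho, _ = spearmanr(rater_a, rater_b)                             # Spearman's rho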
31
Q

What statistical test should be used for continuous data?

A
  • intraclass correlation (ICC)
  • Pearson's correlation
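
For continuous data, Pearson's r can be computed with SciPy (a sketch with hypothetical scores); an ICC needs a dedicated routine, for example pingouin's intraclass_corr, if that package is available:

    from scipy.stats import pearsonr

    # hypothetical continuous scores (e.g. seconds to complete a task) on two occasions
    trial_1 = [34.2, 40.1, 28.5, 55.0, 47.3]
    trial_2 = [33.8, 41.0, 30.2, 53.5, 48.0]

    r, _ = pearsonr(trial_1, trial_2)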

32
Q

How do you calculate the reliability parameter?

A

variability between study objects divided by the sum of (variability between study objects + variability between raters + measurement error)
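
With illustrative numbers: if variability between study objects is 8, variability between raters is 1 and measurement error is 1, the reliability parameter is 8 / (8 + 1 + 1) = 0.8.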

33
Q

When does the reliability parameter approach 1?

A

if measurement error is small compared to the variability between persons

34
Q

What are agreement statistics?

A

The degree to which scores taken on one occasion are different to scores from another occasion

35
Q

What is the formula of absolute agreement?

A

the proportion of occasions on which score 1 equals score 2, expressed as a percentage (perfect agreement = 100%)
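
A minimal sketch of a percent-agreement calculation with hypothetical paired scores:

    # hypothetical scores given to the same clients on two occasions (or by two raters)
    score_1 = [2, 3, 3, 1, 2, 2]
    score_2 = [2, 3, 2, 1, 2, 3]

    matches = sum(a == b for a, b in zip(score_1, score_2))
    percent_agreement = 100 * matches / len(score_1)   # 4 of 6 pairs match, ~66.7%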

36
Q

What are limits of agreement?

A

assessment of the variability between scores
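
These are commonly calculated as Bland-Altman limits of agreement: the mean difference between paired scores plus or minus 1.96 times the standard deviation of those differences (the Bland-Altman framing and the numbers below are illustrative, not from the card):

    import numpy as np

    # hypothetical paired scores from two measurement occasions
    occasion_1 = np.array([12.0, 15.5, 11.2, 18.0, 14.3])
    occasion_2 = np.array([12.4, 15.0, 12.1, 17.2, 14.8])

    diff = occasion_1 - occasion_2
    lower = diff.mean() - 1.96 * diff.std(ddof=1)   # lower limit of agreement
    upper = diff.mean() + 1.96 * diff.std(ddof=1)   # upper limit of agreement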

37
Q

When are agreement statistics used?

A

when examining inter-rater and intra-rater reliability

38
Q

What is the standard error of measurement?

A

estimated amount of error in a measurement
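
Using the common formula SEM = SD × √(1 - reliability) with illustrative numbers: if SD = 10 and reliability = 0.91, SEM = 10 × √0.09 = 3.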

39
Q

What is internal consistency?

A

degree to which all test items measure the same construct

40
Q

What research is needed for internal consistency?

A
  • a range of items in the test is administered to a sample
  • correlation between items is assessed
  • correlation between items and overall test score (item-total correlation)
41
Q

What are acceptable levels of internal consistency?

A
  • Cronbach's alpha between 0.7 and 0.9
  • Rasch analysis showing unidimensionality
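
A minimal sketch of a Cronbach's alpha calculation, assuming NumPy is available and using made-up item responses:

    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """items: rows are respondents, columns are test items."""
        k = items.shape[1]
        item_var = items.var(axis=0, ddof=1).sum()   # sum of item variances
        total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
        return (k / (k - 1)) * (1 - item_var / total_var)

    # hypothetical responses: 4 respondents x 3 items on a 1-5 scale
    responses = np.array([[3, 4, 3],
                          [2, 2, 3],
                          [4, 5, 4],
                          [1, 2, 1]])
    alpha = cronbach_alpha(responses)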

42
Q

What is the importance of good test-retest reliability?

A

if the score changes when the person’s ability has not, we can’t be confident we can measure change when it occurs

43
Q

What research evidence is needed for test-retest reliability?

A
  • tests are administered on two or more occasions
  • the time between administrations is not too short (subjects remember the test) and not too long (their ability may have changed)
  • same clients and raters are used each time
44
Q

What are acceptable levels of test-retest reliability?

A

ICC >0.7

Kappa or weighted kappa >0.7

45
Q

Why is inter-rater reliability important?

A

Important when clients must change services or when more than one therapist sees the client

46
Q

What research is needed for inter-rater reliability?

A
  • studies of 50-100 clients and >5 raters
  • therapists score the same performances of the test independently of one another

47
Q

What is an acceptable inter-rater reliability?

A

ICC >0.7

kappa or weighted kappa >0.7

48
Q

What research is needed to determine intra-rater reliability?

A
  • 50-100 clients and >1 rater
  • time between tests is usually brief or, if possible, the same performance is assessed

49
Q

What affects intra-rater reliability?

A

experience of the rater using the test

50
Q

What are acceptable levels of intra-rater reliability?

A

ICC >0.7

Kappa or weighted kappa >0.7

51
Q

What is parallel form?

A

correlation between scores for the same person on two or more forms of the test

52
Q

What are acceptable levels of parallel forms?

A

ICC >0.8