Module 6: Reliability and Validity Flashcards

1
Q

Reliability

A

Reliability refers to the consistency or stability of a measuring instrument. In other words, the measuring instrument must measure exactly the same way every time it is used.

2
Q

Systematic errors

A

Problems that stem from the experimenter and the testing situation.

3
Q

Trait errors

A

Problems that stem from the participants. Were they truthful? Did they feel well?

4
Q

True score

A

The true score is what the score on the measuring instrument would be if there were no error.

5
Q

Error score

A

The error score is any measurement error (systematic or trait).

6
Q

Observed score

A

The score recorded for a participant on the measuring instrument used.

7
Q

Conceptual formula for observed score

A

Observed score = True score + Error score

8
Q

Random errors

A

Errors in measurement that lead to measured values being inconsistent when repeated measurements of a constant attribute or quantity are taken.

9
Q

Conceptual formula for reliability

A

Reliability = True score / (True score + Error score)
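As a quick numeric illustration of the two conceptual formulas (observed = true + error, and reliability as the true-score share of the observed score), here is a minimal Python sketch with made-up numbers:

```python
# Illustrative numbers only (hypothetical, not from a real measurement).
true_score = 90.0   # what the score would be if there were no error
error_score = 10.0  # systematic + trait error combined

# Observed score = True score + Error score
observed_score = true_score + error_score
print(observed_score)  # 100.0

# Reliability = True score / (True score + Error score)
reliability = true_score / (true_score + error_score)
print(reliability)  # 0.9
```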

10
Q

Correlation coefficients

A

A correlation coefficient measures the degree of relationship between two sets of scores and can vary between -1.00 and +1.00. The stronger the relationship between the variables, the closer the coefficient is to either -1.00 or +1.00.
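A correlation coefficient can be computed from two lists of scores. Below is a small self-contained Python sketch (Pearson's r, with hypothetical scores):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two sets of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical data: high scores on one variable go with high scores on the other,
# so the coefficient comes out close to +1.00 (a strong positive correlation).
print(pearson_r([10, 12, 14, 16, 18], [2, 4, 5, 7, 9]))
```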

11
Q

Positive correlation

A

A positive correlation indicates a direct relationship between variables: When we see high scores on one variable, we tend to see high scores on the other.

Graph going from left bottom to right top

12
Q

Negative correlation

A

A negative correlation indicates an inverse, or negative, relationship: High scores on one variable go with low scores on the other and vice versa.

Graph going from left top to right bottom

13
Q

Rules-of-thumb correlation coefficient

A
.00-.29 = none to weak
.30-.69 = moderate
.70-1.00 = strong
14
Q

Types of reliability

A
  • Test/retest
  • Alternate forms
  • Split-half
  • Interrater reliability
15
Q

Test/retest reliability

A

One of the most often used and obvious ways of establishing reliability is to repeat the same test on a second occasion. A high correlation between the scores on the two occasions indicates high test/retest reliability.

16
Q

Practice effects

A

Some people get better at the second testing, and this practice lowers the observed correlation.

17
Q

Alternate form reliability

A

Using alternate forms of the testing instrument and correlating the performance of individuals on the two different forms.

18
Q

Split-half reliability

A

You split a single test into two halves (e.g. odd versus even items) and correlate participants' scores on the two halves.

19
Q

Interrater reliability

A

Here you test how consistent the assessments of two or more raters or judges are.

Con: the reliability between the raters themselves must also be established.

20
Q

Conceptual formula interrater reliability

A

Interrater reliability = (Number of agreements / Number of possible agreements) x 100
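The formula above translates directly into code. Here is a minimal Python sketch using hypothetical yes/no judgments from two raters:

```python
# Hypothetical yes/no judgments by two raters over ten observations.
rater_a = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
rater_b = ["yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes", "yes"]

agreements = sum(a == b for a, b in zip(rater_a, rater_b))
possible_agreements = len(rater_a)

# Interrater reliability = (Number of agreements / Number of possible agreements) x 100
interrater_reliability = agreements / possible_agreements * 100
print(interrater_reliability)  # 80.0 -> the judges agreed on 8 of the 10 observations
```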

21
Q

Cronbach’s alpha

A

A measure of internal consistency: a kind of average correlation between the items.

(I.e. checking how strongly the items of one test correlate with one another, which indicates whether they measure the same construct.)
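Cronbach's alpha can be computed from per-item scores as alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal Python sketch with made-up data (three items, four participants):

```python
def variance(xs):
    """Population variance of a list of scores."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def cronbach_alpha(items):
    """items: one list of scores per test item (all answered by the same participants)."""
    k = len(items)
    n_participants = len(items[0])
    # Each participant's total score across all items.
    total_scores = [sum(item[p] for item in items) for p in range(n_participants)]
    return k / (k - 1) * (1 - sum(variance(item) for item in items) / variance(total_scores))

# Hypothetical answers of four participants to three items of one test.
items = [
    [3, 4, 2, 5],  # item 1
    [3, 5, 1, 4],  # item 2
    [2, 4, 2, 5],  # item 3
]
print(round(cronbach_alpha(items), 2))  # 0.93
```

With these made-up scores alpha comes out above .80, which the rules of thumb classify as good reliability.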

22
Q

Rules-of-thumb Cronbach’s alpha

A
  • > .80: reliability = good.
  • .60 - .80: reliability = sufficient.
  • < .60: reliability = insufficient.
23
Q

Validity

A

Validity refers to whether a measuring instrument measures what it claims to measure, i.e. the extent to which the observations reflect the concept or construct under investigation.

24
Q

Differences validity and reliability

A
  • Reliability refers to observations (scores)
  • Validity refers to conclusions based on observations.
  • Reliability concerns random measurement error.
  • Validity issues have to do with systematic error. E.g. our policemen are only 90 meters apart instead of 100, so the observation does not reflect what we want to measure.

E.g. John scores higher on the IQ-test than Peter. Reliability: are we sure their true scores are different? Validity: Is John more intelligent than Peter?

25
Q

Statistically significant

A

What is important for validity coefficients is that they are statistically significant at the .05 or .01 level (i.e. p < .05 or p < .01).

26
Q

7 types of validity

A
  • Content validity
  • Face validity
  • Criterion validity
    • Concurrent validity
    • Predictive validity
  • Construct validity
  • Statistical Conclusion validity
  • Internal validity
  • External validity
    • Population validity
    • Ecological validity
27
Q

Content validity

A

Looks at the content of tests. Does it cover a representative sample of the domain you are researching?

28
Q

Face validity

A

Face validity is whether or not a test looks valid on its surface (not the content!). Does the operationalization appear to be valid at first glance?

29
Q

Criterion validity

A

Criterion validity measures how accurately an instrument predicts a behavior or ability. There are two types of criterion validity.
• Concurrent validity is used to estimate present performance. Is the test for bipolar disorder good at distinguishing people with and without depression?
• Predictive validity is used to estimate future performance. Is the personality test a good predictor for study success?

30
Q

Construct validity

A

Assesses the extent to which a measuring instrument accurately measures a theoretical construct or trait that it is designed to measure.

Some examples of theoretical constructs or traits are verbal fluency, neuroticism, depression, anxiety, intelligence, and scholastic aptitude. Can the conclusions that were drawn actually be supported by the research that was done?

31
Q

(Statistical) Conclusion validity

A

Do the observations allow for the conclusion that variables are related?

32
Q

Internal validity

A

Does the operationalization allow for the conclusion that variables are causally related?

33
Q

External validity

A

Extent of generalizability of the conclusions.
• Population validity: does the sample allow for conclusions about the target population?
• Ecological validity: does the procedure followed in the study allow for conclusions about more natural circumstances?

34
Q

Cons of test/retest-reliability

A

Con:

  • Practice effects
  • Individuals may remember how they answered previously, both correctly and incorrectly. In this case we may be testing their memories and not the reliability of the testing instrument.
35
Q

Cons alternate form reliability

A

Con:

  • Difficult to make them parallel: same number of items, difficulty, etc.
  • Practice effects (not as much as test/retest)
36
Q

Con split-half reliability

A

Con:

  • Establishes the reliability of the test itself, but not its stability over time (the test is not taken twice in its entirety).
  • Difficult to divide the items equally.