Lecture 3 Flashcards
(25 cards)
In science, need to make ‘good’ measurements of our phenomena
Theory is made up of... Whereas observations are... Constructs are... We understand constructs through... This representation is called...
Can’t measure constructs directly because…
constructs
based on data
hypothetical because we can’t measure them directly.
capturing data that represent constructs.
operationalisation
they are hypothetical, so have to measure them indirectly through variables.
Example: does savoring increase wellbeing?
Construct = well-being
Operationalisation = 5-item scale, number of smiles, brain scan
Savoring → Well-being (unmeasured constructs) =
↓ ↓
Self-report → Subject Happiness (observed variables representing constructs)
Savoring is a set of strategies used to maximize the positive emotions/feelings that we have around a pleasant event (e.g., graduation) – what we do to amplify and magnify the pleasure
theoretical association between constructs
empirical association based on data
Three Types of Operationalisation
- Self-Report
- Observational
- Physiological
Measurements of constructs (known as variables) can be better or worse.
We want to conduct research based on variables that are reliable and valid.
Why?
Reliability:
Example: thought experiment.
three self-report measures and want to determine reliability over 3 months.
Which of these three will be the most reliable?
- Mood (“how happy do you feel right now?”)
- Gender (select among these: “male; female; other”)
- Optimism (tell us “do you feel that things will work out for you?”)
Because then we have confidence that they are fairly representing the construct
a measurement tool that consistently generates a similar empirical estimate.
Gender
Test-retest Reliability
Stable demographic variables are…
Psychological variables rooted in personality (optimism) are…
Quickly changing and highly variable variables (mood) are…
How do we assess reliability?
So what’s a good test-retest reliability correlation over time?
the most reliable
intermediate in stability
the lowest in reliability
Most measures of test-retest reliability are correlations of scores for the same individuals at two or more points-in-time.
Depends on the measure
The value will depend on the time between test and retest, the length of the test, what is being measured, and the characteristics of the sample. Some traits are very stable. Others may show some change over time. Thus, there is no absolute value. It depends on the situation.
I would actually NOT want my mood measure to yield a high correlation over time (probably .20), whereas I would want gender to be very high (maybe .98) and optimism to be intermediate (maybe .50 over 3 months, .70 over 1 month).
So, it depends.
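A minimal sketch of test-retest reliability as a correlation between two time points. The scores below are made up for illustration and the helper name `pearson_r` is an assumption, not something from the lecture.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical optimism scores (1-5 scale) for the same six people,
# measured at time 1 and again three months later.
time1 = [3.0, 4.5, 2.0, 5.0, 3.5, 1.5]
time2 = [3.5, 4.0, 2.5, 4.5, 3.0, 2.5]

test_retest_r = pearson_r(time1, time2)
print(round(test_retest_r, 2))
```

The same individuals must appear in the same order in both lists; whether the resulting value counts as "good" still depends on what is being measured and on the time gap.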
If someone asks you “What is the reliability of your scale?” what do you say?
Measures of internal reliability capture the…
Example:
6 items for a subscale of grit (persistence of effort)
e.g. “setbacks don’t discourage me. I don’t give up easily.”
I would say: “What type of reliability?” There are two:
Test-retest reliability, correlation over time for the same individuals, vs. Internal reliability (e.g., Cronbach’s alpha)
average level of intercorrelation among all of the items.
SPSS
Scale > Reliability Analysis
Cronbach’s Alpha    Interpretation
.90 to 1.00         Excellent
.80 to .90          Good
.70 to .80          Acceptable
.60 to .70          Marginal
.50 to .60          Poor
Below .50           Unacceptable
Measures with an alpha between .60 and .70 have been used, but in those cases the reader should be warned that the scale might not be sufficiently internally reliable.
Basis for a good Cronbach’s Alpha:
There is an algebraic equation that combines number of items, average variance, and average covariance to…
More items boost the…
NOTES:
More items in the scale = higher Cronbach’s alpha
Items need to correlate with each other well to get a good alpha
come up with the final numerical value.
alpha, and a higher average correlation among the items helps too. You want to remove poor items.
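A rough sketch of the algebraic equation, assuming it is the standard form alpha = N * c / (v + (N - 1) * c), where N is the number of items, v the average item variance, and c the average inter-item covariance. The function name and the input values are illustrative assumptions.

```python
def alpha_from_averages(n_items, avg_variance, avg_covariance):
    """Cronbach's alpha from item count, average variance, average covariance."""
    return (n_items * avg_covariance) / (
        avg_variance + (n_items - 1) * avg_covariance
    )

# Hold the average variance (1.0) and average covariance (0.4) fixed
# and vary only the number of items: alpha rises as items are added.
for n in (3, 6, 12):
    print(n, round(alpha_from_averages(n, 1.0, 0.4), 2))
```

Raising the average covariance with the item count held fixed increases the value as well, which is why items need to correlate well with each other.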
Poor Items
Removing poor items will improve the overall alpha: it shortens the scale and improves internal reliability.
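A hedged sketch of how a poor item can be spotted by recomputing alpha with each item left out, similar in spirit to the "alpha if item deleted" output of statistics packages. The responses are invented, and the function names are assumptions for illustration.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def alpha_if_item_deleted(rows, drop):
    """Alpha recomputed after removing the item at index `drop`."""
    kept = [[s for i, s in enumerate(row) if i != drop] for row in rows]
    return cronbach_alpha(kept)

# Made-up data: six respondents, four items. The last item does not
# track the other three, so it behaves like a poor item.
responses = [
    [5, 4, 5, 2],
    [4, 4, 4, 5],
    [2, 1, 2, 3],
    [3, 3, 2, 1],
    [1, 2, 1, 4],
    [4, 5, 4, 2],
]

print(round(cronbach_alpha(responses), 2))
for i in range(4):
    print(i, round(alpha_if_item_deleted(responses, i), 2))
```

An item whose removal raises alpha noticeably is a candidate for deletion; in this toy data that is the fourth item.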
Recap
Test-retest reliability tells you…
Low reliability can indicate that the scale is…
Internal reliability tells you…
A high Cronbach’s alpha indicates that…
A good scale will evidence…
whether the scale yields similar numerical values for the same individuals over time.
psychometrically poor OR it might indicate that your phenomenon is just inherently unstable.
how internally consistent the items of the measure are.
the items on the scale tend to correlate with each other to a high degree.
reasonable stability over time, and it will be internally consistent (α above .70).
Validity
want our scale to measure what we intend it to measure
(1) Content Validity:
Do the items in the scale relate to or tap the overall construct?
Does the following item assess grit:
“Setbacks don’t discourage me. I don’t give up easily”? How about “Is it easy to get out of bed in the morning?”?
whether items on the scale measure the intended construct
e.g., in the grit scale want items that capture persistence of effort
(2) Criterion Validity:
So for the grit scale, would it predict success in a job or school?
To what extent does the scale predict expected outcomes?
(3) Construct Validity:
The hypothetical construct of “grit” is defined as perseverance of effort and passion to achieve long-term goals. Consequently the scale should measure “perseverance” and “motivation to achieve goals”.
To what extent does the scale measure the intended hypothetical construct?
NOTE:
replicability of its ability to capture the construct and generalize
(4) Convergent Validity:
measures the extent to which the scale in question…
e.g., scores from the Grit Scale should correlate with scores from persistence scales, as well as with hardiness, resilience, ambition, and self-control.
correlates with scales that assess something similar.
(5) Discriminant Validity:
measures the extent to which a scale does…
We’re looking for a…
A scale measuring laziness would be negatively related to grit, and that would be an example of convergent validity. One would expect IQ to be unrelated to grit; if future research supports this, that would indicate discriminant validity.
NOT correlate with scales that are expected to be unrelated.
non-significant correlation, not a negative correlation.
Why are reliability and validity good?
We want ‘good scales’, and these are defined as…
We want our scales to reliably produce…
We want our scales to measure…
If scales demonstrate good validity (all five types), then…
scales possessing reliability and validity.
a similar score for the same individuals for attributes that don’t change much (e.g., religious affiliation) and those that change moderately (optimism).
what they are intended to measure
we are confident that they measure what we intended.
In practice, what does this mean? If using a pre-existing scale, need to be assured that:
1. internal reliability is...
2. test-retest reliability is...
3. items of the scale...
4. it has been shown to...
5. it has been shown to...
6. it does not correlate...
at least acceptable
good
capture the intended construct (content validity)
predict expected outcomes (criterion validity)
correlate with similar scales (convergent validity)
with dissimilar scales (discriminant validity, also called divergent validity)
CONSTRUCT VALIDITY
A scale demonstrates good construct validity if…
Construct validity is the…
Only demonstrated through…
Does the grit scale predict successful achievement of goals for individuals living in the real world?
A measure that performs well in aggregations of numerous studies is likely to have good construct validity.
numerous studies evidence all types of validity.
highest-order, most abstract type of validity.
repeated demonstrations that the scale represents the intended construct in numerous and various contexts.
Types of Variables: Nominal
Nominal variables are composed of….
Gender is a…
Gender is not assessed as binary…
numerical values that indicate membership within a particular group.
classic nominal (also called categorical) variable because an individual typically falls into one of three groups: 0 (males); 1 (females); and 2 (Other)
but it’s still nominal.
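A brief sketch of why nominal codes are only group labels: averaging them would be meaningless, so the sensible summary is a count per group. The codes follow the 0/1/2 scheme above; the sample data and names are made up.

```python
from collections import Counter

# Numeric codes are labels for group membership, not quantities
GENDER_LABELS = {0: "male", 1: "female", 2: "other"}

codes = [0, 1, 1, 2, 0, 1, 1, 0]  # hypothetical responses

# Summarise a nominal variable with frequencies, not a mean
counts = Counter(GENDER_LABELS[c] for c in codes)
print(counts.most_common())
```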
Types of Variables: Ordinal
Ordinal variables are based on….
e.g., taste-testing four beers, rank them 1st to 4th.
Only feasible with…
rankings
relatively small groups of comparisons (e.g., fewer than 5). Can rank on any attribute.
Types of Variables: Interval
Interval (continuous) variables are…
e.g., resilience scores vary quite a lot over the 1-5 range if you have multiple Likert-type items
variables with numerous obtained numerical values between the maximum and minimum.
Types of Variables: Ratio
A ratio variable is similar to... but it has...
Height is on a ratio scale.
e.g., errors by a rat running a maze: the rat can conceivably commit zero errors.
ordinal and interval scales
a true zero point
Usually treated as identical to interval variables, but the minimum numerical value has a special meaning.