Ch. 4 Flashcards
(37 cards)
Measurement
Is the assignment of scores to individuals so that the scores represent some characteristic of the individuals.
Psychometrics
A subfield of psychology concerned with the theories and techniques of psychological measurement.
The important point here is that measurement does not require any particular instruments or procedures. What it does require is some systematic procedure for assigning scores to individuals or objects so that those scores represent the characteristic of interest.
Constructs
Psychological variables that represent an individual’s mental state or experience, often not directly observable, such as personality traits, emotional states, attitudes, and abilities.
Psychological constructs cannot be observed directly.
One reason is that they often represent tendencies to think, feel, or act in certain ways.
Another reason psychological constructs cannot be observed directly is that they often involve internal processes.
Conceptual definition
Describes the behaviors and internal processes that make up a psychological construct, along with how it relates to other variables.
Operational definition
A definition of the variable in terms of precisely how it is to be measured.
For any given variable or construct, there will be multiple operational definitions.
These measures generally fall into one of three broad categories:
- Self-report measures
- Behavioral measures
- Physiological measures
Self-report measures
Measures in which participants report on their own thoughts, feelings, and actions.
Behavioral measures
Measures in which some other aspect of participants’ behavior is observed and recorded
This is an extremely broad category that includes the observation of people’s behavior both in highly structured laboratory tasks and in more natural settings.
Physiological measures
Measures that involve recording any of a wide variety of physiological processes, including heart rate and blood pressure, galvanic skin response, hormone levels, and electrical activity and blood flow in the brain.
Converging operations
When psychologists use multiple operational definitions of the same construct—either within a study or across studies.
Levels of Measurement
Four categories, or scales, of measurement (i.e., nominal, ordinal, interval, and ratio) that specify the types of information that a set of scores can have, and the types of statistical procedures that can be used with the scores.
The levels of measurement are important for at least two reasons.
First, they emphasize the generality of the concept of measurement.
- Although people do not normally think of categorizing or ranking individuals as measurement, these activities count as measurement as long as the scores are assigned so that they represent some characteristic of the individuals.
Second, the levels of measurement can serve as a rough guide to the statistical procedures that can be used with the data and the conclusions that can be drawn from them.
Interval- and ratio-level measurement are typically considered the most desirable because they permit any of the standard indicators of central tendency (mean, median, or mode) to be computed.
Also, ratio-level measurement is the only level that allows meaningful statements about ratios of scores.
Nominal level
A measurement used for categorical variables that involves assigning scores that are category labels.
Category labels communicate whether any two individuals are the same or different in terms of the variable being measured.
The essential point about nominal scales is that they do not imply any ordering among the responses.
Nominal scales thus embody the lowest level of measurement
Ordinal level
A measurement that involves assigning scores so that they represent the rank order of the individuals.
Ranks communicate not only whether any two individuals are the same or different in terms of the variable being measured but also whether one individual is higher or lower on that variable.
Ordinal scales thus allow comparisons of the degree to which two individuals rate the variable.
However, ordinal scales fail to capture important information that is present in the other levels of measurement.
In particular, the difference between two levels of an ordinal scale cannot be assumed to be the same as the difference between two other levels
(just like you cannot assume that the gap between the runners in first and second place is equal to the gap between the runners in second and third place).
In a satisfaction scale, for example, the difference between the responses “very dissatisfied” and “somewhat dissatisfied” is probably not equivalent to the difference between “somewhat dissatisfied” and “somewhat satisfied.” Nothing in the measurement procedure allows us to determine whether the two differences reflect the same difference in psychological satisfaction.
Statisticians express this point by saying that the differences between adjacent scale values do not necessarily represent equal intervals on the underlying scale giving rise to the measurements.
Interval level
A measurement that involves assigning scores using numerical scales in which intervals have the same interpretation throughout.
Interval scales do not have a true zero point, even if one of the scaled values happens to carry the name “zero.”
Ratio level
A measurement that involves assigning scores in such a way that there is a true zero point that represents the complete absence of the quantity.
You can think of a ratio scale as the three earlier scales rolled up in one.
Like a nominal scale, it provides a name or category for each object (the numbers serve as labels).
Like an ordinal scale, the objects are ordered (in terms of the ordering of the numbers).
Like an interval scale, the same difference at two places on the scale has the same meaning.
However, in addition, the same ratio at two places on the scale also carries the same meaning.
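A small Python sketch of the interval/ratio distinction, using temperature as an illustration: Celsius has an arbitrary zero (an interval scale), while kelvin has a true zero (a ratio scale). The specific temperature values are invented examples.

```python
# Two temperatures on an interval scale (Celsius, arbitrary zero)
# and on a ratio scale (kelvin, true zero = absence of thermal energy).
celsius_a, celsius_b = 10.0, 20.0
kelvin_a, kelvin_b = celsius_a + 273.15, celsius_b + 273.15

# Differences (interval statements) mean the same thing on both scales.
print(celsius_b - celsius_a)  # 10.0
print(kelvin_b - kelvin_a)    # 10.0

# Ratios are only meaningful on the scale with a true zero point.
print(celsius_b / celsius_a)  # 2.0 -- but 20 degrees C is not "twice as hot" as 10
print(kelvin_b / kelvin_a)    # ~1.035 -- a meaningful ratio on the kelvin scale
```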
Reliability
Refers to the consistency of a measure
Test-Retest Reliability
When researchers measure a construct that they assume to be consistent across time, the scores they obtain should also be consistent across time. Test-retest reliability is the extent to which this is actually the case.
Assessing test-retest reliability requires using the measure on a group of people at one time, using it again on the same group of people at a later time, and then looking at the test-retest correlation between the two sets of scores.
This is typically done by graphing the data in a scatterplot and computing the correlation coefficient.
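A minimal Python sketch of that procedure, assuming invented scores for the same eight people measured at two times; Pearson's r is used as the correlation coefficient.

```python
import numpy as np

# Hypothetical scores for the same eight people at two measurement occasions
# (e.g., one month apart); the values are invented for illustration.
time1 = np.array([22, 30, 25, 28, 35, 18, 27, 31])
time2 = np.array([24, 29, 27, 26, 34, 20, 28, 30])

# The test-retest reliability is the correlation between the two sets of scores.
r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest correlation: r = {r:+.2f}")
```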
Internal Consistency
The consistency of people’s responses across the items on a multiple-item measure.
Internal consistency can only be assessed by collecting and analyzing data.
One approach is to look at a split-half correlation.
Internal Consistency: split-half correlation
A score that is derived by splitting the items into two sets and examining the relationship between the two sets of scores in order to assess the internal consistency of a measure.
This involves splitting the items into two sets, such as the first and second halves of the items or the even- and odd-numbered items.
Then a score is computed for each set of items, and the relationship between the two sets of scores is examined.
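A minimal Python sketch of that procedure, assuming a hypothetical 10-item measure scored 1–5 and an odd/even split; all response values are invented.

```python
import numpy as np

# Hypothetical responses: 6 participants x 10 items, each scored 1-5.
responses = np.array([
    [4, 5, 4, 4, 5, 3, 4, 5, 4, 4],
    [2, 1, 2, 3, 2, 2, 1, 2, 2, 3],
    [3, 3, 4, 3, 3, 4, 3, 3, 4, 3],
    [5, 5, 5, 4, 5, 5, 4, 5, 5, 5],
    [1, 2, 1, 1, 2, 1, 2, 1, 1, 2],
    [3, 4, 3, 4, 3, 3, 4, 4, 3, 3],
])

# Score each person on the odd-numbered and even-numbered items separately.
odd_scores = responses[:, 0::2].sum(axis=1)
even_scores = responses[:, 1::2].sum(axis=1)

# The split-half correlation is the correlation between the two half-scores.
split_half_r = np.corrcoef(odd_scores, even_scores)[0, 1]
print(f"Split-half correlation: r = {split_half_r:+.2f}")
```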
Internal Consistency: Cronbach’s α
A statistic that measures internal consistency among items in a measure.
Conceptually, α is the mean of all possible split-half correlations for a set of items.
For example, there are 252 ways to split a set of 10 items into two sets of five.
Cronbach’s α would be the mean of the 252 split-half correlations.
Note that this is not how α is actually computed, but it is a correct way of interpreting the meaning of this statistic.
A value of +.80 or greater is generally taken to indicate good internal consistency.
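In practice, α is computed from item and total-score variances rather than by averaging all split-half correlations. A hedged Python sketch of that standard formula, using an invented response matrix and a helper function named here for illustration:

```python
import numpy as np

def cronbach_alpha(responses):
    """Cronbach's alpha for a (participants x items) matrix of scores."""
    k = responses.shape[1]                          # number of items
    item_vars = responses.var(axis=0, ddof=1)       # variance of each item
    total_var = responses.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-item measure answered by six participants (invented values).
responses = np.array([
    [4, 5, 4, 4, 5],
    [2, 1, 2, 3, 2],
    [3, 3, 4, 3, 3],
    [5, 5, 5, 4, 5],
    [1, 2, 1, 1, 2],
    [3, 4, 3, 4, 3],
])
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```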
Inter-rater Reliability
The extent to which different observers are consistent in their judgments.
Many behavioral measures involve significant judgment on the part of an observer or a rater.
Inter-rater reliability is often assessed using Cronbach’s α when the judgments are quantitative or an analogous statistic called Cohen’s κ when they are categorical.
Inter-rater reliability would also have been measured in Bandura’s Bobo doll study.
In this case, the observers’ ratings of how many acts of aggression a particular child committed while playing with the Bobo doll should have been highly positively correlated.
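As a hedged sketch, Cohen’s κ for two observers’ categorical judgments compares observed agreement with the agreement expected by chance. The codes below are invented for illustration; libraries such as scikit-learn also provide a ready-made cohen_kappa_score.

```python
import numpy as np

# Hypothetical judgments by two observers of the same ten behaviors,
# coded "agg" (aggressive) or "not" (not aggressive); values are invented.
rater_a = np.array(["agg", "agg", "not", "agg", "not", "not", "agg", "not", "agg", "not"])
rater_b = np.array(["agg", "not", "not", "agg", "not", "not", "agg", "not", "agg", "agg"])

# Observed agreement: proportion of behaviors coded the same way by both observers.
p_observed = np.mean(rater_a == rater_b)

# Chance agreement: based on each observer's overall proportion of each code.
codes = np.unique(np.concatenate([rater_a, rater_b]))
p_chance = sum(np.mean(rater_a == c) * np.mean(rater_b == c) for c in codes)

# Cohen's kappa corrects observed agreement for agreement expected by chance.
kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"Cohen's kappa = {kappa:.2f}")
```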
Validity
The extent to which the scores from a measure represent the variable they are intended to measure.
A measure can be extremely reliable but have no validity whatsoever.
Face Validity
The extent to which a measurement method appears, on superficial examination, to measure the construct of interest.
Most people would expect a self-esteem questionnaire to include items about whether they see themselves as a person of worth and whether they think they have good qualities.
Although face validity can be assessed quantitatively (for example, by having a large sample of people rate a measure in terms of whether it appears to measure what it is intended to), it is usually assessed informally.
Face validity is at best a very weak kind of evidence that a measure is valid.
One reason is that it is based on people’s intuitions about human behavior, which are frequently wrong.
It is also the case that many established measures in psychology work quite well despite lacking face validity.
Content Validity
The extent to which a measure reflects all aspects of the construct of interest.
Content validity is not usually assessed quantitatively.
Instead, it is assessed by carefully checking the measurement method against the conceptual definition of the construct.
Criterion Validity
The extent to which people’s scores on a measure are correlated with other variables (known as criteria) that one would expect them to be correlated with.
A criterion can be any variable that one has reason to think should be correlated with the construct being measured, and there will usually be many of them.