Psychological Measurement Flashcards
(37 cards)
Reliability
Reliability refers to the consistency of a measure. Psychologists consider three types of consistency: over time (test-retest reliability), across items (internal consistency), and across different researchers (inter-rater reliability).
Validity
Validity is the extent to which the scores from a measure represent the variable they are intended to represent.
What is measurement?
Measurement is the assignment of scores to individuals so that the scores represent some characteristic of the individuals.
Psychological measurement is often referred to as
psychometrics
What are psychological constructs?
We cannot accurately assess people’s level of intelligence by looking at them, and we certainly cannot put their self-esteem on a bathroom scale. These kinds of variables are called constructs (pronounced CON-structs) and include personality traits (e.g., extraversion), emotional states (e.g., fear), attitudes (e.g., toward taxes), and abilities (e.g., athleticism).
The conceptual definition of a psychological construct…
The conceptual definition of a psychological construct describes the behaviors and internal processes that make up that construct, along with how it relates to other variables.
An operational definition…
An operational definition is a definition of a variable in terms of precisely how it is to be measured.
Behavioural measures
Behavioral measures are those in which some other aspect of participants’ behavior is observed and recorded. This is an extremely broad category that includes the observation of people’s behavior both in highly structured laboratory tasks and in more natural settings.
Physiological measures
Physiological measures are those that involve recording any of a wide variety of physiological processes, including heart rate and blood pressure, galvanic skin response, hormone levels, and electrical activity and blood flow in the brain.
Converging operations
When psychologists use multiple operational definitions of the same construct—either within a study or across studies—they are using converging operations.
Levels of measurement
Levels of measurement (which the psychologist S. S. Stevens called “scales of measurement”) correspond to four types of information that can be communicated by a set of scores, and they determine the statistical procedures that can be used with that information.
Nominal level of measurement
The nominal level of measurement is used for categorical variables and involves assigning scores that are category labels. Category labels communicate whether any two individuals are the same or different in terms of the variable being measured. e.g. marital status
Ordinal level of measurement
The ordinal level of measurement involves assigning scores so that they represent the rank order of the individuals. Ranks communicate not only whether any two individuals are the same or different in terms of the variable being measured but also whether one individual is higher or lower on that variable. e.g. a researcher measuring consumer satisfaction might ask participants to rate their feelings as ‘very dissatisfied’, ‘somewhat dissatisfied’, or ‘satisfied’.
The interval level of measurement
The interval level of measurement involves assigning scores using numerical scales in which intervals have the same interpretation throughout. e.g. the Celsius temperature scale.
The ratio level of measurement
The ratio level of measurement involves assigning scores in such a way that there is a true zero point that represents the complete absence of the quantity. Height measured in meters and weight measured in kilograms are good examples.
Test-retest reliability
Test-retest reliability is the extent to which scores on a measure are consistent across time. For example, intelligence is generally thought to be consistent across time. A person who is highly intelligent today will be highly intelligent next week. This means that any good measure of intelligence should produce roughly the same scores for this individual next week as it does today.
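In practice, test-retest reliability is usually quantified as the correlation between scores from the two testing occasions. A minimal sketch using NumPy and made-up scores (the data here are hypothetical, not from the card):

```python
import numpy as np

# Hypothetical intelligence scores for the same five people, one week apart
time1 = np.array([102.0, 95.0, 118.0, 88.0, 110.0])
time2 = np.array([100.0, 97.0, 120.0, 85.0, 108.0])

# Test-retest reliability: Pearson correlation between the two occasions
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 3))
```

A value close to +1 indicates that people's rank ordering is stable from one occasion to the next, which is what a good measure of a stable construct should show.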
Internal consistency
Another kind of reliability is internal consistency, which is the consistency of people’s responses across the items on a multiple-item measure. In general, all the items on such measures are supposed to reflect the same underlying construct, so people’s scores on those items should be correlated with each other.
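A standard statistic for internal consistency (beyond what the card names) is Cronbach's alpha, computed from the item variances and the variance of the total scores. A sketch with hypothetical questionnaire data:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha. items: rows = respondents, columns = questionnaire items."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of respondents' total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-point ratings from six respondents on four items
ratings = np.array([
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
])
print(round(cronbach_alpha(ratings), 2))
```

When the items are highly intercorrelated, the total-score variance is large relative to the sum of the item variances, and alpha approaches 1.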
Split-half correlation
Like test-retest reliability, internal consistency can only be assessed by collecting and analyzing data. One approach is to look at a split-half correlation. This involves splitting the items into two sets, such as the first and second halves of the items or the even- and odd-numbered items. Then a score is computed for each set of items, and the relationship between the two sets of scores is examined.
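The split-half procedure described above can be sketched in a few lines of NumPy; the responses below are hypothetical, and the split shown is the even-/odd-numbered one mentioned in the card:

```python
import numpy as np

# Hypothetical responses: six respondents x eight items (1-5 scale)
responses = np.array([
    [4, 5, 4, 5, 4, 4, 5, 4],
    [2, 2, 3, 2, 2, 3, 2, 2],
    [3, 4, 3, 3, 4, 3, 3, 4],
    [5, 5, 5, 4, 5, 5, 4, 5],
    [1, 2, 1, 2, 1, 2, 2, 1],
    [3, 3, 4, 3, 3, 3, 4, 3],
])

# Split the items into odd- and even-numbered sets, score each half,
# then correlate the two half-scores across respondents
odd_scores = responses[:, 0::2].sum(axis=1)
even_scores = responses[:, 1::2].sum(axis=1)
r = np.corrcoef(odd_scores, even_scores)[0, 1]
print(round(r, 3))
```

A high correlation between the two half-scores suggests the items are measuring the same underlying construct.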
Inter-rater reliability
Inter-rater reliability is the extent to which different observers are consistent in their judgments. For example, if you were interested in measuring university students’ social skills, you could make video recordings of them as they interacted with another student whom they were meeting for the first time. Then you could have two or more observers watch the videos and rate each student’s level of social skills.
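For categorical judgments, one common inter-rater index (not named in the card) is Cohen's kappa, which corrects raw percent agreement for agreement expected by chance. A sketch with hypothetical ratings of ten students:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    n = len(rater_a)
    # Observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal proportions
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] / n * counts_b[c] / n for c in counts_a)
    return (observed - expected) / (1 - expected)

# Hypothetical categorizations of ten students' social skills by two observers
a = ["high", "med", "low", "high", "med", "med", "low", "high", "med", "low"]
b = ["high", "med", "low", "med", "med", "med", "low", "high", "high", "low"]
print(round(cohens_kappa(a, b), 2))
```

Kappa is 1 for perfect agreement and near 0 when the raters agree no more often than chance would predict.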
Face validity
Face validity is the extent to which a measurement method appears “on its face” to measure the construct of interest.
Content validity
Content validity is the extent to which a measure “covers” the construct of interest. For example, if a researcher conceptually defines test anxiety as involving both sympathetic nervous system activation (leading to nervous feelings) and negative thoughts, then his measure of test anxiety should include items about both nervous feelings and negative thoughts.
Criterion validity
Criterion validity is the extent to which people’s scores on a measure are correlated with other variables (known as criteria) that one would expect them to be correlated with. For example, people’s scores on a new measure of test anxiety should be negatively correlated with their performance on an important school exam.
Concurrent validity
When the criterion is measured at the same time as the construct, criterion validity is referred to as concurrent validity.
Predictive validity
When the criterion is measured at some point in the future (after the construct has been measured), it is referred to as predictive validity (because scores on the measure have “predicted” a future outcome).