Week 4 - PSYCHOLOGICAL MEASUREMENT Flashcards

Question

Describe Face validity

Answer 1

Face validity is the extent to which a measurement method **appears “on its face” to measure** the construct of interest. It can be assessed qualitatively or quantitatively (EXAMPLE: by having a LARGE SAMPLE of people rate a measure in terms of whether it appears to measure what it is intended to), but it is **usually assessed informally.** Face validity is at best a very WEAK kind of evidence that a measurement method is measuring what it is supposed to. One reason is that it is **based on people’s intuitions** about human behavior, which are frequently wrong. It is also the case that many established measures in psychology work quite well despite lacking face validity. EXAMPLE: **The Minnesota Multiphasic Personality Inventory-2 (MMPI-2)** measures many personality characteristics and disorders by having people decide whether each of **567 different statements applies to them**, where MANY of the statements **do not have any** obvious relationship to the construct that they measure. EXAMPLE: The items “I enjoy detective or mystery stories” and “The sight of blood doesn’t frighten me or make me sick” both measure the suppression of aggression. In this case, it is not the participants’ literal answers to these questions that are of interest, but rather whether the **pattern of the participants’ responses** to a series of questions matches those of individuals who tend to suppress their aggression.

Answer 2

Content validity is the extent to which a measure **“covers” the construct** of interest. EXAMPLE if a researcher conceptually defines test anxiety as involving both **sympathetic nervous system activation** (leading to nervous feelings) and **negative thoughts**, then his measure of test anxiety should include items about both nervous feelings and negative thoughts. Like face validity, content validity is not usually assessed quantitatively. Instead, it is assessed by **carefully checking the measurement method against the conceptual definition of the construct**.

Answer 3

Criterion validity is the extent to which people’s scores on a measure are **correlated with other variables** (known as criteria) that one would expect them to be correlated with. EXAMPLE: People’s SCORES on a measure of test anxiety should be NEGATIVELY correlated with their **performance** on an important school exam. A criterion can be any variable that one has **reason** to think should be **correlated** with the construct being measured. There will usually be MANY of them. **Concurrent validity**: the criterion is MEASURED at the **same time as the construct**. **Predictive validity**: the criterion is measured at some point in the FUTURE (after the construct has been measured), so scores on the measure have **“predicted” a future outcome**. **Convergent validity**: scores on a NEW measure are positively correlated with EXISTING established measures of the same construct. Assessing convergent validity requires **collecting data using the measure**. EXAMPLE: Researchers John Cacioppo and Richard Petty did this when they created their self-report **Need for Cognition Scale** to measure how much people value and engage in thinking (Cacioppo & Petty, 1982). In a series of studies, they showed that people’s scores were POSITIVELY correlated with their scores on a **standardized academic achievement test**, and that their scores were NEGATIVELY correlated with their scores on a measure of **dogmatism** (which represents a tendency toward obedience). In the years since it was created, the Need for Cognition Scale has been used in literally hundreds of studies and has been shown to be correlated with a wide variety of other variables, including the effectiveness of an advertisement, interest in politics, and juror decisions (Petty, Briñol, Loersch, & McCaslin, 2009).
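
EXTRA (worked example): The test anxiety example above comes down to computing a correlation and checking its sign. The sketch below is only an illustration: the scores are invented, and `pearson_r` is a hypothetical helper written here rather than a function from any particular statistics package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for six students (invented for illustration only).
test_anxiety = [10, 14, 18, 22, 26, 30]
exam_score = [88, 85, 80, 74, 70, 61]

r = pearson_r(test_anxiety, exam_score)
# A negative r fits the expected pattern for this criterion.
print(f"r = {r:.2f}")
```

The same calculation could serve for concurrent, predictive, or convergent validity; what changes is which criterion is measured and when.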

Answer 4

Discriminant validity, on the other hand, is the extent to which scores on a measure are **NOT correlated with measures of variables** that are conceptually distinct. EXAMPLE: Self-esteem is a **general attitude toward the self** that is fairly stable over time. It is NOT THE SAME as mood, so a measure of self-esteem should not be strongly correlated with a measure of mood. EXAMPLE: For the **Need for Cognition Scale**, Cacioppo and Petty found only WEAK (or no) correlations between people’s **need for cognition** and measures of (1) COGNITIVE STYLE (the extent to which they tend to think analytically by breaking ideas into smaller parts or holistically in terms of “the big picture”), (2) TEST ANXIETY, and (3) the tendency to respond in socially desirable ways.

Answer 5

Broadly speaking, there are four steps in the measurement process: (a) conceptually defining the construct, (b) operationally defining the construct, (c) implementing the measure, and (d) evaluating the measure. In this section, we will look at each of these steps in turn.

Answer 6

A CLEAR & COMPLETE conceptual definition of a construct allows you to make SOUND decisions about EXACTLY **how to measure** the construct. EXAMPLE: Memory is conceptualized as a set of semi-independent systems, so **PRECISION** is required. Long-term memory, working memory, and short-term memory require DIFFERENT conceptual definitions and DIFFERENT forms of measurement.

Answer 7

An operational definition is a definition of the variable in terms of precisely how it is to be measured. Observation is at the heart of the scientific method, so conceptual definitions of abstract concepts MUST BE TRANSFORMED into something that can be **directly observed and measured**. EXAMPLE: Stress could be operationally defined as people’s scores on the **Perceived Stress Scale** (Cohen, Kamarck, & Mermelstein, 1983), cortisol concentrations in their saliva, or the number of stressful life events they have recently experienced.

Answer 8

It is usually a good idea to use an existing measure that has been **used successfully** in previous research: (a) it saves time and trouble, (b) there is already evidence of validity, and (c) your results can more easily be compared with and combined with previous results. EXAMPLE: The Ten-Item Personality Inventory (TIPI) is a self-report questionnaire that measures all the Big Five personality dimensions with just 10 items (Gosling, Rentfrow, & Swann, 2003). It is **not as reliable or valid** as longer and more comprehensive measures, BUT a researcher might choose to use it when **testing time is severely limited**. (EXTRA INFO - JUST IN CASE - EXTRA READING) When an existing measure was created primarily for use in scientific research, it is usually described in detail in a published research article and is free to use in your own research, with a proper citation. You might find that later researchers who use the same measure describe it only briefly but provide a reference to the original article, in which case you would have to get the details from the original article. The American Psychological Association also publishes the Directory of Unpublished Experimental Measures and PsycTESTS, which are extensive catalogs/collections of measures that have been used in previous research. Many existing measures, especially those that have applications in clinical psychology, are proprietary. This means that a publisher owns the rights to them and that you would have to purchase them. These include many standard intelligence tests, the Beck Depression Inventory, and the Minnesota Multiphasic Personality Inventory (MMPI). Details about many of these measures and how to obtain them can be found in other reference books, including Tests in Print and the Mental Measurements Yearbook. There is a good chance you can find these reference books in your university library.

Answer 9

Creating Your Own Measure: when NO existing measure fits, you create your own and then need to evaluate its CONVERGENT validity. The following ISSUES in creating new measures apply equally to self-report, behavioral, and physiological measures. 1. Most new measures in psychology are really variations of existing measures, so you should still look to the research literature for ideas. EXAMPLE: The famous **Stroop task** (Stroop, 1935), in which people quickly name the colors that various color words are printed in, has been ADAPTED for the **study of social anxiety**. People high in social anxiety are slower at color naming when the words have negative social connotations such as “stupid” (Amir, Freshman, & Foa, 2002). 2. Strive for SIMPLICITY. Create a set of CLEAR instructions using SIMPLE LANGUAGE that you can present in writing or read aloud (or both). It is also a good idea to include one or more practice items so that participants can become familiar with the task. 3. Strive for BREVITY. This, however, needs to be weighed against the fact that it is nearly always better for a measure to include MULTIPLE items rather than a single item. There are two reasons for this. One is a matter of **content validity.** MULTIPLE items are often required to **cover a construct adequately.** The other is a matter of **reliability.** People’s responses to single items can be **influenced by all sorts of irrelevant factors.** Remember, however, that multiple items must be structured in a way that allows them to be **combined into a single overall score by summing or averaging.**
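
EXTRA (worked example): Combining multiple items into a single overall score by averaging can be sketched as below. The function name `score_scale`, the 1 to 5 rating range, and the handling of reverse-scored items are all assumptions added for illustration; none of them come from the card itself.

```python
def score_scale(responses, reverse_items=(), min_rating=1, max_rating=5):
    """Average one participant's item ratings into a single overall score.

    responses     -- list of ratings, one per item
    reverse_items -- zero-based positions of items worded in the opposite
                     direction (an assumption; not every scale has these)
    """
    adjusted = []
    for index, rating in enumerate(responses):
        if index in reverse_items:
            # Flip the rating so a high value always means "more of the construct".
            rating = max_rating + min_rating - rating
        adjusted.append(rating)
    return sum(adjusted) / len(adjusted)


# Hypothetical answers to a four-item scale; items 1 and 3 are reverse-scored.
answers = [4, 2, 5, 1]
overall = score_scale(answers, reverse_items={1, 3})
print(overall)
```

Summing instead of averaging would only change the last line of the function. Averaging keeps the overall score on the same range as the individual ratings.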

Answer 10

You will want to implement any measure in a way that MAXIMIZES its **reliability and validity.** In most cases, it is best to TEST everyone under **similar conditions.** Be aware also that **people can react in a variety of ways** to being measured that REDUCE the reliability and validity of the scores. Disagreeable participants may disrupt the study. Agreeable participants may respond in socially desirable ways or in line with what they think is EXPECTED. EXAMPLE OF BUILT-IN DEMAND CHARACTERISTICS: A participant whose attitude toward exercise is measured immediately after she is asked to read a passage about the dangers of heart disease might reasonably conclude that the passage was meant to improve her attitude, so she may respond favourably. The researcher’s own expectations can also cause BIAS. Precautions to minimize reactivity: keep procedures as clear and brief as possible; guarantee **anonymity**; in group tests, seat participants FAR AWAY from each other; give everyone the same pens and paper; use blind testing to minimize bias; STANDARDIZE ALL INTERACTIONS.

Answer 11

Once you have used your measure on a sample of people and have a set of scores, you are in a position to evaluate it more thoroughly in terms of reliability and validity. Even if the measure has been used extensively by other researchers and has already shown evidence of reliability and validity, you should not assume that it worked as expected for your particular sample and under your particular testing conditions. Regardless, you now have additional evidence bearing on the reliability and validity of the measure, and it would make sense to add that evidence to the research literature. In most research designs, it is not possible to assess test-retest reliability because participants are tested at only one time. For a new measure, you might design a study specifically to assess its test-retest reliability by **testing the same set of participants at two separate times**. In other cases, a study designed to answer a different question still allows for the assessment of test-retest reliability. For example, a psychology instructor might measure his students’ attitude toward critical thinking using the same measure at the beginning and end of the semester to see if there is any change. Even if there is no change, he could still look at the correlation between students’ scores at the two times to assess the measure’s test-retest reliability. It is also customary to assess **internal consistency** for any multiple-item measure, usually by looking at a **split-half correlation or Cronbach’s α.** Criterion validity can be assessed in various ways. For example, if your study included more than one measure of the same construct or measures of conceptually distinct constructs, then you should look at the correlations among these measures to be sure that they fit your expectations. Note that a successful experimental manipulation also provides evidence of criterion validity.
Recall that MacDonald and Martineau manipulated participants’ moods by having them think either positive or negative thoughts, and after the manipulation, their mood measure showed a distinct difference between the two groups. This simultaneously provided evidence that their mood manipulation worked and that their mood measure was valid. But what if your newly collected data cast doubt on the reliability or validity of your measure? The short answer is that you have to ask WHY. It could be that there is something wrong with your measure or how you administered it. It could be that there is something wrong with your conceptual definition. It could be that your experimental manipulation failed. EXAMPLE: If a mood measure showed no difference between people whom you instructed to think positive versus negative thoughts, maybe it is because the participants did not actually think the thoughts they were supposed to or that the thoughts did not actually affect their moods. In short, it is “back to the drawing board” to revise the measure, revise the conceptual definition, or try a new manipulation.
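
EXTRA (worked example): The two internal consistency checks named in this answer, a split-half correlation and Cronbach’s α, can be sketched as follows. The data are invented, the helper names are hypothetical, and the odd/even split is just one common way of forming the halves; the α formula used is the standard one, α = k/(k − 1) × (1 − sum of item variances / variance of total scores).

```python
from math import sqrt
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def cronbach_alpha(item_scores):
    """item_scores: one list of k item ratings per participant."""
    k = len(item_scores[0])
    items = list(zip(*item_scores))  # regroup the ratings by item
    sum_item_vars = sum(variance(item) for item in items)
    total_var = variance([sum(person) for person in item_scores])
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)


def split_half_r(item_scores):
    """Correlate each participant's odd-item total with their even-item total."""
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    return pearson_r(odd_totals, even_totals)


# Hypothetical ratings: five participants, four items (illustration only).
data = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(data)
half_r = split_half_r(data)
print(f"Cronbach's alpha = {alpha:.2f}, split-half r = {half_r:.2f}")
```

Test-retest reliability would reuse `pearson_r` directly, with each participant’s overall score at time 1 in one list and at time 2 in the other.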

Answer 12

Demand characteristics are CUES that might **indicate the aim** of a study to participants. These cues can lead to participants CHANGING THEIR BEHAVIOUR OR RESPONSES based on what they think the research is about.

(36 cards)