scientific method
general procedures psychologists use for gathering and interpreting data
Define theory as it relates to research methods.
organized, testable explanation of phenomena
Other researchers must be able to replicate the results of an experiment to validate its conclusions.
What is replication?
obtaining similar results to a previous study using the same methods
What is hindsight bias?
explaining why something happened after it has occurred
What is a controlled experiment?
researchers systematically manipulate a variable and observe the response in a laboratory
hypothesis
prediction of how two or more factors are related
How do researchers specifically define what variables mean?
Researchers use operational definitions to precisely describe variables in relation to their study. For example, "effectiveness of studying" can be operationally defined with a test score.
What is the difference between an independent variable and a dependent variable in an experiment?
The factor being manipulated is the independent variable. The factor being measured is the dependent variable.
If we test the hypothesis that students who use Brainscape to study, rather than simple flash cards, will learn more (as measured by higher test scores), then what is the independent variable? What is the dependent variable?

independent: method of studying (Brainscape versus regular flashcards)

dependent: amount learned, as measured by their test scores
Define population as it relates to research methods.
all the individuals to which the study applies
Define sample as it relates to research methods.
subgroup of a population that constitutes participants of a study
What type of sample should be used in research?
A randomly selected, representative sample should be used. Larger sample sizes are preferable because they tend to be more representative of the population.
The amount of difference between the sample and population is called __________.
sampling error
Define random selection as it relates to research methods.
every individual from a population has an equal chance of being chosen for the sample
Which individuals are in the experimental group?
subjects who receive the treatment or manipulation of the independent variable
Which individuals are in the control group?
subjects who do not receive any treatment or manipulation
Subjects who receive the treatment are part of the __________, while those who do not receive the treatment belong to the __________.
experimental group; control group
What type of experimental design uses experimental and control groups?
A between-subjects design uses an experimental group and a control group to compare the effect of the independent variable.
Which process is used to try to ensure there are no preexisting differences between the control group and the experimental group?
Random assignment is used to assign the sample participants into groups (e.g., experimental drug or placebo).
Random assignment means neither the experimenter nor the participants decide in which group the participants will be, and each participant has an equal chance of being assigned to a given study group (e.g., treatment vs. placebo).
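Random assignment can be sketched in a few lines of Python. This is only an illustration: the participant IDs and the helper name randomly_assign are made up for the example. Shuffling the sample and splitting it in half gives every participant the same chance of ending up in either group.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two groups.

    Group sizes differ by at most one when the sample size is odd.
    """
    rng = random.Random(seed)      # the seed only makes the demo repeatable
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs, for illustration only
participants = [f"P{i:02d}" for i in range(1, 21)]
experimental, control = randomly_assign(participants, seed=42)
print("experimental:", experimental)
print("control:", control)
```

Because chance alone decides group membership, preexisting differences between participants should tend to balance out across the two groups.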
confounding variable

any difference between the experimental group and the control group, besides the effect of the independent variable

a.k.a. third variable

makes the phenomenon at hand even more difficult to study because of complex interaction effects
List four types of confounding variables.

experimenter bias

demand characteristics

placebo effect

lack of counterbalancing
Define experimenter bias as it relates to confounding variables.
Experimenter bias occurs when a researcher's expectations or preferences about the results of the study influence the experiment.
Define demand characteristics as they relate to confounding variables.
clues the participants discover about the intention of the study that alter their responses
Define placebo effect as it relates to confounding variables.
responding to an inactive drug with a change in behavior because the subject believes it contains the active ingredient
What is the Hawthorne effect?
individuals who are being experimented on behave differently than in their everyday life
What type of experimental design uses each participant as his/her own control?
A within-subjects design exposes each participant to the treatment and compares their pretest and posttest results. This design can also compare the results of two different treatments administered.
What is a single-blind procedure?
research design in which the subjects are unaware if they are in the control or experimental group
What is a double-blind procedure?
research design in which neither the experimenter nor the subjects are aware who is in the control or experimental group
Single-blind procedures aim to eliminate the effects of __________, while double-blind procedures use a third-party researcher to eliminate the effects of __________.
demand characteristics; experimenter bias
How are quasi-experiments different from controlled experiments?
Random assignment is not possible in quasi-experiments.
What types of research are considered quasi-experiments?
Differences in behavior between:

males and females

various age groups

students in different classes
correlational research

establishes a relationship between two variables

does not determine cause and effect

used to make predictions and generate future research
List three methods of data collection
 naturalistic observation
 surveys
 tests
Which two conditions must be met for an experiment to be considered a true experiment?
 the researcher manipulates the independent variable
 all participants are randomly assigned to the experimental and control condition
So, for instance, a study that compares how men versus women do on a given task would not be a true experiment because it is not possible to randomly assign people to a group (gender). (This example would be a quasi-experiment.)
Define naturalistic observation as it relates to correlational research.
Naturalistic observation consists of field observation of naturally occurring behavior, such as the way students behave in the classroom. There is no manipulation of variables.
What are surveys and why are they not always accurate?

type of correlational research

questionnaires and interviews given to a large group of people about their thoughts or behavior

individuals aim to be politically correct and socially accepted, leading them to give false answers
Define tests as they relate to correlational research.
research method that measures individual traits at a specific time and place
__________ studies start by looking at an effect and then attempt to determine the cause.
Ex post facto
What is the difference between the reliability and validity of a test?
•Reliable – consistent
When administered properly, does a test give similar results when used on different occasions?
•Valid – useful, meaningful
Does it measure what it claims to measure?
In order to be valid, a measure must be reliable. However, a measure can be reliable without being valid. For instance, imagine a scale that always reads 212 pounds, no matter what the weight is of the person who stands on it. That scale would be a reliable measure, but not a valid measure.
What is a case study?

detailed examination of one person or a small group

beneficial for understanding rare and complex phenomena in clinical research

not always representative of the larger population
experiments
Strengths:
 determine cause and effect relationship between variables
 control over confounding variables
Weaknesses:
 limited real-world generalizability
 expensive
 time-consuming
correlational research
Strengths:
 easy to administer surveys or tests
 inexpensive
 minimal time needed
 substantial real-world generalizability
Weaknesses:
 no control over confounding variables
 skewed or biased results
 establishes a relationship, not causation
statistics
analysis of numerical data regarding representative samples
1) __________ data include measurements, such as scores on the Wisconsin Card Sorting Task (behavioral example) or scores on the Magical Ideation Scale (self-report example), that can be readily expressed using numbers.
2) __________ data, such as clinical interviews, can be very descriptive and rich, but are challenging and ambiguous to interpret.
1) Quantitative
2) Qualitative
What are the four scales of measurement?

nominal

ordinal

interval

ratio
nominal scale
Data that are categorical: Numbers have no meaning except for convenience as labels.
Examples:
Hair Color (possibly coded red = 1; grey = 2; black = 3; brown = 4; blond = 5...)
Political Party (possibly coded Democrat = 1; Republican = 2; Independent/Other = 3)
Gender (Male = 1; Female = 2; Prefer not to reply = 3).
ordinal scale
numbers are used as ranks
Examples:
The runner who wins the race is scored as 1, the runner who comes in second is scored as 2, the third is scored as 3, and so on.
interval scale
numbers that have a meaningful difference between them
Example:
Temperature: The difference between 10°F and 20°F is the same as between 30°F and 40°F.
ratio scale
numbers that have a meaningful ratio between them on a scale with a real zero point
Example:
Weight and height: If you weigh zero pounds, you have no weight. 100 pounds is twice as heavy as 50 pounds.
Would temperature in Celsius and Fahrenheit be measured on an interval scale or a ratio scale?
interval
If the temperature is 0°F, there is not "no temperature." There is not a meaningful ratio between values. 100°F is not twice as hot as 50°F.
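A quick way to check the "not twice as hot" claim is to convert both temperatures to Kelvin, a temperature scale that does have a true zero point and is therefore a ratio scale. The Kelvin comparison is an addition for illustration and is not part of the original card; the function name is made up.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert Fahrenheit (interval scale) to Kelvin (ratio scale, true zero)."""
    return (deg_f - 32) * 5 / 9 + 273.15

k50 = fahrenheit_to_kelvin(50)
k100 = fahrenheit_to_kelvin(100)
print(f"50°F = {k50:.2f} K, 100°F = {k100:.2f} K")
# If 100°F really were "twice as hot" as 50°F, this ratio would be 2.0
print(f"ratio on the Kelvin scale: {k100 / k50:.3f}")
```

On the ratio scale, 100°F turns out to be only about 10% "hotter" than 50°F, which is why ratios of Fahrenheit or Celsius values are not meaningful.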
What are descriptive statistics?
numbers that summarize a set of research data from a sample
frequency distribution
an orderly arrangement of scores indicating the frequency of each score
What is the difference between a histogram and a frequency polygon?
A histogram is a bar graph of the frequency of each score (or range of scores), while a frequency polygon is a line graph connecting points that mark those same frequencies.
central tendency
Measures of central tendency describe the most typical scores for a set of research data.
 mode
 median
 mean
mode
most frequently occurring score in the data set
median
the middle score when the data is ordered by size
mean
arithmetic average of the scores in the data set
If two scores appear most frequently, the distribution is __________, and if there are three or more appearing most frequently, it is __________.
bimodal; multimodal
Which measure of central tendency is the most representative? The least representative?

mean is usually most representative, unless there are extreme outliers that pull the mean in a particular direction

median is less sensitive to outliers, but is a weak statistic

mode is the least representative
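The effect of an outlier on each measure can be seen with a small made-up data set. This sketch uses Python's standard statistics module; the scores are invented for illustration.

```python
import statistics

scores = [70, 75, 75, 80, 85]
with_outlier = scores + [300]   # one extreme, made-up score

for label, data in (("original", scores), ("with outlier", with_outlier)):
    print(label,
          "| mean =", round(statistics.mean(data), 2),
          "| median =", statistics.median(data),
          "| mode =", statistics.mode(data))
```

The single extreme score drags the mean far above most of the data, while the median shifts only slightly and the mode does not move at all.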
normal distribution
a bell-shaped, symmetrical curve that represents data about many characteristics, including the distribution of many human characteristics
In a normal distribution, approximately two thirds of the population will be within plus or minus one standard deviation of the norm (mean). Approximately 95% of the population will be within plus or minus two standard deviations of the mean. Over 99% of the population will fall within plus or minus three standard deviations of the mean.
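These proportions can be checked against the standard normal curve. A minimal sketch using NormalDist from Python's standard statistics module (the helper name share_within is made up for the example):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {share_within(k):.1%}")
```

The results come out to roughly 68%, 95%, and 99.7%, matching the "two thirds, 95%, over 99%" rule above.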
When most of the scores are compacted on one side of the bell curve, the distribution is said to be __________.
skewed
Positively skewed distributions have most scores bunched at the low end with a tail of a few high values; negatively skewed distributions have most scores bunched at the high end with a tail of a few low values.
measures of variability
Measures of variability describe the dispersion of scores for a set of research data.
 range
 variance
 standard deviation
range
difference between the largest score and the smallest score
What do variance and standard deviation measure?
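how far scores typically fall from the mean of the data set (variance is the average squared difference between each score and the mean; standard deviation is the square root of the variance)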
Taller, narrow curves have less variance than short, wider curves.
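To make the measures of variability concrete, here is a sketch that computes range, variance, and standard deviation for two made-up data sets with the same mean. The population formulas (dividing by N) are an assumption for the example; sample formulas divide by N - 1 instead.

```python
import statistics

narrow = [48, 49, 50, 51, 52]   # scores bunched near the mean
wide = [30, 40, 50, 60, 70]     # same mean, more spread

def population_variance(data):
    """Mean of the squared deviations from the mean."""
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data) / len(data)

for label, data in (("narrow", narrow), ("wide", wide)):
    var = population_variance(data)
    print(label,
          "| range =", max(data) - min(data),
          "| variance =", var,
          "| SD =", round(var ** 0.5, 2),
          "| library check =", statistics.pvariance(data))
```

The "wide" data set has the larger range, variance, and standard deviation, which corresponds to the shorter, wider curve.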
What is a z score (a.k.a. standard score)?

allows for comparison between different scales

subtract mean from each score and divide by standard deviation

mean has a z score of zero
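The z score steps above can be sketched as follows. The raw scores are made up, and using the population standard deviation is an assumption for the example.

```python
import statistics

def z_scores(data):
    """Subtract the mean from each score, then divide by the SD."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)   # population standard deviation
    return [(x - mean) / sd for x in data]

raw = [60, 70, 80, 90, 100]   # made-up test scores
zs = z_scores(raw)
print([round(z, 2) for z in zs])
```

The score equal to the mean (80) gets a z score of zero, and scores above or below the mean get positive or negative z scores.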
percentile score
percentage of scores at or below a particular score between 1 and 99
Example:
If you are in the 70th percentile, 70% of the scores are the same as or below yours.
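A percentile score can be computed by counting. This sketch follows the "at or below" definition used above; the class scores and function name are made up for illustration.

```python
def percentile_rank(scores, score):
    """Percentage of scores that are at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

class_scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # made-up data
print(percentile_rank(class_scores, 85))
```

Here 7 of the 10 scores are at or below 85, so a score of 85 sits at the 70th percentile.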
Pearson correlation coefficient

statistical linear measure of the relationship between two sets of data

varies from -1 to +1

helps to make predictions about variables

r = +1: perfect positive correlation
direct relationship: as one variable increases or decreases, the other does the same

r = 0: no relationship

r = -1: perfect negative correlation
inverse relationship: as one variable increases or decreases, the other does the opposite
What type of graph plots single points to show the strength and direction of correlations?
scatterplot
What is the term for the line on a scatterplot that follows the trend of the points?
line of best fit or regression line
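Both the Pearson correlation coefficient and the line of best fit can be computed from the same sums of deviations. This is a minimal sketch with made-up data (hours studied and test scores); the function names are hypothetical, and the least-squares method is the usual way to fit the line but is not named in the cards.

```python
def deviation_sums(xs, ys):
    """Return mean of x, mean of y, and the sums Sxy, Sxx, Syy."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return mx, my, sxy, sxx, syy

def pearson_r(xs, ys):
    """Pearson correlation coefficient, between -1 and +1."""
    _, _, sxy, sxx, syy = deviation_sums(xs, ys)
    return sxy / (sxx * syy) ** 0.5

def best_fit_line(xs, ys):
    """Least-squares slope and intercept of the regression line."""
    mx, my, sxy, sxx, _ = deviation_sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx

hours = [1, 2, 3, 4, 5]          # made-up hours studied
scores = [52, 60, 71, 80, 87]    # made-up test scores
r = pearson_r(hours, scores)
slope, intercept = best_fit_line(hours, scores)
print(f"r = {r:.3f}; predicted score = {intercept:.1f} + {slope:.1f} * hours")
```

An r close to +1 means the points on the scatterplot fall close to an upward-sloping line; flipping the sign of one variable would give an r close to -1.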
What is the difference between a null and an alternative hypothesis?
Null hypotheses state that a treatment had no effect, while alternative hypotheses state the treatment did have an effect in the experiment.
What is the difference between a Type I and Type II error?
Type I errors, or false positives, occur if the researcher rejects a true null hypothesis. Type II errors, or false negatives, occur if the researcher fails to reject a false null hypothesis.
What is a p value?
The p value is the probability of obtaining results at least as extreme as those observed if the null hypothesis were true (i.e., if only chance were operating). The lower the p value, the less plausible it is that the findings are due to chance alone.
By convention, a finding is considered statistically significant when the p value is less than or equal to .05; in other words, results this extreme would occur 5% of the time or less by chance alone.
When is a finding statistically significant?
In psychology, a finding is conventionally considered statistically significant if the p value is less than or equal to the alpha level of 0.05 (a 1 in 20 probability of results this extreme arising by chance alone).
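One way to see what a p value means is a permutation test, sketched below. This is an illustrative technique chosen for the example, not a method named in the cards, and the scores, function name, and parameter names are all made up. If the null hypothesis is true, group labels are interchangeable, so the code shuffles the labels many times and counts how often chance alone produces a difference at least as large as the observed one.

```python
import random

ALPHA = 0.05  # conventional significance threshold

def permutation_p_value(group_a, group_b, n_resamples=5000, seed=0):
    """Estimate a two-sided p value for the difference in group means."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    return extreme / n_resamples

# Made-up test scores for a treatment group and a control group
treatment = [85, 88, 90, 92, 95, 91, 89, 94]
control = [70, 72, 75, 68, 74, 71, 73, 69]
p = permutation_p_value(treatment, control)
print("p =", p, "| significant:", p <= ALPHA)
```

With groups this clearly separated, almost no random relabeling reproduces the observed difference, so the estimated p value falls well below .05.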
What method statistically combines the results of several research studies to reach a conclusion?
meta-analysis
Why did the American Psychological Association (APA) implement ethical guidelines?

Guidelines were set in place in the late 20th century to stress responsibility and morality in research and clinical practice

Dangerous and inhumane experiments such as Harlow's rhesus monkeys, Zimbardo's prison role-playing, and Milgram's shock test led to the implementation of rules
What are the purposes of an Institutional Review Board (IRB)?
 approve research being conducted at their particular institution
 require participants give informed consent after hearing the risks and procedures
 require debriefing of participants afterward with results of research
 require humane and ethical treatment of animal and human subjects
__________ psychology is practical and designed for real world application, while __________ psychology is focused on research of fundamental principles and theories.
Applied; basic
Who founded the first psychology research lab?
Wilhelm Wundt
_______ was one of the first psychologists to demonstrate that one could study psychological processes using experimental psychology.
Hermann Ebbinghaus
Describe the work of Oswald Kulpe.
Kulpe was one of the earliest experimental psychologists who performed numerous experiments to prove his "imageless thought" to try and combat Titchener's work and prove that there were some thoughts that did not have images to be analyzed.
Who was the first psychologist to introduce mental testing to the United States?
James McKeen Cattell
Who created the first intelligence test and what was its initial purpose?
The first intelligence test was created by Simon and Binet in 1905 to identify French schoolchildren with intellectual disabilities who needed special education.
______ was a term developed by William Stern, which describes the ratio between someone's chronological and his/her mental age.
Intelligence quotient (IQ)
Who authored the Stanford-Binet Intelligence test?
Lewis Terman
If I were to test a population of people taking care to sample a proportionate amount to the actual composition of the group, which kind of sampling would I be using?
stratified random sampling
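Stratified random sampling can be sketched as "group the population by stratum, then randomly sample the same fraction from each stratum." The population, the strata labels, and the function name below are all made up for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Randomly sample the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)   # keep each stratum's share
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 first-years and 40 seniors
population = ([("first-year", i) for i in range(60)]
              + [("senior", i) for i in range(40)])
sample = stratified_sample(population, stratum_of=lambda p: p[0],
                           fraction=0.1, seed=1)
print(sample)
```

The sample of 10 contains 6 first-years and 4 seniors, mirroring the 60/40 composition of the population.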
If I know something may be a confounding factor, and I create pairs of participants based on similar levels of this factor to eliminate its effect, this is called _____?
matched-subjects design
counterbalancing
This is an experimental technique in which we make sure both the experimental and control group will experience both levels of the independent variable, just at different times.
Mary designed an experiment in which participants were not randomly assigned, so the control and experimental groups were not equivalent. What kind of group design is this?
nonequivalent group design
If the results of my experiment are applicable to the entire population, my experiment is said to have __________ .
external validity
If I make inferences from a data set that go beyond the actual data points, this would be _________.
inferential statistics
An _______ is an extremely large or extremely small number that affects the measure of central tendency such that it is no longer accurately representative of the sample.
outlier
What are the properties of a normal distribution?
A normal distribution is represented by a normal curve. The scores will exist such that about 68% of the scores are within 1 standard deviation of the mean and about 95% of the scores will fall within 2 standard deviations of the mean.
T-score
Similar to a z-score, a T-score sets up a curve such that the mean is always 50 and each standard deviation is 10. You simply convert each number to the T-score value for easy comparison and analysis.
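Since a T-score has a mean of 50 and a standard deviation of 10, converting from a z score is a one-line rescaling. A minimal sketch (the function name is made up):

```python
def t_score(z):
    """Rescale a z score so the mean is 50 and each SD is 10."""
    return 50 + 10 * z

for z in (-2, 0, 1.5):
    print(f"z = {z:+} -> T = {t_score(z)}")
```

A score exactly at the mean (z = 0) becomes T = 50, and each standard deviation above or below the mean adds or subtracts 10 points.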
What is the difference between a positive correlation and a negative correlation?
A positive correlation is one in which if one value increases, the other value will increase. A negative correlation is one in which if one value decreases, the other value increases.
What does a scatterplot look like?
The ________ is the line one draws on the scatterplot to best represent the relationship between the two values.
line of best fit
factor analysis
Factor analysis uses multiple sets of correlations to see which variable correlations cluster together to create a factor or group of variables which are presumed to be measuring the same value, based on their high rates of correlation.
Describe the difference between the null hypothesis and the research hypothesis.
The null hypothesis states that there is no relationship between the two values tested. The research hypothesis states that there is a statistically significant relationship between the two values in our experiment.
The _____ is the level of certainty we wish to have that there is an actual relationship between the two values in an experiment.
alpha level
This is usually set at a 1 in 20 chance or an alpha level of 0.05.
Sandy rejected the null hypothesis and believed there was a relationship between phone numbers and math ability, when in reality, it was proved that there was not a relationship. What kind of statistical error did Sandy commit?
type I error
Bobby failed to reject the null hypothesis and concluded there was no relationship between IQ and a healthy diet, even though a relationship actually existed. What kind of error did he commit?
type II error
The probability of making a type II error is measured by the ________ .
beta level
Which statistical test should I use if I am trying to compare three different groups or more?
analysis of variance (ANOVA)
If I only have two groups to compare, which statistical test should I use?
t-test
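As a rough sketch of what a two-group t-test computes, the code below calculates Student's t statistic for two independent groups using the pooled variance. The data and function name are made up, and the pooled (equal-variance) version is an assumption for the example. Turning t into a p value requires the t distribution, which statistics packages handle (for example, SciPy's scipy.stats.ttest_ind).

```python
import statistics

def independent_t(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)   # sample variance (divides by n - 1)
    var_b = statistics.variance(group_b)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = (pooled * (1 / n_a + 1 / n_b)) ** 0.5
    t = (statistics.fmean(group_a) - statistics.fmean(group_b)) / standard_error
    return t, n_a + n_b - 2

group_a = [2, 4, 6, 8, 10]   # made-up scores
group_b = [1, 3, 5, 7, 9]
t, df = independent_t(group_a, group_b)
print(f"t({df}) = {t:.2f}")
```

The t statistic is the difference between the group means expressed in standard-error units, so a t near zero (as here) suggests the groups barely differ relative to their spread.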
Chi-square tests are used for data that is _______ rather than numerical.
categorical
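For categorical data, the chi-square statistic compares the counts observed in each category with the counts expected under the null hypothesis. A minimal sketch with made-up counts (a goodness-of-fit example; the function name is hypothetical):

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up counts: 60 people choosing among three study methods, with the
# null hypothesis that all three methods are equally popular.
observed = [30, 20, 10]
expected = [20, 20, 20]
chi2 = chi_square_statistic(observed, expected)

# With 3 categories there are 2 degrees of freedom; the critical value
# at alpha = .05 is about 5.99.
print("chi-square =", chi2, "| exceeds critical value:", chi2 > 5.99)
```

The larger the gap between observed and expected counts, the larger the statistic and the stronger the evidence against the null hypothesis.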
What is the most common way to perform a meta-analysis?
gather as many sources about the topic as possible, examine for multiple themes, publish the results of the meta-analysis for the larger community
norm-referenced testing
A test in which one's score is compared to that of all of the other test-takers, such as "Brian's score is in the 66th percentile."
__________, rather than norm-referenced testing, determines how much information the test-taker knows about a certain subject, such as a history final.
Domain-referenced testing
What are three things a test must have to be reliable?

dependability

consistency

repeatability
Split-half reliability, alternate-form, and test-retest method are three ways of establishing ________.
a test's reliability
validity
how much a test measures what it claims to measure
What would be the best way to test content validity?
Examining the actual content of the test to make sure that it accurately and completely meets all of the facets of the construct that are being tested.
What does the face validity of the test show?
That the questions on the test appear, on their face, to be about the subject of the test; this is the least objective form of validity.
What would be one way to determine the criterion validity of the SAT?
determine whether high scores on the SAT predict high GPAs in college
construct validity
how well the test addresses what you were trying to measure
Name two kinds of construct validity.

convergent validity

divergent validity
What is the difference between aptitude and achievement tests?
Someone's score on an aptitude test predicts future ability with training and growth, whereas someone's score on an achievement test shows how much s/he knows right now.
What would a personality inventory be likely to contain?

statements about personality

questions that assess likes and dislikes

self-selected ideals
The ________ is an intelligence test specially designed for children.
Wechsler Intelligence Scale for Children (WISC)
What are some special features of the Minnesota Multiphasic Personality Inventory?
It has 10 clinical scales, plus validity scales that detect carelessness, faking, and distortion.
empirical criterion-keying approach
This is a process for creating test questions in which the developers choose from thousands of test questions placed in groups to differentiate between sick and healthy people with a variety of scores.
Which test is the California Personality Inventory the most like and why?
The CPI is most like the MMPI, but is especially intended for test takers ages 13 to young adult.
What is a projective test?
a test with ambiguous stimuli that has a subjective scoring system because there are limitless responses that the patient can give to the presented stimuli.
Projective tests are highly controversial. Critics point out research demonstrating projective tests' lack of reliability and validity. Yet projective tests remain in use in clinical settings and in legal and clinical decision making.
The Rorschach Ink Blot Test is a widely used projective test. Why is using the Rorschach Ink Blot Test a problematic practice?
Projective tests are highly controversial. Unfortunately, projective tests, such as the Rorschach, have been and continue to be used in making legal determinations (e.g., custody) despite evidence that such tests lack validity for assessing mental health (e.g., the Rorschach overpathologizes, frequently mistakenly identifying people as having mental illness when they do not).
For an in-depth discussion of the problems with using the Rorschach Ink Blot Test to assess mental health, please read: http://www.csicop.org/si/show/rorschach_inkblot_test_fortune_tellers_and_cold_reading/
To view the ink blot images, please see: https://en.wikipedia.org/wiki/Rorschach_test
The ________ is a projective test in which the patient is given a series of pictures of scenes involving different people and is instructed to tell a spontaneous story about each scene.
Thematic Apperception Test (TAT)
The TAT was developed at Harvard in the 1930s by Murray and Morgan. Murray and Morgan used ambiguous images selected from magazines. Participants construct stories based on individually presented images. The test was developed to assess personality.
In addition to personality, the TAT has been (and continues to be) used to assess personal growth and mental health. However, the TAT, like other projective tests, lacks both reliability and validity. Including the TAT in a test battery can, in some circumstances, introduce enough error that it reduces the battery's overall reliability and validity.
Which projective test was especially designed for children?
Blacky pictures
Rotter Incomplete Sentences Blank
forty sentence stems that the test-taker fills out with whatever comes to mind
What are some advantages of using projective tests?
What are some disadvantages of using projective tests?
•Advantages
–Good for breaking the ice
–Some skilled clinicians may be able to use them to get information not captured in other types of tests. (maybe)
•Disadvantages
–Validity evidence is scarce; psychologists cannot be sure about what responses mean.
–Expensive and time-consuming
–Other less expensive tests work as well or better.
What is the theme of the Strong-Campbell Interest Inventory?
It is a career placement test based around the test-taker's interests.
What were Holland's six types of interests and occupational themes?

realistic

investigative

artistic

social

enterprising

conventional
What did Arthur Jensen propose?
That racial differences in IQ are genetically related.
Important critique: Jensen did not adequately address other factors, including the lack of culture-fair tests, epigenetic effects, and the impact of socioeconomic status (SES) on educational opportunities and achievement. In addition, critics of Jensen's perspective note that he ignored research that was inconsistent with his hypotheses and misunderstood the nuances of heritability, which led him to deeply flawed conclusions.
What are four factors that can undermine data quality?
•Low precision of measurement
•The state of the participant
•The state of the experimenter
•Variation in the environment
What is an a priori hypothesis?
An a priori hypothesis occurs if one has a predicted hypothesis about a relationship (and the direction of relationship) between variables prior to collecting data.
Findings based on an a priori hypothesis are considered stronger/more persuasive than findings based on a post hoc (after the fact) analysis. This is because a finding based on an a priori hypothesis is less likely to be the result of chance!
What are some strategies to help improve the quality of data you collect?
•Be careful!
•Use a standardized procedure or protocol
•Measure something that is important and engages participants
•When using multiple measures, be aware of order effects (Does doing A before asking B influence the answers for B?)
•Note anything unusual about the data collection. For instance, if a fire alarm goes off during data collection, or if the participant reports being in an unusual mood or unwell, make a note of it. Similarly, if you were collecting data on mood states the day after 9/11/2001, your data would likely have been impacted by participants' reactions to current events.
Name three things that can introduce error into our research:
Culture, Biases, and Situation strongly influence our Observations, Responses, and Behaviors!!
Here is a helpful way of thinking about this issue: “…the assumptions you end up making as you try to bridge the imaginative gap are, of course, your own, and the most misleading assumptions are the ones you don't even know you're making.”
Douglas Adams & Mark Carwardine, "On Meeting a Gorilla." from Last Chance to See (writing about when they went to see gorillas in the wild)
Try, in as much as you are able, to be aware of the effects of these on you!
What is the primary aim of statistics?
To rule out randomness or chance as an explanation.
Human brains have evolved to detect patterns. A byproduct of being very good at pattern detection is that human beings are prone to sometimes perceive patterns, even when there are no patterns.
What is measurement error?
Measurement error is a threat to research validity; it is the cumulative effect of extraneous variables.
Measurement error often is referred to as noise in the data.
Measurement error also is referred to as error variance.
What are four different types of data frequently used in psychological research?
Self-Report: the participant's perceptions of himself or herself (e.g., data collected from surveys or interviews)
Life Outcomes: real-life verifiable facts (e.g., criminal record/history of incarceration)
Behavioral Observations: observing a person's behavior (e.g., how a participant performs on a task, such as a Stroop test or an IQ test)
Informant: asking someone who knows the person to share their perceptions (e.g., asking a parent to describe his or her child's strengths and interests)
Shows vs No Shows (and others who refuse to participate)
In voluntary research, typically some potential participants refuse to participate. Other potential participants agree to participate then do not do so (no-shows).
Why is this a problem for voluntary research?
No-shows do not provide data, so they are not represented in the data and subsequent findings.
As a group, nonparticipants/no-shows probably meaningfully differ from participants. There may be relevant, important personality or demographic differences between these groups.
Thus, no-shows are a threat to study validity and the generalizability of findings.
(This is not an issue in animal research; lab mice do not have the option of deciding not to participate!)
What are “WEIRD” countries; why is this an issue?
Western, Educated, Industrialized, Rich, and Democratic.
Most psychological research is conducted in WEIRD countries (such as the U.S., Canada, and the U.K.), so findings from such research may or may not generalize to other, non-WEIRD populations.
What is the law of large numbers?
Barring systematic sampling bias, the larger the sample size, the more closely sample statistics (such as the mean) approximate the population values, and so the more reliable the findings!
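The law of large numbers is easy to see in a simulation. This sketch flips a simulated fair coin (true proportion of heads = 0.5) in samples of increasing size; the sample sizes, seed, and function name are arbitrary choices for the example.

```python
import random

def proportion_of_heads(n, rng):
    """Proportion of heads in n simulated fair coin flips."""
    return sum(rng.random() < 0.5 for _ in range(n)) / n

rng = random.Random(2024)   # fixed seed so the demo is repeatable
results = {n: proportion_of_heads(n, rng) for n in (10, 1_000, 100_000)}
for n, proportion in results.items():
    print(f"n = {n:>7}: proportion of heads = {proportion:.4f}")
```

Small samples can land far from 0.5 by chance, while the largest sample should sit very close to the true value.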
What is a Type I error?
What is a Type II error?
Type I error: saying that your results are significant when they are not. ("It works!", when, alas, in actuality the new treatment does not help). False positive
Type II error: saying your results are not significant when they actually are. ("It's worthless!", when, alas, in actuality the new treatment could really help). False negative
Psychological research tends to focus on working to avoid making Type I errors, although both are harmful!
What is a response set or response bias?
Why are response sets a problem for researchers?
A response set is the tendency for a participant to have a pattern in how she or he responds to questionnaire items or interview questions, and this pattern or tendency occurs independently of the content of the items. Response sets are a problem because they introduce systematic bias/error into the data set.
What are examples? Some participants tend to say yes to researchers conducting an interview (an acquiescence bias), even when the answer is unknown, ambiguous, or even no. Other participants tend to give extreme answers. In some instances, cultural differences can lead to response sets.
What is an effect size?
An effect size is a measure of strength; it conveys the strength of the relationship/finding.
Effect sizes can be small, moderate, or large.
One commonly used and good measure of effect size is Cohen's d.
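Cohen's d expresses the difference between two group means in standard-deviation units. A minimal sketch with made-up scores, assuming the pooled-standard-deviation version of the formula (the function name is hypothetical):

```python
import statistics

def cohens_d(group_a, group_b):
    """Difference between group means in pooled-standard-deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = ((n_a - 1) * statistics.variance(group_a)
                       + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    mean_difference = statistics.fmean(group_a) - statistics.fmean(group_b)
    return mean_difference / pooled_variance ** 0.5

treatment = [12, 14, 16, 18, 20]   # made-up scores
control = [10, 12, 14, 16, 18]
d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

By Cohen's conventional rule of thumb, a d of about 0.2 is small, 0.5 is moderate, and 0.8 is large, so the made-up groups here differ by a moderate amount.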
What does it mean to have multiple outcome measures?
Why is it important, when possible, to design studies so that they have multiple outcome measures?
It means having more than one way to measure a dependent variable!
As long as all of the measures are valid, using multiple measures substantially improves your ability to detect effects/differences.
If you want to test an intervention to treat postpartum depression, then you could use multiple measures, such as the BDI, a rating from a family member, and a structured clinical interview. If there is any problem collecting or interpreting a measure, having multiple outcome measures reduces the problem's impact. E.g., what if you used only the rating from family members, and it turned out that not all of the participants have a relative close enough to them to provide a valid rating?
What is a p value?
What is an effect size?
Whereas a p value conveys how likely results this extreme would be if only chance were at work (and so whether the finding is likely to be real), an effect size conveys how big or strong the difference between the groups is.
What are some arguments against using deception in psychological experiments?
–Informed consent for deception is not possible.
–When does the deception stop?
–Harms the credibility of psychology
Why use deception in some psychological research?
What safeguards are there for participants?
Sometimes researchers use deception while collecting data. Usually deception is reserved for when being straightforward could meaningfully bias/change the data.
Use of deception must be preapproved by the IRB. The potential harm must not outweigh the anticipated benefits, and participants must be debriefed afterwards.
What is a standard deviation?
The standard deviation is a measure of how closely the data in a sample or population cluster around the mean.
The standard deviation is equal to the square root of the variance.
For a more in-depth explanation of standard deviations, see:
https://www.khanacademy.org/math/probability/descriptivestatistics/variancestddeviation/v/statisticsstandarddeviation