Chapter 1 Flashcards
Deductive
Have a theory of the world and how the world works. Theory First
Inductive
Observe the world first. Data First.
Cause
The producer of an effect, result, or consequence. Often called INUS conditions [an insufficient but non-redundant part of an unnecessary but sufficient condition]
We cannot directly observe causal effects. Causation can only be inferred never known.
Causation requires correlation and counterfactual dependence.
Effect
The consequence of a cause.
Best understood through a counterfactual model. A counterfactual is something contrary to a fact. In an experiment we observe what did happen when people receive the treatment. A counterfactual is what would have happened if they did not receive the treatment. The effect is the difference between the two.
Experiment
Is a test under controlled conditions, randomly assigned, made to demonstrate a known truth, examine the validity of a hypothesis or determine the efficacy of something - to discover the effects of presumed causes. Experiments require that we have manipulatable variables.
Manipulate levels of an IV (treatments) to observe its effects.
Randomized Experiment: Assign cases to levels of the treatment by some random process such as using a random number table or generator.
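One such random process can be sketched in Python; the function name, participant labels, and condition labels below are illustrative, not part of any particular study.

```python
import random

def randomize(units, conditions, seed=None):
    """Randomly assign each unit to a condition by shuffling a
    balanced list of condition labels (equal group sizes when
    len(units) is a multiple of len(conditions))."""
    rng = random.Random(seed)
    labels = [conditions[i % len(conditions)] for i in range(len(units))]
    rng.shuffle(labels)
    return dict(zip(units, labels))

# Hypothetical example: four participants, two conditions.
assignment = randomize(["p1", "p2", "p3", "p4"], ["treatment", "control"], seed=42)
```

Because assignment depends only on the random shuffle, no characteristic of the units can systematically influence which condition they end up in.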
Control Variables
Control variables are independent variables that are not part of the research study but whose influence cannot be ignored. Hence, in SPSS, control variables constitute the first block of a hierarchical regression, followed by the regular independent variables in the other block.
Moderator Variables
A moderator variable is one that influences the strength of a relationship between two other variables
Moderator variables are those variables which act like a catalyst in a regression relationship. They interact with the independent variables either to bring down or enhance the relationship between the independent and dependent variables.
One which determines the conditions under which a described causal relationship holds (increasing the frequency of broadcast of car commercials in which the dealer himself appears increases his sales among low-income prospective buyers but not among high-income prospective buyers). Effect of the IV depends on the value of the ModV
Mediator Variables
A mediator variable is one that explains the relationship between the two other variables.
A link in the causal chain between IV and DV. For example, educational attainment might be called a mediator variable between gender and job category. A variable is a strong mediator if it has a strong association with both IV and DV, but the relationship between the IV and the DV reduces to zero when the MedV is entered into the relationship model
Causal Description
Flicking the light switch on and illuminating the room.
Causal description is identifying that a causal relationship exists between A and B
Molar Causation
Molar causal conditions are characterized in terms of large and often complex objects.
Enables descriptive causation
Thus, for example, a researcher who conceptualizes causal conditions at a relatively molar level might propose that wives married to husbands who are high in negative affectivity (a personality trait) will become increasingly dissatisfied with their marriage. This researcher might argue that husbands who are high in negative affectivity are more likely (than other men) to be irritable and critical, and less likely to be affectionate, and that these behavioral tendencies might, over time, create conflict, ultimately decreasing their wives’ satisfaction.
Causal Explanation
Why did the light go on?
Causal explanation is explaining how A causes B
Molecular Causation
Molecular causation is knowing which parts of a treatment are responsible for which parts of an effect.
Enables explanatory causation
Randomized Experiments
- Units are assigned to conditions randomly
- Randomly assigned units are probabilistically equivalent in expectation (if certain conditions are met)
- Under the appropriate conditions, randomized experiments provide unbiased estimates of an effect
Quasi-Experiment
- Shares all features of randomized experiments except assignment
- Assignment to conditions occurs by self-selection
- Greater emphasis on enumerating and ruling out alternative explanations
- Through logic and reasoning, design, and measurement
- HAS a comparison or control group.
Involve comparisons between naturally occurring treatment groups (by self-selection or administrative selection).
Researcher does not control group assignment or treatment, but has control over when/what to observe (DV).
Example might be people who work regular daytime hours vs. the night shift;
Researcher must rely on statistical controls to rule out extraneous variables such as other ways in which the treatment groups differ than the IV of interest.
Search for counterexamples and competing explanations is inherently falsificationist, as is searching for moderators and limiting conditions
Natural Experiment
- Naturally-occurring contrast between a treatment and comparison condition
- Typically concern non-manipulable causes
- Requires constructing a counterfactual rather than manipulating one
Might typically involve before and after designs where you look at a DV of interest before and after some phenomenon that has occurred, for example, tying Presidential approval ratings to revelations about bailed-out bank excesses
Non Experimental Designs
- Often called correlational, descriptive or passive designs (i.e., cross-sectional)
- Statistical controls often used in place of structural design elements
- Generally do not support strong causal inferences
- Does not have a comparison or control group.
Non-experimental designs (correlational studies) are basically cross-sectional studies, correlational in nature, in which the researcher attempts to establish causal influence through measurement and statistical control of competing explanations.
UTOS
Units of analysis [people, schools, molecules…]
Treatments [the variable you are interested in - e.g., gender, learning in higher ed]
Observations [outcomes, dependent and IV variables] made on units
Settings [context] in which the study is conducted
Causal Inference
[1. Well specified theory, causal framework or model for relations among constructs/concepts/variables, 2. Close mapping between constructs as theorized and operationalized, 3. Temporal order, precedence, 4. Demonstrated directional, interactive relations, 5. Ruling out confounding, selection.]
Prediction
Covariation between indicators of relevant constructs. Change in one indicator is associated with change in another; observed change in one reliably precedes change in the other [non-experimental - correlation]; manipulated change in one reliably leads to change in the other [experimental, quasi-experimental, natural].
Confounding Variables
Can relate to selection - e.g., males versus females may gravitate to different activities. Selection as a threat to validity. Location is another example - e.g., students in a university.
Ecological Fallacy
The drawing of inferences about individuals directly from evidence gathered about groups, societies, or nations.
Occurs when you make conclusions about individuals based only on analyses of group data. For instance, assume that you measured the math scores of a particular classroom and found that they had the highest average score in the district. Later (probably at the mall) you run into one of the kids from that class and you think to yourself “she must be a math whiz.” Aha! Fallacy! Just because she comes from the class with the highest average doesn’t mean that she is automatically a high-scorer in math. She could be the lowest math scorer in a class that otherwise consists of math geniuses!
Individualistic Fallacy
The drawing of inferences about groups, societies, or nations directly from evidence gathered about individuals.
It occurs when you reach a group conclusion on the basis of exceptional cases. This is the kind of fallacious reasoning that is at the core of a lot of sexism and racism. The stereotype is of the guy who sees a woman make a driving error and concludes that “women are terrible drivers.” Wrong! Fallacy!
Dependent Variables
The dependent variable is what is affected by the independent variable – your effects or outcomes.
[Observed, outcome, criterion]
Independent Variable
The independent variable is what you (or nature) manipulate - a treatment, program, or cause.
For example, if you are studying the effects of a new educational program on student achievement, the program is the independent variable and your measures of achievement are the dependent ones.
[Causal, True, Latent]
Y [dependent variable - scores] = f(X) [independent variable - treatment]
X explains or predicts Y
Control Variables
Control variables are introduced to reduce the risk of wrongly attributing explanatory power to the independent variable(s) the researcher has selected. Are the relations between the independent and dependent variables spurious? A control variable is the variable used to test the possibility that the relation between the dependent and independent variables is spurious - in other words, that it can be explained by the effect of another variable.
Internal Validity
A variable threatens internal validity if it threatens interpretation of results.
Did in fact the experimental stimulus make some significant difference in this specific instance? Did the independent and dependent variables covary in a causal relationship?
“Could there be an alternative cause, or causes, that explain my observations and results?”
Internal validity only shows that you have evidence to suggest that a program or study had some effect on the observations and results.
Internal validity is possibly the single most important reason for conducting a strong and thorough literature review.
“Is there a causal relationship between variable X and variable Y, regardless of what X and Y are theoretically supposed to represent?”
If a variable is a true independent variable and the statistical conclusion is valid, then internal validity is largely assured.
The concern of internal validity is causal in that we are asking what is responsible for the change in the dependent variable.
Researchers must show that the IV caused the change in behavior and not something else. Results due to the independent variable and not other variables
In an experiment, the researcher tries to eliminate the effects (or control for) extraneous variables - other variables in the study.
If there are extraneous variables, you cannot tell if those variables or the IV (or both) influenced behavior.
Extraneous/Confounding variables that may have influenced results are threats to internal validity
Construct Validity
Construct validity defines how well a test or experiment measures up to its claims. It refers to whether the operational definition of a variable actually reflects the true theoretical meaning of the concept.
The simple way of thinking about it is as a test of generalization, like external validity, but it assesses whether the variable that you are testing for is addressed by the experiment.
For example, you might study whether an educational program increases artistic ability amongst pre-school children. Construct validity is a measure of whether your research actually measures artistic ability, a slightly abstract label.
Construct validity is an assessment of how well you translated your ideas or theories into actual programs or measures.
A test designed to measure depression must measure only that particular construct, not closely related constructs such as anxiety or stress.
Construct validity determines whether the program measured the intended attribute.
“Given there is a valid causal relationship, is the interpretation of the constructs involved in that relationship correct?”
Suppose I am doing a study on the impact of font size and face on usability of Web pages by the elderly. If I conduct a study in which I vary Web page default font size (10 pt, 12 pt, 14 pt, 16 pt) and face (serif, sans-serif) and then measure the time of first page pull to 1 minute after last page pull by a group of people in an assisted living facility, I have two sorts of generalizability concerns.
One, called construct validity, is how do I get from the particular units, treatments, and observations of my study to the more general constructs they represent. That is, is this study useful in answering the question I really want to get at, which is, if we make adaptations to Web pages that take into account the physical limitations associated with aging, will people spend more time on a Web site? Do these specific operationalizations tap the actual constructs (page design, time spent on the site) whose causal relationship we are seeking to understand?
External Validity
A variable threatens external validity if it threatens generalizability of results
“To what populations, settings, and variables can this effect be generalized?”
External validity is related to generalizing.
Do the results of the experiment apply to other people [Population Validity] or to other situations [Ecological Validity]
Threats to external validity are any characteristics of the study that limits the generality of the results.
Statistical Conclusion Validity
Was the use of statistics appropriate for inferring whether the presumed independent and dependent variables covary?
Was the original statistical inference correct?
Not concerned with the causal relationship between variables, but with whether there is any relationship at all, causal or not.
Did the investigators arrive at the correct conclusion regarding whether a relationship between the variables exists, or about the extent of the relationship?
The proper use of statistics to make inferences about:
The nature of the covariation between variables
The strength of that relationship.
Threats to statistical conclusion validity: improper use of statistics to make inferences about the nature of the covariation between variables (e.g., making a type I or type II error) and the strength of that relationship (mistakenly estimating the magnitude of covariation or the degree of confidence we can have in it)
It is recommended that statistical hypothesis test reporting be supplemented with reporting of effect sizes (r2 or partial eta2), power, and confidence intervals around the effect sizes.
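As a sketch of one such effect size, eta2 for a one-way design is the ratio of between-group to total sum of squares; the group scores below are made up purely for illustration.

```python
def eta_squared(groups):
    """Proportion of total variance in the outcome explained by
    group membership: SS_between / SS_total."""
    all_scores = [s for g in groups for s in g]
    grand = sum(all_scores) / len(all_scores)
    ss_total = sum((s - grand) ** 2 for s in all_scores)
    ss_between = sum(
        len(g) * ((sum(g) / len(g)) - grand) ** 2 for g in groups
    )
    return ss_between / ss_total

# Two hypothetical treatment groups of three scores each.
e2 = eta_squared([[4, 5, 6], [7, 8, 9]])
```

A value near 1 means group membership accounts for most of the outcome variance; a statistically significant test can still come with a tiny eta2 in a large sample.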
Threats to Statistical Conclusion Validity: Low Statistical Power
An insufficiently powered experiment may incorrectly conclude that the relationship between treatment and outcome is not significant.
In particular, a small sample size may have insufficient power to detect a real effect even if it is there. As a result, the researcher claims the manipulation had no effect when in fact it does; he just couldn’t pick it up. As well, different statistical tests have varying sensitivity to detect differences.
Power analysis has the purposes of deciding how large a sample size you need to get reliable results, and how much power you have to detect a significant covariation among variables if it in fact exists.
Beyond a certain sample size the law of diminishing returns applies and in fact if a sample is large it can “detect” an effect that is of little real-world significance (i.e., you will obtain statistical significance but the amount of variation in DV explained by IV will be very small).
Example of a low-power problem: failing to reject the null hypothesis when it is false because your sample size is too small. Suppose there is in fact a significant increase in side effects associated with higher doses of a drug, but you did not detect it in your sample of size 40 because your power was too low; doctors will then go ahead and prescribe the higher dose without warning their patients that they could experience an increase in side effects. You could deal with this problem by increasing the sample size and/or setting your alpha level higher than .05, for example .10 or .20.
The power to detect an effect is a complicated product of several interacting factors such as measurement error, size of the predicted effect, sample size, and Type 1 error rate.
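The interplay of effect size, sample size, and alpha can be made concrete with a rough power calculation. This is a simplified sketch for a one-sided two-sample z-test (known variance), not a full power analysis; the effect size and sample sizes are illustrative.

```python
from statistics import NormalDist

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a one-sided two-sample z-test for a
    standardized effect size d with n_per_group subjects per group."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)            # critical value
    # Expected value of the test statistic under the alternative.
    ncp = d * (n_per_group / 2) ** 0.5
    return 1 - z.cdf(z_alpha - ncp)

# Same smallish effect (d = 0.3), two very different sample sizes.
small = power_two_sample(d=0.3, n_per_group=20)
large = power_two_sample(d=0.3, n_per_group=200)
```

With 20 per group the study would miss this effect most of the time; with 200 per group it would detect it almost always, which is exactly the trade-off the card describes.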
Threats to Statistical Conclusion Validity: Violated Assumptions of Statistical Tests
Violations of statistical test assumptions can lead to either overestimating or underestimating the size and significance of an effect.
Failing to meet the assumptions of the test statistic - for example, that observations within a sample are independent in a t-test - might result in getting significant differences between two samples where the real difference is more attributable to other factors the subjects had in common, such as being from the same neighborhood or SES, rather than the treatment they were exposed to. Other assumptions that can be violated include equality of population variances, interval-level data, normality of the populations with respect to the variable of interest, etc.
Threats to Statistical Conclusion Validity: Fishing and the Error Rate Problem
Repeated tests for significant relationships, if uncorrected for the number of tests, can artificially inflate statistical significance.
Type I Error rate when there are multiple statistical tests. What starts out as .05 with one test becomes a very large probability of rejecting the null hypothesis when it is in fact true with repeated consultations of the table of the underlying distribution (normal table, t, etc.). It’s not the done thing to correlate 20 variables with each other (or to do multiple post-hoc comparisons after an ANOVA) and see what turns up significant, then go back and write your paper about that “relationship”
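The inflation of the Type I error rate across repeated tests follows directly from the probability of at least one false rejection among independent tests; a minimal sketch, with the Bonferroni correction as one standard fix.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one Type I error across n_tests
    independent tests, each run at level alpha, with no correction."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha needed to hold the familywise rate at alpha."""
    return alpha / n_tests

# Correlating 20 variable pairs at .05 each, uncorrected:
fwer_20 = familywise_error(0.05, 20)      # well over one-in-two
per_test = bonferroni_alpha(0.05, 20)     # .0025 per test instead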
Threats to Statistical Conclusion Validity: Unreliability of Measurement
Unreliability of Measures: Measurement error weakens the relationship between two variables and strengthens or weakens the relationships among three or more variables.
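The weakening effect of measurement error on a two-variable relationship is captured by Spearman's classical attenuation formula; the reliabilities and true correlation below are made-up values for illustration.

```python
import math

def attenuated_r(r_true, rel_x, rel_y):
    """Expected observed correlation when X and Y are measured with
    reliabilities rel_x and rel_y (Spearman's attenuation formula):
    r_observed = r_true * sqrt(rel_x * rel_y)."""
    return r_true * math.sqrt(rel_x * rel_y)

# A true correlation of .50 measured with two .70-reliability scales:
r_obs = attenuated_r(r_true=0.50, rel_x=0.70, rel_y=0.70)
```

Even moderately noisy measures shrink a .50 true correlation to .35, making a real relationship harder to detect at a given sample size.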
Threats to Statistical Conclusion Validity: Extraneous Variance in the Experimental Setting
Some features of an experimental setting may inflate error, making detection of an effect more difficult.
Threats to Statistical Conclusion Validity: Restriction of Range
Reduced range on a variable usually weakens the relationship between it and another variable. Avoid dichotomizing continuous measures (for example, substituting “tall” and “short” for actual height), and avoid dependent variables whose distribution is highly skewed, with only a few cases at one or the other end of the scale.
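The cost of dichotomizing can be shown with a small simulation: the same outcome correlated once with actual height and once with a median-split "tall/short" indicator. All numbers here are simulated for illustration.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)
# Simulated heights and a correlated outcome.
height = [rng.gauss(170, 10) for _ in range(2000)]
outcome = [0.5 * h + rng.gauss(0, 8) for h in height]

r_full = pearson(height, outcome)
# Median-split into "tall" vs "short", discarding within-group variance.
cut = sorted(height)[len(height) // 2]
tall = [1.0 if h >= cut else 0.0 for h in height]
r_dich = pearson(tall, outcome)
```

The correlation with the dichotomized predictor comes out reliably smaller than with the continuous one, even though the underlying relationship is identical.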
Threats to Statistical Conclusion Validity: Unreliability of Treatment Implementation
If a treatment that is intended to be implemented in a standardized manner is implemented only partially for some respondents, effects may be underestimated compared with full implementation. Lack of standardized implementation of the treatment or level of the independent variable (we talked about this before in terms of things like instructions being memorized over time, experimenter effects, etc.) is a threat, unless adaptive application of the treatment is a more valid instantiation of how the treatment would occur in the real world.
Threats to Statistical Conclusion Validity: Heterogeneity of Units
Increased variability on the outcome variable within conditions increases error variance, making detection of a relationship more difficult.
Within-subjects variability: In most analyses that look at effects of treatments, you want your between-treatment variability to be large, in accordance with your research hypothesis; if there is a lot of variability among the subjects within a treatment, that may make it more difficult to detect the predicted effect. There is a trade-off between ensuring subject homogeneity within treatments, which increases power to detect the effect, and a possible loss of external validity.
Threats to Statistical Conclusion Validity: Inaccurate Effect Size Estimation
Some statistics systematically overestimate or underestimate the size of an effect.
Inaccurate effect-size estimation; recall how we talked about how the mean is affected by outliers. Sometimes there are some extreme cases or outliers that can adversely affect and perhaps inflate the estimates of effect sizes (differences on the DV attributable to the treatment or levels of IV)
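The outlier point can be demonstrated in two lines of arithmetic: one extreme case drags the mean far more than the median. The scores below are invented for illustration.

```python
from statistics import mean, median

scores = [10, 11, 9, 10, 12, 10, 11]
with_outlier = scores + [60]   # one extreme case added

# The mean shifts sharply; the median barely moves.
m1, m2 = mean(scores), mean(with_outlier)
md1, md2 = median(scores), median(with_outlier)
```

If a treatment-group mean is inflated this way, the estimated effect size (the mean difference attributable to the treatment) is inflated with it, which is why screening for outliers before estimating effects matters.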
Threats to Internal Validity: Extraneous Effects & History
Are participants exposed to events, other than the treatments, whose effects on their behavior could obscure the effects of the independent variable?
Events that happen to participants during the research which affect results but are not linked to the IV. In an extended study comparing relaxation to no relaxation on headache occurrence, those in the no relaxation condition sought out other means of reducing their headache occurrence (e.g. took more pills).
Any events which intervene between the treatment and the outcome measure. Example; subjects are presented with anti-smoking messages but are allowed a break before completing the post-test and various events happen during their break such as seeing smokers who are/are not attractive role models, etc. More of a problem in studies which assess effects over long periods of time
Environmental variables –
Features of the environment that may influence results.
E.g., Room condition – bright, cheery vs. dark, small.
Threats to Internal Validity: Statistical regression effects (regression to the mean)
Regression toward the mean: the tendency of extreme (very high or very low) scores to fall closer to the mean on re-testing. Could changes in participants’ responses to the measures be caused by this? Regression to the mean occurs because measures are not perfectly correlated with each other. For example: In any given sample the tallest person is not always the heaviest, nor is the lightest person always the shortest.
Regression towards the mean
A group of people who are moderately depressed start a new type of therapy and, when tested later, report fewer depressive symptoms - this could be regression towards the mean.
Likely to be a problem in quasi-experiments when members of the group were selected (self- or administratively-) based on having high or low scores on the DV of interest. Testing on a subsequent occasion may exhibit “regression to the mean” where the once-high scorers score lower, or the once-low scorers score higher, and a treatment effect might appear when there really isn’t one. Having a really high score on something (like weight, cholesterol, blood sugar) etc might be sufficient to motivate a person to self-select into a treatment but the score might fall back to a lower level just naturally or through simply deciding to “get help,” although it could be attributed to the effects of the treatment.
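Because the two test occasions are imperfectly correlated, selection on extreme scores guarantees some fallback on retest even with no treatment at all; a small simulation makes this visible (all scores are simulated, and the cutoff of 65 is arbitrary).

```python
import random

rng = random.Random(1)
# A stable true score plus independent measurement error on each occasion.
true_score = [rng.gauss(50, 10) for _ in range(5000)]
test1 = [t + rng.gauss(0, 8) for t in true_score]
test2 = [t + rng.gauss(0, 8) for t in true_score]

# Select the group that scored extremely high on the first test
# (e.g., the high-depression-score self-selectors).
extreme = [i for i, s in enumerate(test1) if s > 65]
mean1 = sum(test1[i] for i in extreme) / len(extreme)
mean2 = sum(test2[i] for i in extreme) / len(extreme)
# mean2 sits noticeably closer to the population mean of 50,
# despite no treatment having been applied between the tests.
```

A naive before-after comparison in this selected group would show an apparent "improvement" that is pure regression to the mean.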
Threats to Internal Validity: Attrition
Do participants drop out of the groups during the study in a systematic or selective way? This could create differences among groups that would obscure the effects of the independent variable.
More of one type of person may drop out of one of the groups. For example, those less committed, less achievement-oriented, less intelligent.
Selective dropping out of a particular condition or level of the independent variable by people who had the most extreme pre-test scores on the DV, so when they drop out it makes the post-test mean for that condition “look better” and as if that treatment had a stronger effect since its mean would be lower without the extreme people.
Threats to Internal Validity: Interaction of temporal and group composition effects
Could changes in the participants’ behavior over time that are related to pre-existing differences among groups obscure the effects of the independent variable?
(more of a problem in correlational studies than in experiments in which you expose respondents to the treatment and then measure the outcome)
Threats to Internal Validity: Group composition effects (selection)
If different groups are used to compare the effects of treatments, could pre-existing differences among the groups obscure the effects of the independent variable?
Occurs when more of one type of person gets into one group for a study. For example, the people who return your questionnaire may be different, in some important way, to the people who did not return your questionnaire. The students who volunteer for your project might be different to the ones who do not volunteer (for example, more altruistic, more achievement oriented, more intelligent). Do these variables have an effect on the thing you are trying to measure? We usually do not know.
Threats to Internal Validity: maturation; fatigue
Do the participants change with the passage of time in ways unrelated to the effects of the independent variable?
Passage of time may have affected results
Study effects of drug A on learning in rats
Test one drug at 3 months age, another when rats are 1 yr old. See different effects of drugs – could be due to age differences.
Threats to External Validity: Nonrepresentative sampling
Are the participants in the research study so unrepresentative of the people who need to be understood as to preclude generalization of the research results from the former to the latter?
Threats to External Validity: Nonrepresentative research context
Is the context in which the research study was carried out so unrepresentative of contexts where the behavior in question takes place as to preclude generalization of the research results from the former to the latter?
Validity
Validity: Am I measuring what I intend to measure?
Validity = Accuracy of Results determined by quality of research.
Content Validity
Content validity, sometimes called logical or rational validity, is the estimate of how much a measure represents every single element of a construct.
For example, an educational test with strong content validity will represent the subjects actually taught to students, rather than asking unrelated questions.
The relevance of an instrument to the characteristics of the variable it is meant to measure is assessed by face validity - the researcher’s subjective assessment of the instrument’s appropriateness - and sampling validity - the degree to which the statements, questions, or indicators constituting the instrument adequately represent the qualities measured.
Content validity addresses the match between test questions and the content or subject area they are intended to assess. This concept of match is sometimes referred to as alignment, while the content or subject area of the test may be referred to as a performance domain.