Quantitative Methods Flashcards
Alpha Coefficient
The alpha coefficient (also referred to as Cronbach’s alpha) is a measure of internal consistency reliability (how closely related a set of items are as a group). It is used in survey design. Alpha is concerned with the degree to which different items appear to be tapping a single underlying construct (as suggested by the correlation/covariance between items on a scale). Cronbach’s alpha can be written as a function of the number of test items and the average inter-correlation among the items. Therefore, to increase alpha you can 1) add more items to the measure or 2) increase the homogeneity of items on the scale.
Alpha Level
The level of significance chosen before running a statistical test that denotes the probability of rejecting the null hypothesis (no relationship between two measured phenomena) when it is actually true. This probability should be small, because we do not want to reject the null hypothesis when it is true (Type I error). Traditional values used for alpha are .05, .01, and .001. (E.g., when alpha is set at .05, 5% of all possible samples will falsely reject the null hypothesis when it is actually true.) The alpha-level is chosen according to the researcher’s willingness to take that particular risk of claiming to find a significant “effect” when there is none.
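As an illustration of what the alpha level means in practice, the following Python sketch (simulated data; the test, sample size, and trial count are all hypothetical choices for illustration) repeatedly samples from a population where the null hypothesis is true and counts how often a two-sided z-test falsely rejects at alpha = .05 — which happens about 5% of the time, as the definition predicts.

```python
import random
from statistics import NormalDist, mean

def z_test_p(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for H0: population mean == mu0 (known sigma)."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

random.seed(42)
ALPHA = 0.05
trials = 10_000
# H0 is true in every trial: samples really do come from a mean-0 population.
rejections = sum(
    z_test_p([random.gauss(0, 1) for _ in range(30)]) < ALPHA
    for _ in range(trials)
)
print(rejections / trials)  # close to 0.05: the Type I error rate
```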
Attributable Risk (Risk Difference)
It is the amount of risk of disease attributable to exposure in an epidemiological study. This measure is derived by subtracting the incidence rate of the outcome among the unexposed from the rate of the outcome among the exposed individuals (thereby measuring the difference between the disease risk in exposed and unexposed populations). Two assumptions must be made: 1) everyone can be categorized as either exposed or non-exposed; 2) a cause-effect relationship exists between the exposure and disease.
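The subtraction described above is a one-line computation; a Python sketch with hypothetical counts:

```python
def attributable_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk difference: incidence among exposed minus incidence among unexposed."""
    return cases_exposed / n_exposed - cases_unexposed / n_unexposed

# Hypothetical cohort: 30 cases among 1,000 exposed, 10 among 1,000 unexposed.
ar = attributable_risk(30, 1000, 10, 1000)
print(ar)  # ≈ 0.02, i.e., 20 excess cases per 1,000 attributable to exposure
```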
Autocorrelation
Autocorrelation (also known as serial correlation) is the correlation of a variable with itself over successive observations. For example, if we are predicting the growth of stock dividends, an overestimate in one year is likely to lead to overestimates in succeeding years. In panel studies, the error terms are autocorrelated when the current values of a variable (e.g., this wave’s political attitude) are correlated with its previous values (e.g., last wave’s political attitude). Autocorrelation can appear with time-series data (e.g., repeated measures) and with cluster sampling (e.g., surveying multiple individuals in a family/household). We hope for a situation in which, for any two observations, the error terms are uncorrelated with each other.
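A minimal Python sketch of lag-1 autocorrelation, using simulated data (the series and its 0.8 carry-over coefficient are fabricated for illustration): a series where each error carries over part of the previous one shows strong serial correlation, while independent draws show essentially none.

```python
import random
from statistics import mean

def lag1_autocorr(series):
    """Sample correlation of a series with itself at lag 1."""
    m = mean(series)
    num = sum((series[t] - m) * (series[t - 1] - m) for t in range(1, len(series)))
    den = sum((x - m) ** 2 for x in series)
    return num / den

random.seed(0)
# AR(1)-style errors: each term carries over 80% of the previous one.
e, prev = [], 0.0
for _ in range(2000):
    prev = 0.8 * prev + random.gauss(0, 1)
    e.append(prev)
print(lag1_autocorr(e))      # close to 0.8

white = [random.gauss(0, 1) for _ in range(2000)]
print(lag1_autocorr(white))  # close to 0: no serial correlation
```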
BLUE
In using survey / sample data in regression, we want estimators of the population parameters that are Best, Linear, and Unbiased (BLUE) in order to increase the accuracy of the overall regression. Best—the property (of a sample estimate) of having the smallest deviation from the population parameter. Linear—the estimator is a linear function of the sample observations. Unbiased—an estimator of the population parameter is unbiased if its mean value over all possible random samples is equal to the parameter being estimated. Estimator—an estimated value (calculated from sample data) of some population parameter of interest.
Bounded Recall
A survey method used to improve recall and reduce telescoping error (tendency of respondents to report events as happening earlier or later than they actually occurred) during the retrieval stage of response formation. Respondents are given a specific point of reference (generally the last survey response) to help jog their memories and prevent the compression of time in their responses. It is a primary device to reduce over-reporting of health events in continuous panel surveys (longitudinal studies of the same sample of people at different times). However, it does not necessarily help to reduce the other kind of recall error: omission/forgetting.
Case-control Study
A study design in epidemiology most often used when an outcome or disease is rare. Cases are selected based on their disease (or other outcome) status; controls are those without the outcome variable, but who resemble the cases in other factors (should be selected from the same source population as the cases). The point of this type of study is to look back in time to understand what the risk or exposure was that caused the cases to develop the disease in comparison to the controls (in order to generate a measure of association between disease and exposure status). Used to estimate the odds ratio (cannot determine incidence, prevalence, or (generally) the probability of disease given exposure or non-exposure).
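Since a case-control study is used to estimate the odds ratio, a small Python sketch may help (the 2x2 counts below are hypothetical):

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table:
                 cases   controls
    exposed        a        b
    unexposed      c        d
    """
    return (a * d) / (b * c)

# Hypothetical counts: 40 of 60 exposed among cases vs. 20 of 100 among controls.
print(odds_ratio(40, 60, 20, 80))  # ≈ 2.67: exposure ~2.7x the odds of disease
```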
Causal Inference
The art and science of making a causal claim about the relationship between two factors. If an association between two factors/variables is observed, then causal inference can answer the question: is one causing the other? However, association does not necessarily imply causation. David Hume (early 1700s) was the first to make a systematic statement of cause and effect. His lasting contribution was to raise doubts about the possibility of proving causation, arguing that there is no certain logic that can prove universality of causal claims. Thus, our study results will often be inconclusive, and the best we can expect to do is to reach the most reasonable explanation based on the evidence at hand.
Causality
According to Selltiz et al. (1959), three conditions must be met in order to infer the existence of a causal relationship between two variables 1) temporal sequencing (cause precedes effect), 2) empirical correlation between independent and dependent variables (cause and effect covary with one another), and 3) no alternative plausible explanations for the effect other than the cause (eliminating confounding factors).
Censored Observation
Observations that are incomplete in some way, as when certain values are unknown or ignored, or when an observation is discontinued before the event of interest is observed to occur, resulting in incomplete follow-up of the individual. For example, in a study of the effects of a cancer treatment on survival time, study participants who remain alive at the end of the study are censored observations: their full survival time is unknown, so their observed time cannot be treated as a complete survival time in the analysis of the results. This situation occurs in survival analysis, which has methods for using the partial information that censored records carry.
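One standard survival-analysis method that accommodates censored observations is the Kaplan-Meier estimator; the sketch below is a simplified pure-Python version (it follows the common convention that subjects censored at time t are still "at risk" at t, a detail real implementations refine further).

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.
    times:  follow-up time for each subject
    events: 1 if the event (e.g., death) was observed, 0 if censored
    Returns a list of (t, S(t)) pairs at each observed event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    s, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        removed = sum(1 for tt, e in data if tt == t)  # events + censorings at t
        if deaths:
            s *= (n_at_risk - deaths) / n_at_risk
            curve.append((t, s))
        n_at_risk -= removed
        i += removed
    return curve

# Three subjects: events at t=1 and t=3, one censored at t=2.
print(kaplan_meier([1, 2, 3], [1, 0, 1]))
```

The censored subject still contributes: they are counted in the risk set at t=1 before dropping out, which is exactly the partial information a complete-case analysis would throw away.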
Cohort
Types include prospective, retrospective and ambispective studies. The analytic method of epidemiologic study in which subsets of a defined population can be identified who are, have been, or in the future may be exposed or not exposed to a factor (or factors) hypothesized to influence the probability of occurrence of a given disease or other outcome. The basic feature is observation of a population (classified on the basis of exposure) for a sufficient number of person-years to generate reliable estimates of outcome incidence in the population subsets. A cohort study establishes temporal sequence between exposure and disease.
Period Effect
A period effect leads to cross-sectional changes in the observed incidence of a disease because of factors uniformly affecting all age groups and birth cohorts during a specific period of time, such as the effect from exposure to something in the air or drinking water. Often a period effect refers to an artifactual change in the reported disease rate, vs. a true change in the burden of that disease in the population. Ex.: the increased use of PSA tests artificially inflated the rates of prostate cancer (because diagnoses became more common, while actual prostate cancer did not). Such a shift in the sensitivity of techniques for detecting an outcome affects multiple age groups (those screened/observed, anyway), regardless of birth cohort.
Cohort Effect
A cohort is a group of people who share a common experience within a defined time period. Thus, a cohort effect will be concentrated in the group sharing a relationship to a particular event. Example: we might expect to observe a cohort effect of the WTC disaster on the incidence of clinical depression or alcoholism among firefighters working in the NYC area during fall of 2001.
Cohort Effect vs. Period Effect
Both terms describe the temporal effects on societal patterns of disease. Period effect refers to the effects of contemporary societal change in trends of disease due to factors uniformly affecting all age groups and birth cohorts during a specific period of time, such as the effect from exposure to something in the air or drinking water. A period effect changes a disease rate for a limited period of time around the time of its occurrence. (ex: Higher disease rates in the 1930s due to the Great Depression). Cohort effect will be concentrated in the group sharing a relationship to a particular event. For example, we might expect to observe a cohort effect of the WTC disaster on the incidence of clinical depression or alcoholism among firefighters working in the NYC area during fall of 2001.
Confidence Intervals
The range of values (bounded interval) within which a population parameter is estimated to lie. This depends on the point estimate, its variability, and the sample size. CIs can be thought of as an enhancement of the point estimate because rather than just calculating a single number that’s intended to estimate the value of a population parameter, we can also calculate a range of values within which the parameter is likely to fall. Therefore, the CI reflects our best guess of a parameter (the point estimate) as well as the precision of this guess. Ideally, you want the CI to be reasonably narrow. A 95% CI indicates that if you repeatedly drew samples from the same population and computed an interval from each, 95% of those intervals would contain the true population parameter.
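A large-sample 95% CI for a mean can be sketched in Python (the data are hypothetical, and this uses the normal approximation rather than the t distribution, which matters for small samples):

```python
from statistics import NormalDist, mean, stdev

def ci_mean(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # 1.96 for 95%
    m = mean(sample)
    se = stdev(sample) / len(sample) ** 0.5         # standard error of the mean
    return m - z * se, m + z * se

sample = [4.1, 5.0, 4.6, 4.9, 5.3, 4.4, 5.1, 4.8, 4.7, 5.2]
lo, hi = ci_mean(sample)
print(f"95% CI: ({lo:.2f}, {hi:.2f})")  # point estimate 4.81, plus precision
```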
Confounding Factors
A confounding factor is an extraneous variable that covaries with both the independent and dependent variable of interest. It is a plausible alternative explanation for the relationship between the variables being studied. Unlike a mediating variable, it lies outside of the causal pathway between IV and DV, threatening internal validity. You want to control for confounders, but not for mediators. For example, smoking, positively associated with both alcohol intake and cardiovascular disease, can confound the association between drinking and CVD. Thus, we would want to control for smoking in any equation used to model the relationship between drinking alcohol and risk for cardiovascular disease.
Moderating vs. Mediating Factors
A moderator variable is one that influences the strength of a relationship between two other variables, and a mediator variable is one that explains the relationship between the two other variables. Whereas moderator variables affect the direction and/or strength between an IV and DV, mediators speak to how or why such effects occur. As an example, let’s consider the relation between social class (SES) and frequency of breast self-exams (BSE). Age might be a moderator variable, in that the relation between SES and BSE could be stronger for older women and less strong or nonexistent for younger women. Education might be a mediator variable in that it explains why there is a relation between SES and BSE. When you remove the effect of education, the relation between SES and BSE disappears.
Content Validity
The validity of the instrument itself. In survey research, content validity relates to how accurately the scale operationalizes the latent construct(s). Poor content validity means that you are measuring something other than what you had intended to measure. Is the instrument really measuring the concept or idea indicated? Construct validity is directly concerned with the theoretical relationship of a variable (e.g. one that is measured by a scale) with other variables.
Contamination
The polluting of one’s experimental groups with outside influences – often unexpected and uncontrolled for. The term contamination often refers to a diffusion of the treatment of interest from experimental to control groups. Contamination is especially likely to occur where one treatment is more desirable than another treatment or than the control state, and the desired treatment can be obtained through sources other than the investigators or their service delivery staff.
Correlation vs. Causation
Correlation refers to the degree to which variables change (covary) together. Researchers have determined that correlation is not synonymous with causation. ‘A’ can only be said to cause ‘B’ if 1) A is prior to B; 2) change in A is correlated with change in B; and 3) this correlation is not itself the consequence of both A and B being correlated with some prior variable C.
Cost-benefit Analysis
A type of efficiency analysis in program evaluation that assesses the relationship between (direct and indirect) program costs and their (direct and indirect) benefits in monetary terms. Example: an anti-smoking campaign saved $1000 in healthcare costs for every $100 in project costs. The major issue with CBA is the difficulty (methodologically and philosophically) in placing a dollar value on non-monetary program benefits.
Cost-effectiveness Analysis
A type of efficiency analysis in program evaluation that assesses the non-monetary outcomes of an intervention in relation to a program’s input costs. It is expressed as a ratio of cost per unit of impact, or in other words, the cost of achieving a specific result. Example: an anti-smoking campaign caused one person to quit smoking for every $1,000 in project costs. This type of analysis is particularly helpful when comparing multiple programs that seek to achieve the same outcomes, or when monetizing program outcomes (for cost-benefit analyses) would be too difficult.
Counterfactual Condition
A counterfactual is something that is contrary to fact. In an experiment, we observe what did happen when people received a treatment. The counterfactual is knowledge of what would have happened to those same people if they simultaneously had not received treatment. An effect is the difference between what did happen and what would have happened. We cannot actually observe a counterfactual since it is impossible for respondents to both have and not have the causal condition simultaneously. Therefore, a central task of cause-probing research is to create reasonable approximations to the physically impossible counterfactual. (Shadish, Cook, & Campbell)
Criterion Validity
The extent to which a survey measure or scale predicts or agrees with some criterion or “gold standard” of the measure. Includes predictive validity (correspondence between the measure and a future criterion), concurrent validity (correspondence between the measure and a current criterion), and postdictive validity (correspondence between the measure and a previously established criterion). Ex.: SATs demonstrate predictive validity to the extent that SAT scores correlate positively with college GPA, while patient reports of STI exposure demonstrate concurrent validity to the extent that these reports match up with records of lab tests of their disease status.
Cronbach’s Alpha
A measure of internal consistency reliability for a scale. It is defined as the proportion of a scale’s total variance that is attributable to a common source, presumably the true score of a latent variable underlying the items. It examines the homogeneity of items through their inter-item covariances/correlations, measuring the extent to which the items are all tapping the same thing. It is based on all possible ways of splitting and comparing sets of questions used to tap into a particular construct. The widely accepted cutoff for alpha is 0.70; values of 0.80 or better represent good reliability.
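The variance-based definition above — alpha as the proportion of total scale variance attributable to a common source — can be computed directly; a Python sketch with hypothetical item scores:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of respondent scores per scale item (equal lengths).
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total
    item_var = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Hypothetical 3-item scale answered by 5 respondents.
items = [
    [3, 4, 4, 2, 5],
    [3, 5, 4, 2, 4],
    [2, 4, 5, 1, 5],
]
print(round(cronbach_alpha(items), 2))  # high alpha: items covary strongly
```

Because alpha rises with the number of items and with inter-item correlation, adding a fourth item that covaries with the others would push this value higher.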
Discriminant Validity
The degree to which the operationalization of a target construct diverges from operationalizations of other constructs to which it is conceptually similar or empirically associated. It is an aspect of construct validity. It flows from the notion that a measure of A can be discriminated from a measure of B when B is thought to be different than A. For example, if one is measuring a neighborhood’s collective efficacy, it would strengthen the study to show that the measurement is tapping something different than what a social capital measure would tap.
Double-barrel Question
A question that groups two different topics or constructs into a single question, making the question multidimensional. For example, “Do you want to be rich and famous?” or “How difficult is it for adolescents to get and use birth control?” These types of questions violate the rule that closed-ended questions should be unidimensional (i.e., they ask about only one topic at a time). Double-barreled questions produce ambiguous answers and contribute to measurement error because the researcher cannot disentangle which question the respondent actually answered.
Ecological Fallacy
A problem of inference when a relationship observed at the aggregate level is assumed to hold at the individual level. Example: Party affiliation for a state used to determine individual voting behavior. It assumes that individuals in the study group will have the average characteristics of that group. The problem lies in the false assumption that a correlation between group characteristics (e.g., religion in a geographically defined population and incidence of suicide in that population) will be reproduced at the level of the individual (e.g., individual Protestants will be more likely to commit suicide than Catholics). This is a problem that can arise when one uses inappropriate units of analysis (with respect to the units of observation). For example, you cannot safely draw conclusions about individual voters based on data regarding their precincts; to do so would be to perpetuate an ecological fallacy.
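The reversal at the heart of the ecological fallacy can be demonstrated with a toy Python example (the district data are fabricated): the district-level correlation is perfectly positive, yet the individual-level relationship within every district is perfectly negative.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

# Three districts of (x, y) individuals; within each district x and y move in
# opposite directions, but districts with higher average x also have higher average y.
districts = [
    [(1, 4), (2, 3), (3, 2)],
    [(4, 7), (5, 6), (6, 5)],
    [(7, 10), (8, 9), (9, 8)],
]
gx = [mean(x for x, _ in d) for d in districts]
gy = [mean(y for _, y in d) for d in districts]
print(pearson(gx, gy))  # 1.0 at the district (aggregate) level
for d in districts:
    print(pearson([x for x, _ in d], [y for _, y in d]))  # -1.0 within each district
```

Inferring the individual-level relationship from the aggregate correlation here would get even the sign wrong.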
Effect Size
Effect size is a name given to a family of indices that measure the magnitude of a treatment effect. This includes the standardized mean difference statistic, the odds ratio, the correlation coefficient, the rate difference, and the rate ratio. Unlike significance tests, these indices are independent of sample size. ES measures are the common currency of meta-analysis studies that summarize the findings from a specific area of research.
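One member of this family, the standardized mean difference (often reported as Cohen's d), can be sketched in Python with hypothetical group scores; note that the result is in standard-deviation units, so it does not change just because the sample grows.

```python
from statistics import mean, stdev

def cohens_d(treatment, control):
    """Standardized mean difference using a pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled = (((n1 - 1) * stdev(treatment) ** 2 + (n2 - 1) * stdev(control) ** 2)
              / (n1 + n2 - 2)) ** 0.5
    return (mean(treatment) - mean(control)) / pooled

# Hypothetical outcome scores for two small groups.
treated = [23, 25, 27, 29, 26, 24]
control = [20, 22, 21, 23, 19, 21]
print(round(cohens_d(treated, control), 2))  # difference in pooled-SD units
```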
Endogeneity
Variables influenced by other variables in a model. In path diagrams all endogenous variables have error terms associated with them because there are almost always causes of an observed score on an endogenous variable other than the exogenous causes modeled, including random measurement error, and omitted causes not included in the study. We can think of endogenous variables as dependent variables, but it is important to remember that an endogenous variable can also predict other variables in the model.
Expectation
The mean of a statistic based on repeated sampling. The expected value of a random variable, denoted as E(X). For a random variable, it is the integral of the random variable with respect to its probability measure. Intuitively, it is the long-run average: if a test could be repeated many times, expectation is the mean of all the results.
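For a discrete random variable, the expectation is simply a probability-weighted sum, and the "long-run average" intuition can be checked by simulation; a minimal Python sketch using a fair die:

```python
import random

def expectation(pmf):
    """E(X) for a discrete random variable given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

die = {x: 1 / 6 for x in range(1, 7)}  # a fair six-sided die
print(expectation(die))                # ≈ 3.5

# The long-run average of repeated rolls approaches E(X).
random.seed(0)
rolls = [random.randint(1, 6) for _ in range(100_000)]
print(sum(rolls) / len(rolls))         # also ≈ 3.5
```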
External Validity
Concerns whether inferences/conclusions hold over variations in persons, settings, treatments, and outcomes (generalizability). This is one of Cook and Campbell’s 4 “types” of validity, and it is the highest in the hierarchy. This means that it should be considered as a final objective, after statistical conclusion, internal, and construct validity have already been reasonably well established.
Formative Evaluation
Evaluation activities that assess the conduct of programs during their early stages. These activities are undertaken to furnish information that will guide program improvement, and/or to begin developing the measures and instruments that will permit ongoing evaluation of the program to be implemented. Thus, formative evaluation can shape or form the program to improve its performance, and is usually desired by program evaluators who will be called upon later to demonstrate that a program has met its goals and objectives. Formative evaluation can be focused on program development, targeting and structural issues, or can be conducted like mini-impact evaluations. It may include testing/assessing a program at certain sites, or with a small sample of targets, prior to full implementation. This often allows opportunities to pretest evaluation procedures, as well as the intervention itself. (Rossi & Freeman)
Goodness-of-fit Statistic
Describes how well a statistical model fits a set of observations. Common GOF statistics are Pearson’s statistic and the Likelihood-Ratio Statistic. The GOF statistic summarizes the discrepancy between observed values and the values expected under the model in question. It indicates the variance explained by the chosen model. In other words, it estimates how well the observed data fit the pattern predicted by the explanatory model, or how well the model predicts what it is supposed to predict.
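Pearson's statistic mentioned above is a short computation — the sum over cells of (observed − expected)² / expected; a Python sketch with hypothetical die-roll counts:

```python
def pearson_chi2(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# A die rolled 60 times; the fair-die model expects each face 10 times.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
print(pearson_chi2(observed, expected))  # ≈ 1.0: small discrepancy, good fit
```

A small value of the statistic (relative to its chi-square reference distribution) means the observed counts sit close to the model's predictions.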
Heterogeneity
Refers to variance of responses, respondents, conditions, or treatments. When there is heterogeneity of units within conditions of an outcome variable, the standard deviations on that variable, and any others correlated with it, will be greater. Heterogeneity is a threat to statistical conclusion validity because it can obscure systematic covariation between treatment and outcome.
Heteroscedasticity
In statistics, a group of random variables is heteroscedastic when there are sub-populations with different variances. One of the Ordinary Least Squares regression assumptions is that the error term has a constant variance, or is homoscedastic. Heteroscedasticity can result from things such as measurement error in the dependent variable and from interaction between included and excluded independent variables. When graphed, heteroscedasticity generally looks like a megaphone around the regression line.
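The "megaphone" pattern can be illustrated by simulation (everything here is fabricated for illustration): error terms whose standard deviation grows with x show visibly greater spread at large x than at small x, violating the constant-variance assumption.

```python
import random
from statistics import stdev

random.seed(1)
# Simulated regression errors whose spread is proportional to x.
xs = [i / 100 for i in range(1, 1001)]            # x from 0.01 to 10
errors = [random.gauss(0, 0.5 * x) for x in xs]   # error sd grows with x

low = errors[:500]    # errors at small x
high = errors[500:]   # errors at large x
print(stdev(low), stdev(high))  # spread is clearly larger at high x
```

Under homoscedasticity the two printed values would be roughly equal; here the second is more than twice the first.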
Human Subjects
A living individual about whom an investigator conducting research obtains 1) data through intervention or interaction with the individual, or 2) identifiable private information (US Federal Guidelines). Research involving human subjects must gain approval from an Institutional Review Board, which considers the basic ethical principles of respect for persons, beneficence, and justice. These principles are applied through the use of informed consent procedures (voluntary participation), assessment of risks and benefits, the participant’s ability to withdraw at any time, and participants being informed of the study design (Belmont Report).
Impact Evaluation
An evaluative study that answers questions about program outcomes and impact on the social conditions it is intended to ameliorate. Also known as impact assessment or outcome evaluation. Impact evaluation requires methods for separating the treatment effect (the effect of interest) from confounding effects. Evaluation of impact is a part of the evaluation process that attempts to attribute the desired outcomes of program to the program itself. The objective of impact evaluations is to determine the net impact of a program on the outcome intended by the program.
Indirect Effect
In path analysis and causal modeling, the relationship between X and Y is said to be indirect if X causes Z which in turn causes Y. A predictor variable has an indirect effect on the outcome if there is another variable on the causal path between the first predictor variable and the outcome of interest.
Informed Consent
The process of giving prospective study participants the information they need to decide whether or not they want to participate in a study given its risks and benefits. U.S. federal guidelines indicate that there are a number of pieces of information that a human subject must be given in order to provide informed consent including limits of confidentiality, a statement that participation is voluntary and can be stopped at any time, a description of the purpose of the research and the procedures to be followed, and a description of any foreseeable risks.
Instrumental Variables
In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible or when a treatment is not successfully delivered to every unit in a randomized experiment. An instrument is a variable that is correlated with the explanatory variable (exposure/treatment) of interest but affects the outcome only through that variable, and is not itself related to the confounders of the exposure-outcome relationship.
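A common introductory IV estimator is the Wald/ratio estimator, beta = cov(z, y) / cov(z, x). The Python sketch below uses fabricated simulated data: an unobserved confounder u biases the naive regression slope upward, while the instrument z recovers a value near the true effect of 2.

```python
import random
from statistics import mean

def cov(a, b):
    """Sample covariance (population-style normalization, fine for large n)."""
    ma, mb = mean(a), mean(b)
    return mean((ai - ma) * (bi - mb) for ai, bi in zip(a, b))

random.seed(7)
n = 20_000
z = [random.gauss(0, 1) for _ in range(n)]  # instrument: shifts x, not y directly
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [2 * xi + 3 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]  # true effect = 2

print(cov(x, y) / cov(x, x))  # naive OLS slope: biased upward by u (≈ 3)
print(cov(z, y) / cov(z, x))  # IV (Wald) estimate: close to the true 2
```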
Intention to Treat
A method of analysis for randomized trials in which all patients randomly assigned to one of the treatments are analyzed together, regardless of whether or not they completed or received that treatment. This analysis preserves the benefits of random assignment for causal inference but yields an unbiased estimate only about the effects of being assigned to treatment, not of actually receiving treatment. The inference yielded by the intent-to-treat analysis is often of great policy interest because if a treatment is implemented widely as a matter of policy, imperfect treatment implementation will occur. So the intent-to-treat analysis gives an idea of the likely effects of the treatment-as-implemented in policy.
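The contrast between analyzing by assignment (intention to treat) and by treatment actually received ("as treated") can be sketched with fabricated trial records; note how the dropout and the crossover make the as-treated difference larger than the ITT difference.

```python
from statistics import mean

# Hypothetical trial records: (assigned arm, treatment actually received, outcome).
records = [
    ("treat", "treat", 12), ("treat", "treat", 11), ("treat", "none", 7),   # dropout
    ("treat", "treat", 13), ("control", "none", 6), ("control", "none", 7),
    ("control", "none", 5), ("control", "treat", 10),                       # crossover
]

def group_mean(records, key, value):
    """Mean outcome, grouping by assignment or by treatment received."""
    return mean(y for a, r, y in records if (a if key == "assigned" else r) == value)

itt = group_mean(records, "assigned", "treat") - group_mean(records, "assigned", "control")
as_treated = group_mean(records, "received", "treat") - group_mean(records, "received", "none")
print(itt, as_treated)  # ITT compares groups exactly as randomized
```

The ITT contrast (3.75) keeps the randomized groups intact, while the as-treated contrast (5.25) regroups people by behavior and so reintroduces selection.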