Chapter 1 - Introduction to data Flashcards

Question

What is a confounding variable?

Answer 1

A variable that is correlated with both the explanatory and response variables. While one method to justify making causal conclusions from observational studies is to exhaust the search for confounding variables, there is no guarantee that all confounding variables can be examined or measured. Example: suppose an observational study finds that the more sunscreen someone uses, the more likely that person is to have skin cancer. Previous research tells us that using sunscreen actually reduces skin cancer risk, so maybe another variable explains this hypothetical association between sunscreen usage and skin cancer. One important piece of information that is absent is sun exposure. If someone is out in the sun all day, she is more likely to use sunscreen and more likely to get skin cancer. Sun exposure is unaccounted for in the simple investigation, which makes it a confounding variable.
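
The sunscreen example can be checked with a small calculation. The sketch below is only an illustration: every probability in it is made up, and the names `P_HIGH_SUN`, `P_SUNSCREEN`, `P_CANCER` and `cancer_rate` are hypothetical. In this toy model sunscreen lowers the cancer risk at each level of sun exposure, yet sunscreen users still show a higher overall cancer rate because they are mostly the people with high sun exposure.

```python
# All probabilities are invented for illustration; they are not real data.
P_HIGH_SUN = 0.5                          # share of people with high sun exposure
P_SUNSCREEN = {"high": 0.8, "low": 0.2}   # chance of using sunscreen, by exposure
P_CANCER = {                              # chance of skin cancer, by (exposure, uses sunscreen)
    ("high", True): 0.15, ("high", False): 0.20,
    ("low", True): 0.03, ("low", False): 0.05,
}


def cancer_rate(uses_sunscreen):
    """Overall cancer rate among sunscreen users (True) or non-users (False)."""
    p_sun = {"high": P_HIGH_SUN, "low": 1 - P_HIGH_SUN}
    cases = 0.0
    people = 0.0
    for sun in ("high", "low"):
        p_use = P_SUNSCREEN[sun] if uses_sunscreen else 1 - P_SUNSCREEN[sun]
        weight = p_sun[sun] * p_use       # share of the population in this cell
        cases += weight * P_CANCER[(sun, uses_sunscreen)]
        people += weight
    return cases / people


rate_users = cancer_rate(True)       # about 0.126
rate_non_users = cancer_rate(False)  # about 0.08
```

Ignoring sun exposure, users look worse off (about 12.6% versus 8%), even though within each exposure level the sunscreen row of `P_CANCER` is the lower one.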

Answer 2

A prospective study identifies individuals and collects information as events unfold. For instance, medical researchers may identify and follow a group of patients over many years to assess the possible influences of behavior on cancer risk.

Answer 3

Retrospective studies collect data after events have taken place, e.g. researchers may review past events in medical records.

Answer 4

simple, stratified, cluster, and multistage sampling

Answer 5

In general, a sample is referred to as "simple random" if each case in the population has an equal chance of being included in the final sample and knowing that a case is included in a sample does not provide useful information about which other cases are included.
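
A minimal sketch of drawing a simple random sample, assuming the whole population fits in a Python list. The function name `simple_random_sample`, the `seed` parameter and the example population are illustrative only. The standard library's `random.sample` draws without replacement, so every case has the same chance of being picked.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct cases, each with an equal chance of inclusion."""
    rng = random.Random(seed)   # a seed only makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical population of 1000 case IDs.
population = list(range(1000))
sample = simple_random_sample(population, n=50, seed=1)
```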

Answer 6

Stratified sampling is a divide-and-conquer sampling strategy. The population is divided into groups called strata. The strata are chosen so that similar cases are grouped together, then a second sampling method, usually simple random sampling, is employed within each stratum. In the baseball salary example, the teams could represent the strata, since some teams have a lot more money (up to 4 times as much!). Then we might randomly sample 4 players from each team for a total of 120 players. Stratified sampling is especially useful when the cases in each stratum are very similar with respect to the outcome of interest. The downside is that analyzing data from a stratified sample is a more complex task than analyzing data from a simple random sample. The analysis methods introduced in this book would need to be extended to analyze data collected using stratified sampling.
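
The baseball example can be sketched as follows, assuming 30 teams (which is what 4 players per team and 120 players in total imply). The roster size of 25, the team and player names, and the function `stratified_sample` are hypothetical stand-ins, not real data.

```python
import random


def stratified_sample(strata, per_stratum, seed=None):
    """Take a simple random sample of per_stratum cases inside every stratum."""
    rng = random.Random(seed)
    sample = []
    for name in sorted(strata):               # every stratum is visited
        sample.extend(rng.sample(strata[name], per_stratum))
    return sample


# Hypothetical league: 30 teams (the strata) with 25 players each.
teams = {
    f"team_{t:02d}": [f"team_{t:02d}_player_{p:02d}" for p in range(25)]
    for t in range(30)
}
players = stratified_sample(teams, per_stratum=4, seed=1)  # 30 * 4 = 120 players
```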

Answer 7

In a cluster sample, we break up the population into many groups, called clusters. Then we sample a fixed number of clusters and include all observations from each of those clusters in the sample. Sometimes cluster or multistage sampling can be more economical than the alternative sampling techniques. Also, unlike stratified sampling, these approaches are most helpful when there is a lot of case-to-case variability within a cluster but the clusters themselves don't look very different from one another.
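
A rough sketch of cluster sampling, assuming the clusters are stored as a dictionary from cluster name to its observations. The villages, households and the function name `cluster_sample` are invented for illustration.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters clusters at random and keep every observation in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [obs for name in chosen for obs in clusters[name]]


# Hypothetical population: 20 villages (clusters) with 10 households each.
villages = {
    f"village_{v:02d}": [f"village_{v:02d}_household_{h}" for h in range(10)]
    for v in range(20)
}
households = cluster_sample(villages, n_clusters=3, seed=1)  # 3 * 10 = 30 households
```

Unlike the stratified approach, most clusters contribute nothing to the sample, and the selected ones contribute everything.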

Answer 8

A multistage sample is like a cluster sample: we break up the population into many groups, called clusters, and sample a fixed number of them. But rather than keeping all observations in each selected cluster, we collect a random sample within each one. Sometimes cluster or multistage sampling can be more economical than the alternative sampling techniques. Also, unlike stratified sampling, these approaches are most helpful when there is a lot of case-to-case variability within a cluster but the clusters themselves don't look very different from one another.
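
A rough sketch of a two-stage version of this idea, assuming the clusters are stored as a dictionary from cluster name to its observations. The villages, households and the function name `multistage_sample` are invented for illustration.

```python
import random


def multistage_sample(clusters, n_clusters, per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: sample within each pick."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], per_cluster))
    return sample


# Hypothetical population: 20 villages (clusters) with 10 households each.
villages = {
    f"village_{v:02d}": [f"village_{v:02d}_household_{h}" for h in range(10)]
    for v in range(20)
}
households = multistage_sample(villages, n_clusters=3, per_cluster=4, seed=1)  # 3 * 4 = 12
```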

Answer 9

Controlling, randomization, replication and blocking.

Answer 10

Researchers assign treatments to cases, and they do their best to control any other differences in the groups. For example, when patients take a drug in pill form, some patients take the pill with only a sip of water while others may have it with an entire glass of water. To control for the effect of water consumption, a doctor may ask all patients to drink a 12-ounce glass of water with the pill.

Answer 11

Researchers randomize patients into treatment groups to account for variables that cannot be controlled. For example, some patients may be more susceptible to a disease than others due to their dietary habits. Randomizing patients into the treatment or control group helps even out such differences, and it also prevents accidental bias from entering the study.
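
One simple way to randomize is to shuffle the patient list and split it in half. The sketch below assumes two equally sized groups; the function name `randomize` and the patient IDs are hypothetical.

```python
import random


def randomize(patients, seed=None):
    """Shuffle the patients and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(patients)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical study with 20 patient IDs.
patients = [f"patient_{i:02d}" for i in range(20)]
groups = randomize(patients, seed=1)
```

Because chance alone decides the split, neither the researchers nor patient characteristics such as dietary habits can steer who ends up in which group.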

Answer 12

The more cases researchers observe, the more accurately they can estimate the effect of the explanatory variable on the response. In a single study, we replicate by collecting a sufficiently large sample. Additionally, a group of scientists may replicate an entire study to verify an earlier finding.

Answer 13

Researchers sometimes know or suspect that variables, other than the treatment, influence the response. Under these circumstances, they may first group individuals based on this variable into blocks and then randomize cases within each block to the treatment groups. This strategy is often referred to as blocking. For instance, if we are looking at the effect of a drug on heart attacks, we might first split patients in the study into low-risk and high-risk blocks, then randomly assign half the patients from each block to the control group and the other half to the treatment group, as shown in Figure 1.16. This strategy ensures each treatment group has an equal number of low-risk and high-risk patients.
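
The heart attack example can be sketched as below, assuming each patient carries a known risk label and each block has an even number of patients. The function name `blocked_assignment`, the patient IDs and the block sizes (8 high-risk, 12 low-risk) are made up for illustration.

```python
import random


def blocked_assignment(patients, block_of, seed=None):
    """Group patients into blocks, then randomize half of each block to each group."""
    rng = random.Random(seed)
    blocks = {}
    for patient in patients:
        blocks.setdefault(block_of(patient), []).append(patient)

    groups = {"treatment": [], "control": []}
    for key in sorted(blocks):
        members = blocks[key][:]      # copy before shuffling
        rng.shuffle(members)
        half = len(members) // 2
        groups["treatment"].extend(members[:half])
        groups["control"].extend(members[half:])
    return groups


# Hypothetical patients as (id, risk) pairs: 8 high-risk and 12 low-risk.
patients = [(i, "high" if i < 8 else "low") for i in range(20)]
groups = blocked_assignment(patients, block_of=lambda p: p[1], seed=1)
# Each group receives 4 high-risk and 6 low-risk patients.
```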

Answer 14

When researchers keep the patients uninformed about whether they are in the treatment group or the control group.

Answer 15

The patients are not the only ones who should be blinded: doctors and researchers can accidentally bias a study. When a doctor knows a patient has been given the real treatment, she might inadvertently give that patient more attention or care than a patient who she knows is on the placebo. To guard against this bias, which again has been found to have a measurable effect in some instances, most modern studies employ a double-blind setup where doctors or researchers who interact with patients are, just like the patients, unaware of who is or is not receiving the treatment.
