Chapter 1 - Introduction to data Flashcards
(39 cards)
Randomized Experiment
An experiment in which individuals are randomly assigned to a group.
Anecdotal evidence
Very little data. If there is only a single data point and the person dies after taking the medicine, that may be an extreme case, while the medicine may well be a good one all the same.
Be careful of data collected in a haphazard fashion. Such evidence may be true and verifiable,
but it may only represent extraordinary cases.
What is a summary statistic?
A summary statistic is a single number
summarizing a large amount of data.
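As a minimal sketch of this idea, the mean below condenses several values into one number. The values are made up for illustration only.

```python
from statistics import mean

# Hypothetical observations, invented for this example
values = [2, 4, 4, 6, 9]

# The mean is one summary statistic: a single number describing all five values
summary = mean(values)
print(summary)
```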
What are variables?
Columns represent characteristics, called variables.
What is a data matrix?
A table. A convenient and common way to organize data, especially if collecting data in a spreadsheet. Each row of a data matrix corresponds to a unique case
(observational unit), and each column corresponds to a variable.
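One way to picture a data matrix in code is a list of rows, where each row is a case and each key is a variable. The names and values below are hypothetical.

```python
# Each dict is one row (a case); each key is one column (a variable).
# All values are invented for illustration.
data_matrix = [
    {"name": "Anna", "age": 34, "smoker": "no"},
    {"name": "Jon", "age": 51, "smoker": "yes"},
    {"name": "Maria", "age": 28, "smoker": "no"},
]

n_cases = len(data_matrix)                  # number of rows
variables = list(data_matrix[0].keys())     # column names
ages = [row["age"] for row in data_matrix]  # one column, read across all cases

print(n_cases, variables, ages)
```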
What is a case or observational unit?
A row in a table.
What is a numerical variable?
A variable is numerical if it can take a wide
range of numerical values, and it is sensible to add, subtract, or take averages with those values.
What is a discrete variable?
A discrete variable is a numerical variable that can only take separated values with jumps between them, typically whole-number counts.
What is a continuous variable?
A continuous variable is numerical and can take any value within a range.
What is a categorical variable?
A variable whose values are categories, usually recorded as text. The possible values of a categorical variable are called its levels.
What are ordinal and nominal variables?
An ordinal variable is a categorical variable but the levels have a natural ordering, while a regular categorical variable without this type of special ordering is called a nominal variable.
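A small sketch of the difference, using hypothetical levels: because an ordinal variable has a natural ordering, its values can be sorted by level in a meaningful way.

```python
# Hypothetical ordinal variable: the levels have a natural order
levels = ["low", "medium", "high"]
responses = ["high", "low", "medium", "low"]

# Sort by position in the level ordering, not alphabetically
ordered = sorted(responses, key=levels.index)
print(ordered)

# A nominal variable (e.g. eye colour) has levels too, but no such ordering,
# so sorting it would only be alphabetical and carry no meaning.
```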
What is a scatterplot?
Scatterplots are one type of graph used to study the relationship between two numerical variables.
What is it called when two variables show some connection with one another?
When two variables show some connection with one another, they are called
associated variables. Associated variables can also be called dependent variables and vice-versa.
What is it called when two variables don’t have a connection with one another?
They are called independent.
What are explanatory and response variables?
When we suspect one variable might causally affect another, we label the first variable the
explanatory variable and the second the response variable.
For many pairs of variables, there is no hypothesized relationship, and these labels would not
be applied to either variable in such cases.
What is an observational study?
Researchers perform an observational study when they collect data in a way that does not
directly interfere with how the data arise. For instance, researchers may collect information via
surveys, review medical or company records, or follow a cohort of many similar individuals to form
hypotheses about why certain diseases might develop. In each of these situations, researchers merely
observe the data that arise. In general, observational studies can provide evidence of a naturally
occurring association between variables, but they cannot by themselves show a causal connection.
Observational studies are generally only sufficient to show associations or form
hypotheses that we later check using experiments.
What is an experiment?
When researchers want to investigate the possibility of a causal connection, they conduct an
experiment. Usually there will be both an explanatory and a response variable. For instance, we
may suspect administering a drug will reduce mortality in heart attack patients over the following
year. To check if there really is a causal connection between the explanatory variable and the
response, researchers will collect a sample of individuals and split them into groups. The individuals
in each group are assigned a treatment.
What is a randomized experiment?
An experiment in which individuals are randomly assigned to the groups.
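Random assignment can be sketched as shuffling the individuals and splitting the list. The function name `randomly_assign`, the patient labels, and the two-group split are assumptions made for this example.

```python
import random

def randomly_assign(individuals, seed=None):
    """Shuffle the individuals and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)       # seeded only so the example is repeatable
    shuffled = list(individuals)    # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participants
patients = [f"patient_{i}" for i in range(1, 11)]
treatment, control = randomly_assign(patients, seed=1)
print(treatment)
print(control)
```

Chance alone decides the groups here, not the researcher.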
What is a placebo?
A fake treatment, such as a sugar pill, that has no active ingredient.
What is a sample?
Oftentimes, it is too expensive
to collect data for every case in a population. Instead, a sample is taken. A sample represents
a subset of the cases and is often a small fraction of the population. For instance, 60 swordfish
(or some other number) in the population might be selected, and this sample data may be used to
provide an estimate of the population average and answer the research question.
What is the problem with selecting samples by hand?
When selecting samples by
hand, we run the risk of picking a biased sample, even if that bias is unintended.
What is a simple random sample?
The most basic random sample is called a simple random sample, and it is
equivalent to using a raffle to select cases. This means that each case in the population has an equal
chance of being included and there is no implied connection between the cases in the sample.
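A minimal sketch of a simple random sample, assuming a hypothetical population of 1,000 numbered cases and a sample size of 60:

```python
import random

# Hypothetical population: case IDs 1..1000
population = list(range(1, 1001))

rng = random.Random(42)  # seeded only so the example is repeatable

# Draw 60 cases without replacement; every case has the same chance of selection
sample = rng.sample(population, k=60)

print(sorted(sample)[:10])
```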
What is the problem if the non-response rate is high?
If only 30% of the people randomly sampled for
a survey actually respond, then it is unclear whether the results are representative of the entire
population. This non-response bias can skew results.
What is a convenience sample?
A sample in which individuals who are easily accessible
are more likely to be included.