non highlighted Flashcards

(53 cards)

1
Q

categorical variable

A

a categorical variable places an individual into one of several groups or categories

2
Q

quantitative variable

A

a quantitative variable has numerical values and it makes sense to find the average value

3
Q

association

A

there is an association between two variables if knowing the value of one variable helps predict the value of the other

4
Q

mean

A

average value of the observations

5
Q

median

A

midpoint of the values, also called Q2

6
Q

first and third quartiles

A

Q1 has about one-fourth of the observations below it, and Q3 has about three-fourths of the observations below it

7
Q

interquartile range

A

IQR is the range of the middle 50% of the observations: IQR = Q3 - Q1
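A minimal sketch of the quartile and IQR calculation, assuming the common "median of each half" convention (software packages often use slightly different quartile rules). The data values and the helper name `quartiles` are hypothetical.

```python
import statistics

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # observations below the median position
    upper = ordered[half + n % 2:]    # observations above the median position
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15]   # hypothetical observations
q1, q2, q3 = quartiles(data)
iqr = q3 - q1                               # spread of the middle 50%
print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}")
```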

8
Q

standard deviation

A

measures the typical distance of the values in a distribution from the mean

9
Q

variance

A

average squared deviation of the values from the mean; the standard deviation is its square root
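A short sketch of the calculation with hypothetical data. It shows both divisors as an assumption about which convention is wanted: dividing by n gives the population variance, while dividing by n - 1 gives the sample variance that most calculators report as s².

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]           # hypothetical observations
n = len(data)
mean = sum(data) / n

# Squared deviation of each value from the mean.
squared_devs = [(x - mean) ** 2 for x in data]

population_variance = sum(squared_devs) / n        # divide by n
sample_variance = sum(squared_devs) / (n - 1)      # divide by n - 1

# The standard deviation is the square root of the variance.
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)
print(population_variance, population_sd, sample_variance, sample_sd)
```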

10
Q

shape

A

typical shapes of a distribution are roughly symmetric, skewed left and skewed right

11
Q

center

A

mean for roughly symmetric distributions, median for skewed distributions
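To illustrate why the median is preferred for skewed data, here is a small sketch with hypothetical incomes (in thousands) that include one extreme value. The mean is pulled toward the extreme value; the median is not.

```python
import statistics

# Hypothetical right-skewed data: six similar values and one very large one.
incomes = [30, 32, 35, 36, 40, 41, 300]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"mean = {mean_income:.2f}, median = {median_income}")
```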

12
Q

spread

A

standard deviation for roughly symmetric distributions, IQR for skewed distributions. Range = max - min as a last resort

13
Q

transforming data by adding/subtracting a

A

measures of center (median and mean) and location (quartiles and percentiles) change by a; measures of spread don't change

14
Q

transforming data by multiplying/dividing by b

A

measures of center, location, and spread are all multiplied (or divided) by b; for negative b, spread is scaled by |b|
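Both transformation rules can be checked numerically. The sketch below uses hypothetical data and hypothetical constants `a` and `b`; `summary` is an illustrative helper, not part of any library.

```python
import statistics

data = [10, 12, 15, 19, 24]     # hypothetical observations
a, b = 5, 3                     # hypothetical shift and scale constants

def summary(values):
    """Return (mean, median, sample standard deviation)."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.stdev(values))

original = summary(data)
shifted = summary([x + a for x in data])   # adding a: center moves, spread stays
scaled = summary([x * b for x in data])    # multiplying by b: center and spread scale

print("original:", original)
print("shifted: ", shifted)
print("scaled:  ", scaled)
```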

15
Q

Density curve mean and median

A

the mean is the balance point of the curve. The median divides the area under the curve in half

16
Q

uniform distribution

A

a distribution that takes constant height over some interval of values
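Because the density has constant height, probabilities are rectangle areas. A minimal sketch, assuming a hypothetical interval from 2 to 10; the function name `uniform_prob` is illustrative.

```python
low, high = 2.0, 10.0            # hypothetical interval of possible values
height = 1 / (high - low)        # constant height chosen so the total area is 1

def uniform_prob(c, d):
    """Area under the uniform density between c and d, clipped to [low, high]."""
    left, right = max(c, low), min(d, high)
    return max(0.0, right - left) * height

print(uniform_prob(4, 6))        # area of a rectangle 2 wide and 1/8 tall
print(uniform_prob(low, 6))      # half the area lies below the midpoint
```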

17
Q

68-95-99.7 rule

A

approximate percent of observations (68%, 95%, and 99.7%) that lie within one, two, and three standard deviations of the mean in a normal distribution
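The three percentages can be recovered from the standard normal curve. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve within k standard deviations of the mean.
shares = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} SD: {share:.1%}")
```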

18
Q

normal probability plot

A

if the normal probability plot is roughly linear, then the data are approximately normal
if the normal probability plot is not roughly linear, then the data are not approximately normal

19
Q

scatterplot

A

displays the relationship between two quantitative variables measured on the same individuals

20
Q

explanatory variable, factor, response variable

A

if we think that a variable x may help explain, predict, or even cause changes in another variable y, we call x an explanatory variable and y a response variable

21
Q

correlation r

A

measures the direction and strength of the linear relationship between two quantitative variables

r has no units, is between -1 and +1, and is not the value of the slope
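A minimal sketch of computing r directly from its definition, with hypothetical study-time data. Rescaling hours to minutes leaves r unchanged, which illustrates that r has no units.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]              # hypothetical explanatory values
scores = [52, 55, 61, 64, 70]        # hypothetical response values

r = correlation(hours, scores)
r_in_minutes = correlation([h * 60 for h in hours], scores)  # change of units

print(f"r = {r:.3f}, r with minutes = {r_in_minutes:.3f}")
```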

22
Q

correlation and causation

A

correlation does not imply causation, no matter how strong the correlation; there may be confounding variables

23
Q

least squares regression line

A

the straight line y hat = a + bx that minimizes the sum of the squares of the vertical distances of the observed points from the line
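A sketch of the least-squares calculation using the standard formulas b = Sxy / Sxx and a = (mean of y) - b(mean of x). The data and function names are hypothetical; the final check nudges the coefficients to show that the sum of squared vertical distances only gets larger.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x          # the line passes through (mean_x, mean_y)
    return a, b

def sum_squared_errors(a, b, xs, ys):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]                 # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]                 # hypothetical response values

a, b = least_squares_line(xs, ys)
best = sum_squared_errors(a, b, xs, ys)
print(f"y hat = {a:.2f} + {b:.2f}x, SSE = {best:.2f}")
```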

24
Q

slope b

A

b is the predicted change in y when x increases by 1 unit; interpret it in context

25
y intercept a
the predicted response (y hat) when the explanatory variable x equals 0, in context
26
extrapolation
avoid extrapolation: the use of a regression line for prediction using values of the explanatory variable outside the range of the data
27
residual
y - y hat: the difference between the observed and predicted values of y
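A small sketch of residuals, assuming a hypothetical fitted line y hat = 2.2 + 0.6x and hypothetical observed points. A positive residual means the point lies above the line; a negative residual means it lies below.

```python
# Hypothetical least-squares line: y hat = a + b*x
a, b = 2.2, 0.6

# Hypothetical observed (x, y) pairs.
observed = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]

# residual = observed y minus predicted y
residuals = [y - (a + b * x) for x, y in observed]

for (x, y), res in zip(observed, residuals):
    print(f"x={x}, y={y}, y hat={a + b * x:.1f}, residual={res:+.1f}")
```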
28
influential points
outliers that substantially change the correlation or the regression line's slope or y intercept
29
census
a census collects data from every individual in the population
30
convenience sample
choose individuals who are easiest to reach
31
voluntary response sample
individuals choose to join the sample in response to an open invitation. Key terms: phone-in survey, TV survey
32
simple random sample
an SRS uses a chance process to give every possible sample of a given size the same chance to be chosen. Choose an SRS by labeling the members of the population and using slips of paper, technology, or a random digits table to select the sample
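The "technology" route can be sketched with Python's `random` module. The population labels are hypothetical, and the seed is fixed only to make the example reproducible.

```python
import random

# Label every member of a hypothetical population of 30 students.
population = [f"student_{i:02d}" for i in range(1, 31)]

rng = random.Random(42)            # fixed seed only for reproducibility
srs = rng.sample(population, k=5)  # sampling without replacement; every
                                   # group of 5 is equally likely
print(srs)
```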
33
stratified random sample
divide the population into strata, groups of individuals that are similar in some way that might affect their responses. Then choose a separate SRS from each stratum and combine these SRSs to form the sample
34
cluster sample
divide the population into clusters, groups of individuals that are located near each other. Randomly select some of these clusters. All the individuals in the chosen clusters are included in the sample
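Stratified and cluster sampling differ in where chance is applied: within every stratum versus across whole clusters. A sketch with a hypothetical school, where grade levels act as strata and homerooms act as clusters:

```python
import random

rng = random.Random(0)  # fixed seed only so the sketch is reproducible

# Hypothetical strata (grade levels) and clusters (homerooms).
strata = {
    "grade_9": [f"g9_student_{i}" for i in range(1, 21)],
    "grade_10": [f"g10_student_{i}" for i in range(1, 21)],
    "grade_11": [f"g11_student_{i}" for i in range(1, 21)],
}
clusters = {
    f"homeroom_{c}": [f"{c}_student_{i}" for i in range(1, 6)] for c in "ABCDEF"
}

# Stratified: take a separate SRS from every stratum, then combine them.
stratified_sample = []
for members in strata.values():
    stratified_sample.extend(rng.sample(members, k=3))

# Cluster: randomly choose whole clusters and keep everyone in them.
chosen_clusters = rng.sample(sorted(clusters), k=2)
cluster_sample = [person for name in chosen_clusters for person in clusters[name]]

print(stratified_sample)
print(chosen_clusters, cluster_sample)
```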
35
undercoverage
when some members of the population cannot be chosen to be in the sample
36
response bias
occurs when there is a systematic pattern of inaccurate answers
37
nonresponse bias
when people can't be contacted or refuse to answer
38
wording bias
when confusing or leading questions introduce strong bias
39
observational study
gathers data on individuals as they are
40
experiment
deliberately imposes treatments on experimental units
41
experimental units
each of the individuals to which treatments are applied. Human experimental units are called subjects
42
confounding
a variable is confounded with the explanatory variable when its effect on the response variable can't be distinguished from that of the explanatory variable
43
completely randomized design
all experimental units are assigned to the treatments completely by chance
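One way to carry out such an assignment is to shuffle the units and deal them into equal groups. The units, treatment names, and seed below are hypothetical.

```python
import random

units = [f"plant_{i}" for i in range(1, 13)]           # 12 hypothetical units
treatments = ["fertilizer_A", "fertilizer_B", "control"]

rng = random.Random(7)      # fixed seed only for reproducibility
shuffled = units[:]         # copy so the original list is left untouched
rng.shuffle(shuffled)       # chance alone decides the order

# Deal the shuffled units into equal-sized treatment groups.
group_size = len(shuffled) // len(treatments)
assignment = {
    treatment: shuffled[i * group_size:(i + 1) * group_size]
    for i, treatment in enumerate(treatments)
}

for treatment, group in assignment.items():
    print(treatment, group)
```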
44
placebo
a fake treatment given to the control group. It helps prevent confounding due to the placebo effect, in which some patients get better because they expect the treatment to work.
45
double blind experiment
neither the subjects nor those interacting with them and measuring their responses know who is receiving which treatment. If one party knows and the other doesn't, then the experiment is single blind
46
randomized block design
use blocks of experimental units that are similar with respect to a variable that is expected to affect the response. Treatments are assigned at random within each block. Responses are then compared within each block and combined with the responses of other blocks after accounting for the differences between the blocks
47
matched pairs design
in some matched pairs designs, each subject receives both treatments in a random order. In others, two very similar subjects are paired, and the two treatments are randomly assigned within each pair
48
mutually exclusive and independence
if two events with nonzero probability are mutually exclusive, they cannot also be independent, because knowing that one occurred means the other did not
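A quick check with one roll of a fair six-sided die. The events A, B, C, and D are hypothetical examples; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

outcomes = set(range(1, 7))      # one roll of a fair six-sided die

def prob(event):
    """Probability of an event, given equally likely outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

# Mutually exclusive events: they share no outcomes.
A, B = {1, 2}, {5, 6}
p_both = prob(A & B)             # P(A and B)
p_product = prob(A) * prob(B)    # what independence would require

# Independent events for comparison: "even" and "at most 2".
C, D = {2, 4, 6}, {1, 2}

print(p_both, p_product)                 # these differ, so A and B are not independent
print(prob(C & D), prob(C) * prob(D))    # these match, so C and D are independent
```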
49
probability distribution
the probability distribution of a random variable gives its possible values and their probabilities. A discrete random variable takes a fixed set of possible values with gaps between them
50
continuous random variable
a continuous random variable X takes all values in an interval of numbers. The probability distribution of X is described by a density curve. The probability of any event is the area under the density curve and above the values of X that make up the event
51
population parameter / sample statistic
a parameter is a number that describes a population. To estimate an unknown parameter, use a statistic calculated from a sample
52
sampling distribution
the sampling distribution of a statistic describes the values of the statistic in all possible samples of the same size from the same population
53
unbiased estimator
a statistic is an unbiased estimator if the center (mean) of its sampling distribution is equal to the true value of the parameter
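For a tiny hypothetical population, the whole sampling distribution can be listed by enumerating every possible sample of size 2. The sketch compares the sample mean (unbiased for the population mean) with the sample maximum (which tends to underestimate the population maximum).

```python
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8, 10]                 # hypothetical population
samples = list(combinations(population, 2))   # every possible sample of size 2

# Sampling distribution of the sample mean.
sample_means = [mean(s) for s in samples]
# Sampling distribution of the sample maximum.
sample_maxima = [max(s) for s in samples]

print("population mean:", mean(population))
print("center of sample means:", mean(sample_means))
print("population max:", max(population))
print("center of sample maxima:", mean(sample_maxima))
```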