Exam 1 Flashcards

(63 cards)

1
Q

Statistics

A

collecting, organizing, and analyzing data; drawing inferences about a population from a sample

2
Q

uncertainty

A

how sure (or unsure) we are of our answer

3
Q

margin of error

A

used to quantify uncertainty: the amount added to and subtracted from an estimate to give a range of plausible values

4
Q

variables

A

characteristics recorded for each observation, often represented by letters or symbols

5
Q

quantitative variables

A

all entries are numerical values (height, weight, etc.)

6
Q

categorical variables

A

entries are names or categories (eye color, car type, gender, etc.)

7
Q

predictor

A

the variable(s) used to predict or explain a change in another variable (the "cause", X: x1, x2)

8
Q

response

A

the variable being predicted or explained (the "effect", y)

9
Q

observation

A

the person or thing the variables are measured on

10
Q

observational study

A

values are observed, researcher doesn’t manipulate anything

11
Q

Experimental study

A

researchers assign members to experimental conditions

12
Q

population

A

the entire group we are interested in

13
Q

samples

A

subgroup of the population that data are actually collected from

14
Q

census

A

data collected from every member of the population

15
Q

parameter

A

a numeric characteristic of a population; the true value is usually unknown and has to be estimated; denoted with a Greek letter

16
Q

statistic

A

any number calculated from sample data, often used to estimate a parameter; denoted with English (Latin) letters

17
Q

statistical inference

A

generalize from a sample to a population - use statistic to estimate value of a parameter

18
Q

relative frequency

A

for categorical data: observations of interest / total observations; always a value between 0 and 1
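
The ratio above can be sketched in a few lines of Python. The eye-color list and the helper name `relative_frequencies` are invented for illustration and are not part of the deck.

```python
from collections import Counter


def relative_frequencies(values):
    """Return {category: count / total} for a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


# Hypothetical sample of eye colors (made up for this example).
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "brown"]
freqs = relative_frequencies(eye_colors)
print(freqs)
```

Each relative frequency lies between 0 and 1, and together they sum to 1.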

19
Q

location

A

where the data are located on the number line

20
Q

mean

A

average (sum of the values divided by the number of values); identifies the center of the data

21
Q

median

A

middle value of the sorted data (average of the two middle values if there is an even number of points)
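
A minimal sketch of the mean and median using Python's standard `statistics` module. The two small data sets are made up to show the odd and even cases.

```python
import statistics

# Hypothetical data sets, one with an odd and one with an even number of points.
odd_data = [3, 1, 7, 5, 9]
even_data = [3, 1, 7, 5]

mean_odd = statistics.mean(odd_data)       # (3 + 1 + 7 + 5 + 9) / 5
median_odd = statistics.median(odd_data)   # sorted: 1, 3, 5, 7, 9 -> middle value
median_even = statistics.median(even_data) # sorted: 1, 3, 5, 7 -> average of 3 and 5

print(mean_odd, median_odd, median_even)
```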

22
Q

Quartiles

A

values that break the sorted data set into four sections with an equal number of observations

23
Q

Q1

A

lower quartile; 25% of the data fall at or below it

24
Q

Q3

A

upper quartile; 75% of the data fall at or below it

25
extremes
minimum and maximum of data set
26
5 number summary
minimum, Q1, median, Q3, maximum (listed in order)
27
range
maximum - minimum
28
interquartile range (IQR)
Q3 - Q1
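
The five number summary, range, and IQR can be sketched as follows. This assumes the "inclusive" quartile convention of Python's `statistics.quantiles`; textbooks and calculators use several slightly different quartile rules, so Q1 and Q3 may not match a hand calculation exactly. The data and the helper name `five_number_summary` are made up for illustration.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # Assumption: "inclusive" interpolation; other quartile conventions exist.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical data set of nine values.
data = [18, 2, 14, 6, 10, 4, 16, 8, 12]
minimum, q1, median, q3, maximum = five_number_summary(data)

data_range = maximum - minimum  # range = maximum - minimum
iqr = q3 - q1                   # interquartile range = Q3 - Q1

print(minimum, q1, median, q3, maximum, data_range, iqr)
```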
29
variance
s^2, the average squared deviation from the mean: sum of squares divided by n - 1 for a sample
30
Sum of squares
sum of the squared distances between each data point and the mean
31
deviation
difference between one observation and the mean
32
standard deviation
s, square root of the variance, standard amount observations vary from the mean
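
The chain from deviations to sum of squares to variance to standard deviation can be sketched step by step. The data are made up, and the sketch assumes the sample formulas (dividing by n - 1), which is what Python's `statistics.variance` and `statistics.stdev` also use.

```python
import math
import statistics

# Hypothetical sample of eight observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)                      # center of the data
deviations = [x - mean for x in data]             # each observation minus the mean
sum_of_squares = sum(d ** 2 for d in deviations)  # squared deviations, added up
variance = sum_of_squares / (len(data) - 1)       # sample variance, s^2
std_dev = math.sqrt(variance)                     # sample standard deviation, s

print(sum_of_squares, variance, std_dev)
```

The hand-computed values agree with `statistics.variance(data)` and `statistics.stdev(data)`.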
33
categorical data visualization
bar chart and pie chart
34
distribution
the values a variable takes on and how frequently it takes each of them
35
histogram
shows the distribution of a quantitative variable - each bar represents the number of observations that fall into an interval
36
outliers
data points located far away from the majority of the data
37
box plot
visual representation of the five number summary; helps show the shape of the distribution
38
right skewed / positive skewed distribution
lots of small values with a long tail of large values, mean typically greater than median
39
left skewed / negatively skewed distribution
lots of large values with a long tail of small values, mean typically less than median
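
A small sketch of how skew pulls the mean away from the median. Both data sets are invented so that one has a single large value in the right tail and the other a single small value in the left tail.

```python
import statistics

# Hypothetical data: one extreme value on the right, then one on the left.
right_skewed = [1, 1, 2, 2, 3, 3, 4, 20]
left_skewed = [1, 17, 18, 18, 19, 19, 20, 20]

right_mean = statistics.mean(right_skewed)
right_median = statistics.median(right_skewed)
left_mean = statistics.mean(left_skewed)
left_median = statistics.median(left_skewed)

# The mean is dragged toward the tail; the median barely moves.
print(right_mean, right_median)
print(left_mean, left_median)
```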
40
symmetrical / no skew distribution
left and right sides are roughly mirror images (e.g., a bell curve), mean and median approximately equal
41
Uniform distribution
every bar has roughly the same height, mean and median approximately equal
42
bias
occurs when a study is set up in a way that makes the results systematically wrong, rather than off just by random chance
43
sampling bias
when the sample is taken in a way that we expect to make it differ systematically from the population of interest
44
self-selection bias
occurs when individuals choose if they want to be in the sample or not
45
non-response bias
occurs when certain types of respondents are less likely to answer a survey and the reason is related to the variable being studied
46
simple random sample
to mitigate bias, samples should be collected randomly (every unit has equal opportunity to be selected)
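
One way to draw a simple random sample is Python's `random.sample`, which picks units without replacement so that every unit is equally likely to be chosen. The population of ID numbers and the fixed seed are assumptions made only so the sketch is repeatable.

```python
import random

# Hypothetical population: unit IDs 1 through 100.
population = list(range(1, 101))

# A fixed seed is used here only to make the example reproducible.
rng = random.Random(42)

# Draw 10 distinct units; each unit has the same chance of selection.
sample = rng.sample(population, k=10)
print(sample)
```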
47
confounding variable
influences both predictor and response variables, but is not accounted for in the study
48
establishing causation
association does not imply causation, causation can only be inferred from a randomized experiment
49
random events
events whose outcomes cannot be predicted with certainty
50
probability
a way of quantifying the chance that some random event occurs (a proportion between 0 and 1)
51
plausibility
reflects a state of knowledge / information / uncertainty
52
relative frequency
if the random process were repeated over and over again, how often this outcome would occur
53
probability notation
p(x) is the probability that event x occurs, p(x) and p(not x) must sum to 1
54
conditional probability
probability of certain events occurring given that some other event occurs or has occurred - condition is after vertical bar, only focus on things related to condition
55
relationship
use conditional probabilities to check if two variables are related - compare p(a|1) vs p(a|2); if there is a large difference, there is a relationship - written as response variable | predictor variable
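
Conditional probabilities from a two-way table of counts can be sketched like this. The study/exam counts and the helper name `conditional_probability` are hypothetical; the predictor is whether a student studied and the response is whether they passed.

```python
# Hypothetical counts, keyed by (predictor value, response value).
counts = {
    ("studied", "passed"): 40,
    ("studied", "failed"): 10,
    ("did not study", "passed"): 20,
    ("did not study", "failed"): 30,
}


def conditional_probability(table, response, given):
    """P(response | given): keep only rows matching the condition, then take the share with the response."""
    condition_total = sum(n for (predictor, _), n in table.items() if predictor == given)
    return table[(given, response)] / condition_total


p_pass_given_studied = conditional_probability(counts, "passed", "studied")
p_pass_given_not = conditional_probability(counts, "passed", "did not study")

# A large gap between the two suggests the variables are related.
print(p_pass_given_studied, p_pass_given_not)
```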
56
simpson's paradox
occurs when trend in data reverses when the data is broken down into subgroups
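
A sketch of the reversal with made-up counts: treatment A has the higher success rate inside each subgroup, yet the lower rate once the subgroups are pooled, because A was mostly used in the subgroup where success is rare. All names and numbers are hypothetical.

```python
# Hypothetical (successes, trials) for two treatments in two subgroups.
results = {
    "A": {"group 1": (9, 10), "group 2": (30, 100)},
    "B": {"group 1": (80, 100), "group 2": (2, 10)},
}


def success_rate(successes, trials):
    return successes / trials


def overall_rate(groups):
    """Pool the subgroups: total successes divided by total trials."""
    total_successes = sum(s for s, _ in groups.values())
    total_trials = sum(t for _, t in groups.values())
    return total_successes / total_trials


by_group = {
    treatment: {group: success_rate(*cell) for group, cell in groups.items()}
    for treatment, groups in results.items()
}
overall = {treatment: overall_rate(groups) for treatment, groups in results.items()}

print(by_group)  # A is ahead within each group
print(overall)   # B is ahead after pooling
```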
57
percent change
(new - old) / old × 100
58
change in percentage points
new - old (both values are percentages)
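
The difference between percent change and change in percentage points is easy to see with one hypothetical example: a rate that rises from 4% to 5%. The function names are illustrative.

```python
def percent_change(old, new):
    """Relative change: (new - old) / old x 100."""
    return (new - old) / old * 100


def percentage_point_change(old_pct, new_pct):
    """Absolute change between two percentages: new - old."""
    return new_pct - old_pct


# Hypothetical rate moving from 4% to 5%.
points = percentage_point_change(4, 5)  # up 1 percentage point
relative = percent_change(4, 5)         # up 25 percent

print(points, relative)
```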
59
z score
how many standard deviations a value is from the mean: z = (value - mean) / standard deviation - positive: above the mean - negative: below the mean - closer to 0: closer to the mean (more typical for bell-shaped data)
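
A minimal sketch of the z score calculation. The exam mean of 70 and standard deviation of 10 are assumed values for illustration.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev


# Hypothetical exam with mean 70 and standard deviation 10.
z_high = z_score(85, mean=70, std_dev=10)  # above the mean, so positive
z_low = z_score(60, mean=70, std_dev=10)   # below the mean, so negative

print(z_high, z_low)
```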
60
empirical rule
68/95/99.7: approximate percentage of data points that lie within 1, 2, 3 standard deviations of the mean (for bell-shaped distributions)
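
The 68/95/99.7 figures can be checked against an exact normal distribution with `statistics.NormalDist` from the standard library. This sketch assumes perfectly normal data; real data sets only follow the rule approximately.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(k, round(share * 100, 1))
```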
61
percentiles
the value of a variable for which the given percentage of values fall below it
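
Read in the other direction, the definition gives the percentage of values that fall below a chosen value. The sketch below counts strictly smaller values; some textbooks count "at or below" instead. The scores and the helper name `percent_below` are made up.

```python
def percent_below(data, value):
    """Percentage of observations strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)


# Hypothetical exam scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# Six of the ten scores are below 85, so 85 sits at the 60th percentile here.
rank_of_85 = percent_below(scores, 85)
print(rank_of_85)
```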
62
equivalent
two values (possibly from different distributions) have the same z score
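
Equivalent values can be found by matching z scores across two scales. In this sketch both exams, their means, and their standard deviations are hypothetical, as is the helper name `equivalent_value`.

```python
def z_score(value, mean, std_dev):
    return (value - mean) / std_dev


def equivalent_value(value, mean_a, sd_a, mean_b, sd_b):
    """Value on scale B that has the same z score as value does on scale A."""
    z = z_score(value, mean_a, sd_a)
    return mean_b + z * sd_b


# Hypothetical exams: A has mean 70 and SD 10, B has mean 500 and SD 100.
score_b = equivalent_value(85, mean_a=70, sd_a=10, mean_b=500, sd_b=100)

# Both scores sit 1.5 standard deviations above their own mean.
print(score_b, z_score(85, 70, 10), z_score(score_b, 500, 100))
```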
63