Exam 1 Flashcards

(124 cards)

1
Q

x̄

A

sample mean

2
Q

μ

A

population mean

3
Q

S

A

sample standard deviation

4
Q

σ

A

population standard deviation

5
Q

S²

A

sample variance

6
Q

σ²

A

population variance

7
Q

n

A

sample size

8
Q

N

A

population size

9
Q

statistics

A

science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions
- process, guessing

10
Q

population

A

the entire group to be studied

11
Q

parameter

A

numerical summary of a population

12
Q

sample

A

a subset of the group to be studied

13
Q

statistic

A

numerical summary of a sample

- easier and cheaper

14
Q

descriptive stats

A

organizing and summarizing data

15
Q

inferential stats

A

methods that take a result from a sample, extend it to the population, and measure the reliability of the result
- prediction

16
Q

variable

A

characteristics of individuals within a population

  • age
  • ethnicity
17
Q

data

A

the list of observed values for a variable (age), a fact/proposition used to draw a conclusion

  • numeric/nonnumeric
  • describe characteristics of individual
18
Q

lurking variables

A

an outside variable, not considered in the study, that affects the variables being studied

19
Q

qualitative variables

A

categories, allows for classification based on an attribute or characteristics

  • can’t be added/subtracted
  • zip codes/favorite color
20
Q

quantitative variables

A

numerical measure, data can be added/subtracted with meaningful results
- can add/subtract

21
Q

discrete variable

A

quantitative variable that has a finite number of possible outcomes or a countable number of outcomes

  • countable
  • quantitative
22
Q

continuous variable

A

quantitative variable that has an infinite number of possible values that are not countable

  • not countable
  • not exact
  • weights
23
Q

nominal level of measurement

A

the values of the variable name, label, or categorize; qualitative

24
Q

ordinal level

A

variable has properties of nominal level, but allows for values of variable to be arranged in a ranked or specific order, qualitative

25
interval level
variable has properties of ordinal and the differences in values of variable have meaning (a value of zero does not mean absence of the quantity), quantitative - can use + or -
26
ratio level
variable has properties of interval and ratios of values of variable have meaning, quantitative
27
open ended
- 1st class has no LCL | - last class has no UCL
28
frequency distribution goal
reveal interesting features of data | - want 5-20 classes
29
frequency distribution
table that lists each category of data and the number of occurrences for each category of data (frequency)
30
category
first row
31
frequency
second row
32
relative frequency distribution
table that lists each category of data and its associated relative frequency
33
relative frequency
proportion (or %) of observations within a category, standardizes
34
rel freq formula
freq / sum of all freq
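A minimal Python sketch of the rel freq formula. The color data is made up for illustration, not taken from the course.

```python
from collections import Counter

# Hypothetical data: favorite colors reported by 10 students.
colors = ["red", "blue", "blue", "green", "red",
          "blue", "red", "blue", "green", "blue"]

freq = Counter(colors)       # frequency of each category
total = sum(freq.values())   # sum of all frequencies

# relative frequency = freq / sum of all freq
rel_freq = {category: count / total for category, count in freq.items()}

print(rel_freq)  # {'red': 0.3, 'blue': 0.5, 'green': 0.2}
```

The relative frequencies always add up to 1 (100%).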
35
Σ
summation (add up the values that follow)
36
bar graph
constructed by labelling each category/quality on one axis and the frequency/relative freq on the other. Draw rectangles of equal width for each category, where height is determined by the frequency/rel freq | - bars don't touch
37
freq/rel freq distributions
similar to what we did for qualitative data
38
in bar graphs, we use ___ instead of categories
classes
39
for discrete data
the class can be a single value or an interval of values
40
for continuous data
the class is an interval of numbers
41
LCL
lowest # in a class
42
UCL
highest # in class
43
class width
how wide the classes are | - subtract 2 consecutive lower class limits
44
histogram
constructed by drawing rectangles for each class; the height of each rectangle is the freq or rel freq of the class, and the width of each rectangle is equal | - bars touch | - for continuous data | - bars only go up
45
labelling the classes on the histogram
- label the left edge with LCL | - label the middle with the midpoints
46
distribution shapes
- uniform - symmetric - skewed left - skewed right
47
uniform
same throughout
48
symmetric
bell-shaped
49
skewed left
longer tail on left
50
skewed right
longer tail on right | - positively skewed
51
arithmetic mean
generally mean, average
52
population mean
mean calculated from population data
53
sample mean
mean calculated from sample data | - 1-var stats
54
median
value that lies in the middle of data set when in ASCENDING order
55
M
median
56
median calculation
- odd # = middle # - even # = mean of 2 middle numbers - 1-var stats
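A short Python sketch of the odd/even rule for the median. The function name `median_of` is just an illustrative choice.

```python
def median_of(values):
    """Return the median: middle value (odd n) or mean of the 2 middle values (even n)."""
    ordered = sorted(values)   # data must be in ascending order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd #: the middle number
    return (ordered[mid - 1] + ordered[mid]) / 2   # even #: mean of 2 middle numbers

print(median_of([5, 1, 3]))     # 3
print(median_of([4, 1, 3, 2]))  # 2.5
```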
57
mode
most frequent observation that occurs in a data set | - can have none, one, or multiple
58
resistant measures
if extreme observations DON'T substantially affect it
59
mean ___ resistant
is not
60
median ____ resistant
is
61
range
the difference between largest and smallest value in data
62
R
range
63
STDEV
measure the dispersion of data relative to its mean
64
population STDEV
STDEV calculated from population data | - 1-var stats
65
sample STDEV
STDEV calculated from sample data
66
S
sample STDEV
67
degrees of freedom
the sample STDEV divides by n - 1 instead of n; subtracting 1 makes the estimate a little more exact
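To see the effect of dividing by n - 1, here is a small sketch using Python's standard `statistics` module. The data values are made up for illustration.

```python
import statistics

# Hypothetical data set of 8 observations (mean = 5).
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = statistics.pstdev(data)  # divides by N (treats data as a whole population)
sample_sd = statistics.stdev(data)       # divides by n - 1 (treats data as a sample)

print(population_sd)  # 2.0
print(sample_sd)      # about 2.14, slightly larger because of the n - 1 divisor
```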
68
empirical rule
for a BELL SHAPED curve | - used to approx the amount of data within K STDEV of mean
69
1 STDEV
68%
70
2 STDEV
95%
71
3 STDEV
99.7%
72
Variance
how far a data set is spread out
73
population variance
how data points in a specific population are spread out
74
sample variance
calculate how varied a sample is
75
Chebyshev's inequality
for ANY distribution, used to show that at least (1 - 1/K²) x 100% of the observations lie within K STDEV of the mean, where K is greater than 1 (K > 1) - to show 1/2
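A quick sketch of the inequality as a Python function. The name `chebyshev_min_percent` is hypothetical, chosen for this example only.

```python
def chebyshev_min_percent(k):
    """Minimum percent of observations within k STDEV of the mean, for any distribution."""
    if k <= 1:
        raise ValueError("K must be greater than 1")
    return (1 - 1 / k**2) * 100

print(chebyshev_min_percent(2))  # 75.0
print(chebyshev_min_percent(3))  # about 88.9
```

Compare with the empirical rule (95% and 99.7%): the guarantee here is weaker because it holds for any shape, not just bell-shaped data.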
76
K
placeholder for the number of STDEVs from the mean
77
midpoint
a single value that provides the best representation for the observations within a class
78
For any class in a freq. dis. that is an interval, we first need to calculate the ___
midpoint
79
midpoint calculation
add 2 consecutive LCLs and divide by 2 | - for the last class, add the class width to the previous midpoint
80
weighted mean
a mean calculated for data values that have different weights associated with them (GPA) | - 1-var stats (L1, L2)
81
x bar sub w
weighted mean
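A small GPA-style sketch of the weighted mean in Python. The grade points and credit hours are invented for illustration.

```python
def weighted_mean(values, weights):
    """Sum of (weight * value) divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical semester: grade points (values) and credit hours (weights).
grade_points = [4.0, 3.0, 2.0]
credits = [3, 4, 3]

gpa = weighted_mean(grade_points, credits)
print(gpa)  # 3.0
```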
82
Z-score
how many STDEVs a value is away from the mean | - unitless | - standardizes data, which allows us to compare
83
z
z-score
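A one-function sketch of the z-score in Python. The exam numbers are hypothetical.

```python
def z_score(value, mean, stdev):
    """Number of STDEVs that value lies above (+) or below (-) the mean."""
    return (value - mean) / stdev

# Hypothetical exam: score of 85 when the mean is 75 and the STDEV is 5.
z = z_score(85, 75, 5)
print(z)  # 2.0
```

A positive z means the value is above the mean; a negative z means it is below.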
84
percentile
the Kth percentile of a data set is a value such that K percent of observations are less than/equal to the value
85
P sub k
percentile
86
quartiles
most common percentiles
87
Q1
25th percentile
88
Q2
50th percentile
89
Q3
75th percentile
90
interquartile range
range of the middle 50% of observations in data set | - IQR = Q3 - Q1 | - 1-var stats
91
IQR
interquartile range
92
outliers
extreme observations
93
Fences
cutoff points for determining outliers
94
Lower fence formula
Q1 - 1.5(IQR)
95
upper fence formula
Q3 + 1.5(IQR)
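A sketch that puts Q1, Q3, IQR, and the fences together in Python. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data (other software may use a different quartile rule), and the data is made up.

```python
from statistics import median

def quartiles(values):
    """Q1, Q2, Q3 using medians of the lower and upper halves (an assumed method)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, leave the middle value out of both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [1, 3, 4, 5, 6, 7, 8, 9, 30]   # hypothetical observations
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)               # 3.5 8.5 5.0
print(lower_fence, upper_fence)  # -4.0 16.0
print(outliers)                  # [30]
```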
96
5 number summary
no commas | Min Q1 Q2 Q3 max
97
box plot
graphical representation of data using the 5 summary steps | - y = -> box plot -> L1 ->1 -> zoom -> zoom stat
98
5 summary steps
1. draw a number line for min to max 2. draw vertical line at Q1,Q2,Q3; enclose the lines 3. draw line from min to Q1 and Max to Q3
99
scatter diagrams
graph that shows the relationship between 2 quantitative variables measured on the same individual - L1 is x; L2 is y -> y= -> scatter -> zoom -> 9
100
scatter diagram x axis
independent/explanatory/predictor
101
scatter diagram y axis
dependent/response/outcome
102
correlation
a relationship or association between 2+ variables | - linear
103
positively associated correlation
1 increases, other increases
104
negatively assoc. correlation
1 increases, other decreases
105
linear correlation coefficient (Pearson)
measure of the strength and direction of the linear relationship between 2 quantitative variables
106
r
linear correlation coefficient
107
r calculation
2nd -> 0 (catalog) -> diagnostic on -> LinReg -> 4 -> (L1, L2) -> enter -> round to 3 or 4 DP
108
properties of r
- -1 ≤ r ≤ 1 | - r = +/-1 is a perfect linear association | - the closer r is to +/-1, the stronger the linear association | - if r is close to 0, there is little or no evidence of linear association (weak) | - r is unitless | - not resistant
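For checking calculator output by hand, here is a minimal Python sketch of Pearson's r. The function name `pearson_r` and the data are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient for paired data (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Perfect positive and perfect negative linear association (made-up data).
r_pos = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])   # r = +1, up to rounding
r_neg = pearson_r([1, 2, 3], [3, 2, 1])                # r = -1, up to rounding
```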
109
critical values
cutoff scores
110
__ approaches once have r and CV
- graphic | - formulaic
111
graphic
r is neg linear correlation
112
formulaic
- find the absolute value of r - if |r| > CV, a linear correlation exists - if |r| < CV, no linear correlation exists - if a linear correlation exists, note whether it is positive or negative
113
n
number of coordinate pairs
114
least squares regression line
line that minimizes the sum of the squared errors (residuals)
115
equation of the LSRL
y hat = (b sub 1)x + b sub 0 | - y hat = predicted value of y | - b sub 1 = slope | - b sub 0 = y-intercept
116
residuals
error between the observed value of y and the predicted value of y | - residual = observed y - predicted y = y - y hat | - positive = observed y is above y hat | - negative = observed y is below y hat
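A Python sketch of fitting the least squares line and computing residuals. It uses the standard formulas slope = Sxy / Sxx and intercept = (mean of y) - slope x (mean of x). The data points are made up.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4]   # hypothetical explanatory values
ys = [2, 4, 5, 7]   # hypothetical response values

b1, b0 = least_squares_line(xs, ys)        # slope about 1.6, intercept about 0.5
predicted = [b1 * x + b0 for x in xs]      # y hat for each x
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]  # observed y - predicted y
```

For a least squares line the residuals sum to zero (up to rounding), which is a handy check.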
117
can use LSRL to predict values for _ at specific values for _
y, x
118
If _ increases by 1 unit, _ rate increases/decreases by _ on average
x; y; y hat
119
coefficient of determination
measure the proportion of total variation in the response variable that’s explained by the regression line - square r - LinReg
120
R²
coefficient of determination
121
residual plot
scatter plot with the explanatory variable on the x-axis and the residual values on the y-axis - Calculator: 2nd → stat → residuals - Make sure you can run LinReg to get the proper residuals
122
Residual plot Q1
is the linear model appropriate? | - not if the residuals show a discernible pattern
123
RP Q2
is the variance of the residuals consistent?
124
RP Q3
are there any outliers