Exam 1 Flashcards

1
Q

the science of collecting, describing, and analyzing data

A

statistics

2
Q

subjects/objects we obtain information about in a data set

A

cases/units

3
Q

any characteristic recorded for each case (columns in the data table)

A

variable

4
Q

divides the cases into groups, placing each case into exactly one of two or more categories

A

categorical variable

5
Q

measures or records a numerical quantity for each case

A

quantitative variable

6
Q

helps explain or predict values of other variables

A

explanatory variable

7
Q

the variable whose values we want to explain or predict using another variable

A

response variable

8
Q

what is a lurking or confounding variable?

A

a third variable, associated with both the explanatory and response variables, that is not accounted for
ex: age of children not considered in the reading level/cavity data

9
Q

includes all individuals or objects of interest

A

population

10
Q

subset of the population

A

sample

11
Q

n =

A

sample size

12
Q

process of using data from a sample to gain information about the population

A

statistical inference

13
Q

the method of selecting a sample causes the sample to differ from the population in some relevant way

A

sampling bias

14
Q

each unit of a population has an equal chance of being selected, regardless of the other units chosen for the sample

A

simple random sample
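
A minimal sketch of drawing a simple random sample in Python. The population of student ID numbers and the function name are made up for illustration; `random.sample` draws without replacement, so every unit has the same chance of being picked.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement; each unit is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(list(population), n)

# Hypothetical population: 100 student ID numbers.
student_ids = range(1, 101)
sample = simple_random_sample(student_ids, n=10, seed=1)
print(sample)
```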

15
Q

difference between sampling bias and bias?

A

sampling bias comes from how the sample is selected
bias comes from how the data are collected (ex: question wording), causing the sample data to inaccurately reflect the population

16
Q

values of one variable tend to be related to the values of another variable

A

association

17
Q

how do association and causation relate?

A

association does NOT imply a cause and effect relationship

18
Q

changing the value of one variable influences the value of the other variable

A

causation/causally associated

19
Q

_____ implies a particular direction, and the relationship holds as an overall trend

A

causation

20
Q

a study in which the researcher actively controls one or more of the explanatory variables

A

experiment

21
Q

a study in which the researcher does not actively control the value of any variable but simply observes the values as they naturally exist

A

observational study

22
Q

what does the word “improve” imply in a study?

A

causality, which cannot be established from an observational study

23
Q

a causal relationship can only be determined in what type of study?

A

experiment

24
Q

the value of the explanatory variable for each unit is determined randomly, before the response variable is measured

A

randomized experiment

25
randomly assign cases to different treatment groups and then compare results on the response variables
randomized comparative experiment
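
A rough sketch of random assignment for a randomized comparative experiment, assuming two equal-sized groups. The case IDs, group names, and function name are hypothetical; shuffling the cases and splitting the list lets chance alone decide who gets which treatment.

```python
import random

def randomize_to_groups(cases, seed=None):
    """Shuffle the cases, then split them into two treatment groups."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical study with 20 cases labeled 1..20.
treatment, control = randomize_to_groups(range(1, 21), seed=42)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```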
26
each case gets both treatments in random order, and we examine individual differences in the response variable between the 2 treatments
matched pairs experiment
27
a summary statistic that helps describe a categorical variable
proportion
28
proportion in a category =
number in that category / total number
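
A small sketch of the proportion formula above. The list of survey answers is invented for illustration; the function counts the cases in one category and divides by the total number of cases.

```python
def proportion(values, category):
    """Number of cases in the category divided by the total number of cases."""
    return sum(1 for v in values if v == category) / len(values)

# Hypothetical sample of 5 answers to an agree/disagree question.
answers = ["agree", "disagree", "agree", "agree", "disagree"]
p_hat = proportion(answers, "agree")
print(p_hat)  # 3 of 5 cases agree
```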
29
proportion for a sample is denoted:
p-hat
30
p-hat =
proportion for a sample
31
proportion for a population is denoted:
p
32
p =
proportion for a population
33
used to show the relationship between 2 categorical variables
2 way table
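
One possible way to build a two-way table in Python, using only the standard library. The (gender, answer) cases below are made up; each cell of the table counts the cases that fall in one combination of the two categories.

```python
from collections import Counter

# Hypothetical cases: (gender, answer) recorded for each person.
cases = [
    ("male", "agree"), ("male", "disagree"), ("female", "agree"),
    ("female", "disagree"), ("female", "disagree"), ("male", "agree"),
]

table = Counter(cases)  # maps each (row, column) pair to its count
rows = sorted({gender for gender, _ in cases})
cols = sorted({answer for _, answer in cases})

print("        ", cols)
for r in rows:
    print(f"{r:8}", [table[(r, c)] for c in cols])
```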
34
an observed value that is notably distinct from the other values in a data set
outlier
35
a numerical average of the data values
mean
36
mean of a sample is denoted:
x-bar
37
x-bar =
mean of a sample
38
mean of a population is denoted:
mu
39
mu =
mean of a population
40
the middle entry of an ordered list if the list contains an odd number of entries (or the average of the middle two entries if the number is even)
median
41
median is denoted:
m
42
m =
median
43
a statistic that is relatively unaffected by extreme values
resistant statistic
44
is median resistant to outliers?
yes
45
is mean resistant to outliers?
no
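
A quick illustration of the two resistance cards above, using invented data. Adding one extreme value moves the mean a lot while the median barely changes.

```python
from statistics import mean, median

data = [2, 3, 4, 5, 6]
with_outlier = data + [100]  # same data plus one extreme value

print("mean:  ", mean(data), "->", mean(with_outlier))
print("median:", median(data), "->", median(with_outlier))
```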
46
measures the spread of the data in a sample
standard deviation
47
the larger the standard deviation, the ____ variability there is in the data and the _____ spread out the data are
more; more
48
standard deviation of a sample is denoted:
s
49
s =
standard deviation of a sample
50
standard deviation of a population is denoted:
σ
51
σ =
standard deviation of a population
52
what is the 95% rule?
if a distribution of data is approximately symmetric and bell-shaped, about 95% of the data should fall within 2 standard deviations of the mean
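
A simulation sketch of the 95% rule. The bell-shaped data here are generated, not real (mean 100 and standard deviation 15 are arbitrary choices); the code checks what share of the values falls within 2 standard deviations of the mean.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)
# Simulated bell-shaped data; the 100 and 15 are arbitrary.
data = [rng.gauss(100, 15) for _ in range(10_000)]

x_bar, s = mean(data), stdev(data)
within = sum(1 for x in data if abs(x - x_bar) <= 2 * s) / len(data)
print(round(within, 3))  # should be close to 0.95
```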
53
tells how many standard deviations the value is from the mean and is independent of the unit of measurement
z-score
54
z-score =
(x - x-bar) / s
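
A short sketch of the z-score formula for a sample, with made-up exam scores. `statistics.stdev` is the sample standard deviation s.

```python
from statistics import mean, stdev

def z_score(x, data):
    """How many sample standard deviations x lies from the sample mean."""
    return (x - mean(data)) / stdev(data)

# Hypothetical scores: x-bar = 80 and s = 10.
scores = [70, 80, 90]
z = z_score(90, scores)
print(z)  # 90 is one standard deviation above the mean
```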
55
the value of a quantitative variable which is greater than P percent of the data
Pth percentile
56
what is the 5 number summary?
q0 = minimum; q1 = first quartile (25%); q2 = median; q3 = third quartile (75%); q4 = maximum
57
range =
maximum - minimum
58
interquartile range =
q3-q1
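
A sketch of the five number summary, range, and interquartile range for a small made-up data set. Quartile conventions differ between textbooks and software; this example assumes the "inclusive" method of `statistics.quantiles`, which may not match a given course's rule on every data set.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, q1, median, q3, maximum)."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
lo, q1, med, q3, hi = five_number_summary(data)

data_range = hi - lo  # maximum - minimum
iqr = q3 - q1         # q3 - q1
print((lo, q1, med, q3, hi), data_range, iqr)
```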
59
is range resistant to outliers?
NO
60
is interquartile range resistant to outliers?
YES
61
is standard deviation resistant to outliers?
NO
62
the start of a box in a box plot is at
q1
63
the end of a box in a box plot is at
q3
64
the line that divides the box in a box plot is
the median
65
the lines (whiskers) on a box plot extend
to the most extreme data value that is not an outlier
66
if the data is skewed left, median _____ mean
median greater than the mean
67
if the data is symmetric, median _____ mean
equal
68
if the data is skewed right, median _____ mean
median smaller than the mean
69
a graph of the relationship between 2 quantitative variables
scatterplot
70
for a scatterplot, the _____ variable is on the x axis and the _____ variable is on the y axis
explanatory; response
71
a measure of the strength and direction of linear association between 2 quantitative variables
correlation
72
correlation of a sample denoted:
r
73
correlation of a population denoted:
ρ "rho"
74
correlations closer to 1 or -1 are _____
stronger
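
A sketch of computing the sample correlation r by hand, using invented data. The function name is illustrative; it divides the sum of cross-deviations by the square root of the product of the two sums of squared deviations.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Sample correlation r between two quantitative variables."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_pos = correlation(xs, [2, 4, 6, 8, 10])  # perfect positive linear association
r_neg = correlation(xs, [10, 8, 6, 4, 2])  # perfect negative linear association
print(r_pos, r_neg)
```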
75
for the linear regression line equation y-hat = b0 + b1 x, what is y-hat?
predicted value
76
for the linear regression line equation y-hat = b0 + b1 x, what is b0?
y-intercept
77
for the linear regression line equation y-hat = b0 + b1 x, what is b1?
slope
78
for the linear regression line equation y-hat = b0 + b1 x, add in where the response and explanatory variables would be
predicted response = b0 + b1(explanatory)
79
difference between the observed and predicted values of the response variable
residual
80
equation for residual:
observed - predicted = y - y-hat
81
what does a residual represent on a scatterplot?
the vertical deviation of a data point from the regression line
82
line that minimizes the sum of the squared residuals
least squares line
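
A sketch of fitting a least squares line and computing residuals, with made-up data. It uses the standard closed-form slope and intercept for a single explanatory variable; the function name is illustrative.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (b0, b1), the intercept and slope minimizing the sum of squared residuals."""
    x_bar, y_bar = mean(xs), mean(ys)
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    return b0, b1

xs = [1, 2, 3, 4]
ys = [2, 3, 5, 6]
b0, b1 = least_squares_line(xs, ys)

# residual = observed - predicted = y - y-hat
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print(b0, b1, residuals)
```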
83
do outliers influence regression line?
YES
84
data from the principality of andorra were used to determine that 98.9% of andorrans have access to the Internet, the highest rate of any country. what are the cases in the data from andorra? what variable is used? is it categorical or quantitative?
cases: people in Andorra; variable: whether the person has Internet access; type: categorical
85
an online poll conducted on biblegateway.com asked, “how often do you talk about the bible in your normal course of conversation?” over 5000 people answered the question, and 78% of respondents chose the most frequent option: multiple times a week. can we infer that 78% of people talk about the bible multiple times a week? why or why not?
no; people who answer a poll on a Bible website are not representative of people in general, so the sample has sampling bias
86
state whether the sentence implies no association between the variables, association without implying causation, or association with causation: studies show that taking a practice exam increases your score on an exam.
association with causation
87
state whether the sentence implies no association between the variables, association without implying causation, or association with causation: families with many cars tend to also own many television sets.
association without implying causation
88
state whether the sentence implies no association between the variables, association without implying causation, or association with causation: sales are the same even with different levels of spending on advertising.
no association
89
state whether the sentence implies no association between the variables, association without implying causation, or association with causation: taking a low-dose aspirin a day reduces the risk of heart attacks.
association with causation
90
state whether the sentence implies no association between the variables, association without implying causation, or association with causation: goldfish who live in large ponds are usually larger than goldfish who live in small ponds.
association without implying causation
91
state whether the sentence implies no association between the variables, association without implying causation, or association with causation: putting a goldfish into a larger pond will cause it to grow larger.
association with causation
92
a nationwide US telephone survey conducted by the pew foundation asked 2625 adults ages 18 and older, “some people say there is only one true love for each person. do you agree or disagree?” In addition to finding out the proportion who agree with the statement, the pew foundation also wanted to find out if the proportion who agree is different between males and females, and whether the proportion who agree is different based on level of education (no college, some college, or college degree). the survey participants were selected randomly, by landlines and cell phones. what are the cases in the survey about one true love? what are the variables? are the variables categorical or quantitative? how many rows and how many columns would the data table have?
cases: the 2625 adults surveyed; variables: agree or disagree (categorical), gender (categorical), level of education (categorical); 2625 rows, 3 columns
93
give the notation for the mean: for a random sample of 50 seniors from a large high school, the average SAT score was 582 on the math portion of the test.
x-bar = 582
94
give the notation for the mean: about 1.67 million students in the class of 2014 took the SAT, and the average score overall on the math portion was 513.
mu = 513
95
the five number summary for the mammal longevity data in table 2.21 on page 73 is (1, 8, 12, 16, 40). find the range and interquartile range for this dataset.
range: 40 - 1 = 39; IQR: 16 - 8 = 8
96
use the regression line tip = -0.292 + 0.182(bill) to predict the tip for a bill of $59.33
$10.51
97
use the regression line tip = -0.292 + 0.182(bill) to predict the tip for a bill of $9.52
$1.44
98
use the regression line tip = -0.292 + 0.182(bill) to predict the tip for a bill of $23.70
$4.02
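
The three tip predictions above can be checked with a short sketch. The function name is made up; the intercept and slope come straight from the regression line on the cards.

```python
def predict_tip(bill):
    """Predicted tip from the regression line tip = -0.292 + 0.182(bill)."""
    return -0.292 + 0.182 * bill

for bill in (59.33, 9.52, 23.70):
    print(f"bill ${bill:.2f} -> predicted tip ${predict_tip(bill):.2f}")
```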