Chapter 1 Flashcards

1
Q

Collections of observations, such as measurements, genders, or survey responses

A

data

2
Q

The science of planning studies and experiments, obtaining data, and organizing, summarizing, presenting, analyzing, and interpreting those data and then drawing conclusions based on them.

A

statistics

3
Q

The complete collection of all measurements or data that are being considered. Typically, a ___ is the complete collection of data that we would like to make inferences about.

A

population

4
Q

parameter…

A

population

5
Q

The collection of data from every member of a population

A

census

6
Q

A subcollection of members selected from a population

A

sample

7
Q

In the journal article “Residential Carbon Monoxide Detector Failure Rates in the United States”, it was stated that there are 38 million carbon monoxide detectors installed in the United States. When 30 of them were randomly selected and tested, it was found that 12 of them failed to provide an alarm in hazardous carbon monoxide conditions.

A

• Population: All 38 million carbon monoxide detectors in the United States
• Sample: The 30 carbon monoxide detectors that were selected and tested
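The failure rate in this sample can be worked out directly. The sketch below uses only the counts stated in the card; the variable names are illustrative. Because the result describes the 30 tested detectors rather than all 38 million, it is a statistic, not a parameter.

```python
# Counts taken from the card above; variable names are illustrative.
population_size = 38_000_000   # all detectors in the United States
sample_size = 30               # detectors randomly selected and tested
failures = 12                  # tested detectors that failed to alarm

# Proportion of the SAMPLE that failed: a statistic, not a parameter.
sample_failure_rate = failures / sample_size
print(f"Sample failure rate: {sample_failure_rate:.0%}")  # Sample failure rate: 40%
```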

8
Q

is one in which the respondents themselves decide whether to be included.

A

Voluntary Response Sample or Self-Selected Sample

9
Q

• Internet polls, in which people online can decide whether to respond
• Mail-in polls, in which people can decide whether to reply
• Telephone call-in polls, in which newspaper, radio, or television announcements ask that you voluntarily call a special number to register your opinion

A

voluntary response sample

10
Q

It is possible that some treatment or finding is effective, but common sense might suggest that the treatment or finding does not make enough of a difference to justify its use or to be practical.

A

practical significance

11
Q

is achieved in a study if the likelihood of an event occurring by chance is 5% or less.

A

statistical significance
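As a rough sketch of the 5% rule on this card, the check below compares a chance probability against the threshold. The function name and the example probabilities are hypothetical, and 0.05 is simply the cutoff the card states.

```python
def is_statistically_significant(chance_probability, threshold=0.05):
    """Return True when the results would occur by chance with
    probability at or below the threshold (5% by default)."""
    return chance_probability <= threshold

# Hypothetical chance probabilities for two studies.
print(is_statistically_significant(0.03))  # True
print(is_statistically_significant(0.21))  # False
```

Passing this check says nothing about practical significance, which still has to be judged separately.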

12
Q

When forming a conclusion based on a statistical analysis, we should make statements that are clear even to those who have no understanding of statistics and its terminology.

A

misleading conclusion

13
Q

If survey questions are not worded carefully, the results of a study can be misleading.

A

loaded question

14
Q

Sometimes survey questions are unintentionally loaded by the order of the items being considered.

A

order of questions

15
Q

occurs when someone either refuses to respond or is unavailable.

A

nonresponse

16
Q

A major use of statistics is to collect and use sample data to make conclusions about populations.

A

key concept

17
Q

a numerical measurement describing some characteristic of a population

A

parameter

18
Q

a numerical measurement describing some characteristic of a sample

A

statistic

19
Q

consists of numbers representing counts or measurements.

A

quantitative data

20
Q

consists of names or labels (not numbers that represent counts or measurements).

A

categorical (qualitative) data

21
Q

Quantitative data can be further described by distinguishing between ___ and ___ types.

A

discrete and continuous

22
Q

result when the data values are quantitative and the number of values is finite, or “countable.”

A

discrete

23
Q

result from infinitely many possible quantitative values, where the collection of values is not countable.

A

continuous

24
Q

Another way of classifying data is to use four levels of measurement: nominal, ordinal, interval, and ratio.

A

levels of measurement

25
characterized by data that consist of names, labels, or categories only, and the data cannot be arranged in some order (such as low to high). Example: Survey responses of yes, no, and undecided
nominal
26
involves data that can be arranged in some order, but differences (obtained by subtraction) between data values either cannot be determined or are meaningless. Example: Course grades A, B, C, D, or F
ordinal
27
involves data that can be arranged in order, and the differences between data values can be found and are meaningful. However, there is no natural zero starting point at which none of the quantity is present. Example: Years 1000, 2000, 1776, and 1492
interval
28
data can be arranged in order, differences can be found and are meaningful, and there is a natural zero starting point (where zero indicates that none of the quantity is present). Differences and ratios are both meaningful. Example: Class times of 50 minutes and 100 minutes
ratio
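The practical difference between the interval and ratio levels is whether ratios mean anything. The sketch below is only an illustration: class times come from the card above, while the Fahrenheit temperatures and the conversion helper are assumptions added for the example.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert Fahrenheit (no natural zero) to kelvins (natural zero)."""
    return (deg_f - 32) * 5 / 9 + 273.15

# Ratio level: minutes have a natural zero, so "twice as long" is meaningful.
time_ratio = 100 / 50
print(time_ratio)  # 2.0

# Interval level: 80 °F divided by 40 °F gives 2, but that number is an
# artifact of where the Fahrenheit scale puts its zero.
naive_ratio = 80 / 40
kelvin_ratio = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)
print(naive_ratio)             # 2.0
print(round(kelvin_ratio, 2))  # 1.08

# Differences remain meaningful at the interval level.
print(80 - 40)  # 40 Fahrenheit degrees warmer
```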
29
* Nominal - categories only
* Ordinal - categories with some order
* Interval - differences but no natural zero point
* Ratio - differences and a natural zero point
summary :)
30
categories only
nominal
31
categories with some order
ordinal
32
differences but no natural zero point
interval
33
differences and a natural zero point
ratio
34
apply some treatment and then proceed to observe its effects on the individuals. (The individuals in experiments are called experimental units, and they are often called subjects when they are people.)
experiment
35
observing and measuring specific characteristics without attempting to modify the individuals being studied
observational
36
is the repetition of an experiment on more than one individual.
replication | **Good use of replication requires sample sizes that are large enough so that we can see effects of treatments.
37
a technique in which the subject doesn’t know whether he or she is receiving a treatment or a placebo.
blinding ***Blinding is a way to get around the placebo effect, which occurs when an untreated subject reports an improvement in symptoms.
38
Blinding occurs at two levels: 1. The subject doesn’t know whether he or she is receiving the treatment or a placebo. 2. The experimenter does not know whether he or she is administering the treatment or placebo.
double blind
39
is used when subjects are assigned to different groups through a process of random selection. The logic is to use chance as a way to create two groups that are similar.
randomization
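A minimal sketch of randomization, assuming two groups of equal size. The subject IDs, the function name, and the fixed seed are hypothetical and chosen only to make the example repeatable.

```python
import random

def randomize(subjects, seed=None):
    """Randomly split subjects into a treatment group and a placebo group."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)       # chance, not the experimenter, decides the order
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i}" for i in range(1, 11)]  # hypothetical IDs
treatment, placebo = randomize(subjects, seed=1)
print("treatment:", treatment)
print("placebo:  ", placebo)
```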
40
After completing our preparation by considering the context, source, and sampling method, we begin to
analyze
41
The final step in our statistical process involves conclusions, and we should develop an ability to distinguish between statistical significance and practical significance.
conclude
42
When collecting data from people, it is better to take measurements yourself instead of asking subjects to report results.
Sample Data Reported Instead of Measured
43
the weight of supermodels
quantitative data
44
the age of respondents
quantitative data
45
the gender of professional athletes
categorical data
46
names of movies
categorical
47
models of vehicles
categorical
48
number of tosses of a coin before getting tails
discrete data
49
the lengths of distances from 0 cm to 12 cm
continuous
50
all measurements are
continuous
51
survey responses of yes, no, undecided
nominal level
52
course grades A, B, C, D, or F
ordinal level
53
years: 1000, 2000, 1776, 1492
interval level
54
class time of 50 minutes or 100 minutes
ratio level
55
In a Harris Interactive survey of 2276 adults in the US, it was found that 33% of those surveyed never travel using commercial airlines. Identify the population and sample. Is the value of 33% a statistic or parameter?
sample = the 2276 adults surveyed; population = all adults in the US; 33% = statistic
56
cigarette brands
categorical data
57
colors of m&ms
categorical data
58
weights of m&ms
quantitative
59
number of people surveyed in each of the next several years for the National Health and Nutrition Examination Surveys
discrete
60
exact foot lengths (measured in cm) of a random sample of statistics students
continuous
61
exact times that randomly selected drivers spend texting while driving during the past 7 days
continuous
62
in a survey of 1020 adults in the US, 44% said that they wash their hands after riding public transportation
sample = the 1020 adults surveyed; population = all adults in the US; 44% = statistic
63
apply some treatment and then proceed to observe its effects on the individuals.
experiment
64
observing and measuring specific characteristics without attempting to modify the individuals being studied
observational
65
is the repetition of an experiment on more than one individual.
replication
66
Good use of replication requires sample sizes that are large enough so that we can see effects of treatments.
:)
67
is a technique in which the subject doesn’t know whether he or she is receiving a treatment or a placebo.
blinding
68
1. The subject doesn’t know whether he or she is receiving the treatment or a placebo. 2. The experimenter does not know whether he or she is administering the treatment or placebo.
double blind
69
is used when subjects are assigned to different groups through a process of random selection. The logic is to use chance as a way to create two groups that are similar.
randomization
70
A sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen.
simple random
71
Select some starting point and then select every kth element in the population.
systematic sampling
72
Use data that are very easy to get.
convenience sampling
73
Subdivide the population into at least two different subgroups (or strata) so that the subjects within the same subgroup share the same characteristics. Then draw a random sample from each subgroup (or stratum).
stratified sampling
74
Divide the population into sections (or clusters), then randomly select some of those clusters, and choose all the members from those selected clusters.
cluster sampling
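The four probability-based methods above (simple random, systematic, stratified, cluster) can be sketched side by side. This is an illustration under stated assumptions, not a reference implementation: the population of IDs 1 to 100, the two strata, the ten clusters, and all function names are hypothetical, and the random starting point in the systematic sample is one common choice rather than something the card requires.

```python
import random

def simple_random_sample(population, n, rng):
    """Every possible sample of size n has the same chance of being chosen."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a starting point among the first k, then take every kth element."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a random sample from each subgroup (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)
population = list(range(1, 101))   # hypothetical subject IDs 1..100
strata = {"first_half": population[:50], "second_half": population[50:]}
clusters = {f"class_{c}": population[c * 10:(c + 1) * 10] for c in range(10)}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 3, rng)

print("simple random:", sorted(srs))
print("systematic:   ", systematic)
print("stratified:   ", stratified)
print("cluster:      ", clustered)
```

Note the contrast in the last two: stratified sampling takes some members from every subgroup, while cluster sampling takes every member from some subgroups.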
75
Observe and measure, but do not modify.
observational studies
76
Data are observed, measured, and collected at one point in time, not over a period of time.
cross-sectional
77
Data are collected from a past time period by going back in time (through examination of records, interviews, and so on).
retrospective
78
Data are collected in the future from groups sharing common factors (called cohorts). Collection of data over a long period of time.
prospective
79
occurs in an experiment when the experimenter is not able to distinguish between the effects of different factors.
confounding
80
statistic or parameter | In a study of flights from JFK in New York to LAX, 48 flights are randomly selected and the average (mean) arrival time is 8.9 min late
statistic
81
statistic or parameter | A recent California Health Interview Survey (CHIS) included 2799 adolescent residents of California
statistic
82
statistic or parameter | a deadly disaster in the US was the Triangle Shirtwaist Factory Fire in NYC - 146 garment workers died
parameter
83
statistic or parameter | in a study of 400 babies born at 4 different hospitals in NY state, it was found that the average weight was 3152.0 grams
statistic
84
statistic or parameter | in the same study 51% of the babies were girls
statistic
85
statistic or parameter | a study was conducted of all 2223 passengers aboard the Titanic when it sank
parameter
86
statistic or parameter | the average atomic weight of all elements in the periodic table is 134.355 unified atomic mass units
parameter
87
discrete or continuous | in a study of weight gains by college students in their freshman year, researchers record the amounts of weight gained by randomly selected students
continuous
88
discrete or continuous | among the subjects surveyed as part of the California Health Interview Survey, several subjects are randomly selected and their heights are recorded
continuous
89
discrete or continuous | in a study of service times at McDonald's drive-up window, the numbers of cars serviced each hour of several days are recorded
discrete
90
discrete or continuous | the Clerk of the US House of Representatives records the number of representatives present at each session
discrete
91
discrete or continuous | a shift manager records the numbers of Corvettes manufactured during each day of production
discrete
92
discrete or continuous | studying relationships between lengths of feet and heights so that footprint evidence at a crime scene can be used to estimate the heights of suspects, a researcher records the exact lengths of feet from a large sample of random subjects
continuous
93
discrete or continuous | students in a statistics class record the exact lengths of times that they surreptitiously use their smartphones during class
continuous
94
discrete or continuous | the Insurance Institute for Highway Safety collects data consisting of the numbers of motor vehicle fatalities caused by driving while texting
discrete
95
US News periodically provides its rankings of national universities, and in a recent year the ranks for Princeton, Harvard, and Yale were 1, 2, and 3
ordinal
96
for the presidential election of 2016, ABC conducts an exit poll in which voters are asked to identify their political party (Democrat, Republican, and so on)
nominal
97
colors of m&ms (red, orange, yellow, brown, blue, green) listed in m&ms weights
nominal
98
in a study of fast food service times, a researcher records the time intervals of drive-up customers beginning when they place their order and ending when they receive their order
ratio
99
Bill James records the years in which the baseball World Series is won by a team from the National League
interval
100
the author rated the movie Star Wars with 5 stars on a scale of 5 stars
ordinal
101
blood lead levels of low, medium, and high used to describe the subjects in IQ and Lead
ordinal
102
body temperatures (in degrees Fahrenheit) listed in "Body Temps"
interval
103
The defensive players wore jerseys numbered 31, 11, 22, 52, 64, 79, 81
nominal
104
as part of a project in a statistics class, students report the last four digits of their Social Security numbers, and the average is 4.7
nominal
105
Cormorant bird population densities were studied by using the "line transect method" with aircraft observers flying along the shoreline of Lake Huron & collecting sample data at intervals of every 20 km
systematic
106
the sexuality of women was discussed in Shere Hite's book Women & Love: A Cultural Revolution. Her conclusions were based on sample data that consisted of 4500 mailed responses from 100,000 questionnaires that were sent to women
convenience
107
In a Kelton Research poll, 1114 Americans 18 years of age or older were called after their phone numbers were randomly generated by a computer & 36% of them said that they believe in UFOs
random
108
the author surveyed a sample from the population of his statistics class by identifying groups of males & females then randomly selecting 5 students from each group
stratified
109
a student of the author conducted a survey on driving habits by randomly selecting 3 different classes & surveying all of them as they left those classes
cluster
110
in a study of treatments for back pain, 641 subjects were randomly assigned to the 4 different treatment groups: individualized acupuncture, standardized acupuncture, simulated acupuncture & usual care
random
111
the author collected sample data by randomly selecting 5 books from each of the categories of science, fiction, & history. the number of pages in the books were then identified
stratified
112
Satellites are used to collect sample data for estimating deforestation rates.
systematic
113
In a clinical trial of the cholesterol drug Lipitor subjects were partitioned into groups given a placebo or Lipitor doses of 10, 20, 40 or 80 mg. the subjects were randomly assigned to the different treatment groups
random
114
During the last presidential election, CNN conducted an exit poll in which specific polling stations were randomly selected & all voters were surveyed as they left
cluster
115
In 1936, Literary Digest magazine mailed questionnaires to 10 million people & obtained 2,266,566 responses. The responses indicated that Alf Landon would win the presidential election but he didn't
convenience
116
The New York State Department of Transportation evaluated the quality of the New York Thruway by testing core samples collected at regular intervals of 1 mile
systematic
117
often helpful in organizing and summarizing data, helps us to understand the nature of the distribution of a data set.
frequency distribution
118
Shows how data are partitioned among several categories (or classes) by listing the categories along with the number (frequency) of data values in each of them.
frequency distribution
119
The smallest numbers that can belong to each of the different classes
lower class limits
120
The largest numbers that can belong to each of the different classes
upper class limits
121
The numbers used to separate the classes, but without the gaps created by class limits
class boundaries
122
The values in the middle of the classes. Each class midpoint can be found by adding the lower class limit to the upper class limit and dividing the sum by 2.
class midpoint
123
The difference between two consecutive lower class limits in a frequency distribution
class width
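The class definitions above can be checked on a small example. The class limits 50-59, 60-69, 70-79 are hypothetical, and the boundaries are computed with the usual convention of splitting the gap between one class's upper limit and the next class's lower limit.

```python
# Hypothetical class limits (lower, upper) for a frequency distribution.
class_limits = [(50, 59), (60, 69), (70, 79)]

lower_limits = [lo for lo, _ in class_limits]
upper_limits = [hi for _, hi in class_limits]

# Class width: difference between two consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Class midpoint: (lower class limit + upper class limit) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in class_limits]

# Class boundaries: split the gap between adjacent classes in half, and
# extend by the same half-gap below the first class and above the last.
half_gap = (lower_limits[1] - upper_limits[0]) / 2
boundaries = [lo - half_gap for lo in lower_limits] + [upper_limits[-1] + half_gap]

print("width:     ", class_width)   # 10
print("midpoints: ", midpoints)     # [54.5, 64.5, 74.5]
print("boundaries:", boundaries)    # [49.5, 59.5, 69.5, 79.5]
```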
124
Each class frequency is replaced by a relative frequency (or proportion) or a percentage.
relative frequency distribution
125
The frequency for each class is the sum of the frequencies for that class and all previous classes.
cumulative frequency distribution
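Relative and cumulative frequencies both derive from the plain class frequencies. A short sketch, assuming three classes with made-up frequencies:

```python
from itertools import accumulate

frequencies = [2, 5, 3]   # hypothetical class frequencies
total = sum(frequencies)

# Relative frequency: each class frequency divided by the total.
relative = [f / total for f in frequencies]

# Cumulative frequency: running sum of this class and all previous classes.
cumulative = list(accumulate(frequencies))

print(relative)    # [0.2, 0.5, 0.3]
print(cumulative)  # [2, 7, 10]
```

The relative frequencies sum to 1 (or 100%), and the last cumulative frequency equals the total number of data values.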
126
1. The frequencies start low, then increase to one or two high frequencies, and then decrease to a low frequency. 2. The distribution is approximately symmetric
normal distribution
127
Combining two or more relative frequency distributions in one table makes comparisons of data much easier.
comparisons
128
A graph consisting of bars of equal width drawn adjacent to each other (unless there are gaps in the data). The horizontal scale represents classes of quantitative data values, and the vertical scale represents frequencies. The heights of the bars correspond to frequency values.
histogram
129
important uses of histogram
• Visually displays the shape of the distribution of the data
• Shows the location of the center of the data
• Shows the spread of the data
• Identifies outliers
130
A distribution of data is skewed if it is not symmetric and extends more to one side than to the other.
skewness
131
The population distribution is not normal if the normal quantile plot has either or both of these two conditions:
– The points do not lie reasonably close to a straight-line pattern.
– The points show some systematic pattern that is not a straight-line pattern.
not a normal distribution
132
The pattern of the points in the normal quantile plot is reasonably close to a straight line, and the points do not show some systematic pattern that is not a straight-line pattern.
normal distribution
133
A graph of quantitative data in which each data value is plotted as a point (or dot) above a horizontal scale of values. Dots representing equal values are stacked.
dot plot
134
Displays the shape of distribution of data. | It is usually possible to recreate the original list of data values.
features of dot plot
135
Represents quantitative data by separating each value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit).
stemplots
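A text stemplot is easy to build by hand or in code. The sketch below assumes non-negative whole numbers with the tens digit as the stem and the units digit as the leaf; the data values and the function name are hypothetical. As a simplification, stems with no values are omitted.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows, splitting each value into a tens stem and a units leaf."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    return [f"{stem} | {''.join(str(leaf) for leaf in leaves)}"
            for stem, leaves in sorted(rows.items())]

data = [52, 57, 61, 61, 68, 70, 75]   # hypothetical two-digit values
for row in stemplot(data):
    print(row)
# 5 | 27
# 6 | 118
# 7 | 05
```

Because every leaf is kept, the original data values can be read back from the plot.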
136
a bar graph for categorical data, with the added stipulation that the bars are arranged in descending order according to frequencies, so the bars decrease in height from left to right.
Pareto chart
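The defining step of a Pareto chart is ordering the categories by frequency before drawing the bars. A minimal sketch with made-up complaint categories, printing text bars instead of a real chart:

```python
from collections import Counter

# Hypothetical categorical data.
responses = ["late", "damaged", "late", "wrong item", "late", "damaged"]

counts = Counter(responses)
pareto_order = counts.most_common()   # categories sorted by descending frequency

for category, frequency in pareto_order:
    print(f"{category:<12}{'#' * frequency}")
```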
137
A very common graph that depicts categorical data as slices of a circle, in which the size of each slice is proportional to the frequency count for the category
pie charts
138
A common deceptive graph involves using a vertical scale that starts at some value greater than zero to exaggerate differences between groups.
Nonzero Vertical Axis
139
graphs that deceive
pictographs
140
exists between two variables when the values of one variable are somehow associated with the values of the other variable.
correlation
141
exists between two variables when there is a correlation and the plotted points of paired data result in a pattern that can be approximated by a straight line.
linear correlation
142
The linear correlation coefficient is denoted by r, and it measures the strength of the linear association between two variables.
:)
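For reference, r can be computed from paired data with the standard Pearson formula. The sketch below is one way to write it; the function name and the five data pairs are hypothetical. It assumes neither variable is constant, since the denominator would then be zero.

```python
from math import sqrt

def linear_correlation(xs, ys):
    """Pearson linear correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need paired data with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]   # hypothetical paired data
ys = [2, 4, 5, 4, 5]
r = linear_correlation(xs, ys)
print(round(r, 3))  # 0.775
```

Values of r near 1 or -1 indicate a strong straight-line pattern; values near 0 indicate little or no linear association.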
143
is the probability of getting paired sample data with a linear correlation coefficient r that is at least as extreme as the one obtained from the paired sample data.
p-value
144
Only a small P-value, such as 0.05 or less (or a 5% chance or less), suggests that the sample results are not likely to occur by chance when there is no linear correlation, so a small P-value supports a conclusion that there is a linear correlation between the two variables.
:)
145
regression line
y=a+bx
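In the card's notation, a is the y-intercept and b is the slope. A minimal least-squares sketch follows, assuming the usual formulas b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b·x̄; the function name and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return a, b

xs = [1, 2, 3, 4, 5]   # hypothetical paired data
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
print(f"y = {a:.1f} + {b:.1f}x")  # y = 2.2 + 0.6x
```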
146
When testing a new treatment, what is the difference between statistical significance and practical significance? Can a treatment have statistical significance, but not practical significance?
Statistical significance is achieved when the result is very unlikely to occur by chance. Practical significance is related to whether common sense suggests that the treatment makes enough of a difference to justify its use. It is possible for a treatment to have statistical significance, but not practical significance.
147
A certain medical organization tends to oppose the use of meat and dairy products in our diets, and that organization has received hundreds of thousands of dollars in funding from an animal rights foundation.
There does appear to be a potential to create a bias. There is an incentive to produce results that are in line with the organization's creed and that of its funders.
148
Determine whether the source given below has the potential to create a bias in a statistical study. Washington University obtained word counts from the most popular novels of the past five years.
There does not appear to be a potential to create a bias. The organization would not gain from putting a spin on the results.
149
In a survey of 745 subjects, each was asked how often he or she played sports. The survey subjects were internet users who responded to a question that was posted on a news website.
It is flawed because it is a voluntary response sample.
150
Determine whether the results below appear to have statistical significance, and also determine whether the results have practical significance. In a study of a weight loss program, 44 subjects lost an average of 43 lbs. It is found that there is about a 21% chance of getting such results with a diet that has no effect. Does the weight loss program have statistical significance? Does the weight loss program have practical significance?
No, the program is not statistically significant because the results are likely to occur by chance. Yes, the program is practically significant because the amount of lost weight is large enough to be considered practically significant.
151
Which of the following is typically the least important factor to consider when conducting a statistical analysis of data?
formula calculation
152
What does it mean for the findings of a statistical analysis of data to be statistically significant?
The likelihood of getting these results by chance is very small.
153
In a study of all 4736 students at a college, it is found that 45% own a computer.
Parameter because the value is a numerical measurement describing a characteristic of a population.
154
State whether the data described below are discrete or continuous, and explain why. The populations of cities
The data are discrete because the data can only take on specific values.
155
Determine whether the given value is a statistic or a parameter. A homeowner measured the voltage supplied to his home on all 30 days of a given month, and the average (mean) value is 115.6 volts.
The given value is a parameter for the month because the data collected represent a population.
156
A particular country has 55 total states. If the areas of 35 states are added and the sum is divided by 35, the result is 184,870 square kilometers. Determine whether this result is a statistic or a parameter.
The result is a statistic because it describes some characteristic of a sample.
157
In this section we use r to denote the value of the linear correlation coefficient. Why do we refer to this correlation coefficient as being linear?
The term linear refers to a straight line, and r measures how well a scatterplot fits a straight-line pattern.
158
What is a scatterplot and how does it help us?
A scatterplot is a graph of paired (x, y) quantitative data. It provides a visual image of the data plotted as points, which helps show any patterns in the data.