Exam 1 - Sept 20 Flashcards

(82 cards)

1
Q

What are the four steps of the experimental process?

A

Formulate Theory → Collect Data → Summarize Results → Interpret Results and Make Decisions

2
Q

Variable

A

An observed category (label) or quantity (number) in an experiment that may “vary” for different individuals

3
Q

Categorical variable

A

Individuals are classified into groups or categories

4
Q

Quantitative variable

A

A numerical quantity

5
Q

Explanatory variable

A

Variable that is thought to affect (“explain”) another variable

6
Q

Response Variable

A

Variable that is thought to be affected by (“respond to”) the explanatory variable(s)

7
Q

Inference

A

A conclusion that patterns from data can be extended to some broader context

8
Q

Statistical Inference

A

Justified by a probability model linking the data to the broader context; Incorporates measure of uncertainty

9
Q

Causal Inference

A

Enables us to establish a cause and effect relationship

10
Q

Population Inference

A

About population characteristics, Expand results from study to larger population

11
Q

Describe the probability model of randomization. What kind of inferences can be made when it is used?

A

Assigning experimental units (subjects) to treatment groups using a chance mechanism
Causal inference
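
A minimal sketch of randomization as described above, using Python's standard library. The function name `randomize`, the group labels, and the subject IDs are illustrative assumptions, not part of the course material.

```python
import random

def randomize(units, groups=("treatment", "control"), seed=None):
    """Assign units to groups by chance, keeping group sizes as equal as possible."""
    rng = random.Random(seed)      # the seed only makes the example reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)          # the chance mechanism
    assignment = {group: [] for group in groups}
    for i, unit in enumerate(shuffled):
        # Deal the shuffled units out to the groups in turn.
        assignment[groups[i % len(groups)]].append(unit)
    return assignment

# Hypothetical subjects, for illustration only.
subjects = [f"subject_{i}" for i in range(1, 11)]
result = randomize(subjects, seed=1)
print(result["treatment"])
print(result["control"])
```

Because chance alone decides who lands in which group, the groups should differ only by the treatment on average.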

12
Q

Describe the probability model of random sampling. What kind of inferences can be made when it is used?

A

Selecting experimental units (subjects) to be in a sample using a chance mechanism
Population inference

13
Q

Anecdotal Evidence

A

A short story or example of an interesting event that could lead to scientific investigation, but does not establish a scientific theory

14
Q

Observational Study

A

A study in which the group status (e.g., gender) is beyond the control of the researcher; results may be due to confounding variables

15
Q

Randomized Experiments

A

An experiment in which subjects are assigned to groups by randomization, which balances confounding variables across groups on average

16
Q

Main Lesson for Causal Inferences

A

causal inferences can be made from randomized experiments, but not observational studies

17
Q

Confounding Variables

A

variables that are related to both the group membership and the outcome

18
Q

Main Lesson for Population Inferences

A

population inferences can only be made from samples which utilize random sampling

19
Q

Population

A

A well-defined collection of objects that we are interested in drawing conclusions about

20
Q

Sample

A

A subset of objects from the population

21
Q

Describe the two types of random sampling

A

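Simple Random Sample (SRS) → Every individual (and every possible sample of the same size) has an equal chance of being selected

Stratified Random Sample → The population is divided into groups (strata) and individuals are randomly selected within each group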
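
The two sampling schemes above can be sketched as follows. This is only an illustration: the function names, the numbered population, and the two strata labels are assumptions made up for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n individuals so that every individual is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)

def stratified_random_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of the same size within each stratum.

    strata: dict mapping a stratum name to the list of its members.
    """
    rng = random.Random(seed)
    return {name: rng.sample(list(members), n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population of 100 numbered individuals split into two strata.
population = list(range(1, 101))
strata = {"group_a": list(range(1, 51)), "group_b": list(range(51, 101))}

srs = simple_random_sample(population, 10, seed=0)
stratified = stratified_random_sample(strata, 5, seed=0)
print(srs)
print(stratified)
```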

22
Q

Self-selection

A

sampling using volunteers

23
Q

Convenience sampling

A

Sampling whoever is easiest to reach; more common than random sampling but allows for a higher probability of bias

24
Q

Control Groups

A

Gives a baseline for comparison with test groups

25
Placebo Effect
Individuals may respond favorably even when given a treatment that is known to be ineffective | The opposite is the nocebo effect
26
Blinding
The treatment assignment is kept secret from the experimental subject
27
Double Blinding
The treatment assignment is kept secret from both the experimental subject and the individuals measuring the response
28
Sampling Error
Discrepancy between a sample statistic and the corresponding population parameter
29
Nonresponse bias
Not everyone who is asked to participate agrees to do so, and nonresponders differ from responders
30
What are some ways to display categorical variables in graphic form?
Bar plots and pie charts
31
Give a general description of a histogram
The range of observations is divided into subintervals (usually of equal size) | The frequency of observations in each subinterval is plotted as the height of a bar (y-axis)
32
What three aspects of the data are shown by histograms?
Center, Outliers, and General Shape
33
What would data look like that is symmetric or left/right skewed?
Symmetric or skewed describes the shape of the distribution | Symmetric → Both halves are a reflection of each other | Skewed (left or right) → One side has a tail (the named side), the other side has the bulk of the data
34
Unimodal/Multimodal
number of peaks in the distribution
35
What is a quartile?
The 25th and 75th percentiles are the first (Q1) and third quartiles (Q3)
36
How do you make a box plot?
The median of the observations is denoted by a thick line | A box is drawn from Q1 to Q3 | Whiskers extend to the largest and smallest observations that are not outliers | Outliers are shown as stars
37
What is a five-number summary?
The set of five numbers → minimum, Q1, median, Q3, maximum
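A short sketch of computing the minimum, Q1, median, Q3 and maximum listed above. Quartile conventions differ between textbooks and software; this version assumes Python's `statistics.quantiles` with `method="inclusive"`, which may not match the course's hand method on every data set. The function name and data are illustrative.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(five_number_summary(data))  # (1, 3.0, 5.0, 7.0, 9)
```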
38
Observations
The categorical or quantitative measurements made (data)
39
Frequency
A count of observations that fall into a certain category
40
Statistic (general)
A numerical measure calculated from the observations; sample characteristic
41
(2) measures of center
mean or median
42
(3) measures of spread
variance, standard deviation, IQR
43
What is the symbol for mean? What is its strength/weakness?
y with a horizontal line over it | Strength → efficient in using all data | Weakness → not resistant to outliers
44
What is the symbol for median? What is its strength/weakness?
M - population median | m (italics) - sample median | Strength → resistant to outliers | Weakness → does not use all of the data
45
Percentile
The pth percentile of the observations is the observation value such that p% of the observations are smaller than it
46
IQR or Interquartile Range
Q3 - Q1 | Measures dispersion
47
What is the symbol for variance?
σ^2 - population variance | s^2 (italics) - sample variance
48
Standard Deviation (formula, will not need to calculate) Why is SD better than Variance?
s = √[ Σ(y - y-bar)^2 / (n-1) ] → the square root of 1/(n-1) times the sum of the squared differences between each value and the mean (roughly the typical distance of each value from the mean) | SD is in the same units as the data; variance is in squared units
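The formula above, written out step by step as a sketch. The function names and the small data set are made up for illustration; `statistics.stdev` from the standard library is used only as a cross-check.

```python
import math
import statistics

def sample_variance(values):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)

def sample_sd(values):
    """Square root of the sample variance, so it is in the units of the data."""
    return math.sqrt(sample_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5, squared differences sum to 32
print(sample_variance(data))      # 32 / 7
print(sample_sd(data))
print(statistics.stdev(data))     # library value for comparison
```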
49
What is the symbol for standard deviation?
σ - population standard deviation | s (italics) - sample standard deviation
50
How is an 'outlier' defined?
An observation is considered an outlier if it is smaller than Q1 - 1.5(IQR) or larger than Q3 + 1.5(IQR)
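The 1.5(IQR) rule above can be sketched like this. The quartiles come from `statistics.quantiles` with `method="inclusive"`, which is an assumption; a different quartile convention can move the cutoffs slightly. Function names and data are illustrative.

```python
import statistics

def outlier_fences(values):
    """Return the cutoffs (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values):
    """List the observations that fall outside the cutoffs."""
    low, high = outlier_fences(values)
    return [y for y in values if y < low or y > high]

data = [1, 2, 3, 4, 5, 6, 7, 8, 30]   # Q1 = 3, Q3 = 7, IQR = 4
print(outlier_fences(data))           # (-3.0, 13.0)
print(find_outliers(data))            # [30]
```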
51
Parameter
population characteristic
52
(Box-plots) What is the meaning of long-tailed or short-tailed?
Long-Tailed → Spike in data | Short-Tailed → Data evenly spread
53
What are the proper graphs (2) to show the relationship between two categorical variables?
Frequency or Relative Frequency Table → Row percentages displayed; each cell is the count for that cell divided by the row total | Stacked Relative Frequency Bar Chart → Percent within levels of ____
54
What are the proper graphs (2) to show the relationship between a quantitative and a categorical variable?
Side by Side Box Plots | Side by Side Dotplots
55
What is the proper graph to show the relationship between two quantitative variables?
Scatterplot, with the explanatory variable on the x-axis and the response on the y-axis
56
What is the standard notation for a normal distribution?
Y ~ N(μ, σ) | μ is the mean, σ is the SD
57
How can the mean and the SD affect the appearance of a graph of normal distribution?
Mean (μ) → Determines the center | SD (σ) → Determines the spread or height/width
58
What does it mean to standardize a data point with respect to the normal curve?
Rescaling each normally distributed variable to make them equivalent with respect to the area under the curve
59
What is the equation to standardize a data point with respect to the normal curve?
Z = (Y - μ)/σ → Subtract the mean and divide by the standard deviation to yield the # of SDs from the mean (Z)
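As a quick sketch of the standardization step (the function name and the numbers are illustrative assumptions):

```python
def z_score(y, mu, sigma):
    """Number of standard deviations that y lies from the mean."""
    return (y - mu) / sigma

# A value of 130 from a normal distribution with mean 100 and SD 15
# lies 2 standard deviations above the mean.
print(z_score(130, 100, 15))  # 2.0
```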
60
Using a normal distribution table, how can you convert from a data point to the proportion of data above or below that point?
Convert to Z value | The exact Z is the value on the leftmost column plus the value on the topmost row | → Area/Proportion below Z = table value | → Area/Proportion above Z = 1 - (table value)
61
Using a normal distribution table, how can you convert two data points to the proportion of data between those points?
Convert both data points to Z values (ZA < ZB) | → Area/Proportion between ZA and ZB = table value B - table value A
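The table lookups for areas below, above, and between data points can be mimicked in code. This sketch assumes Python 3.8+, where `statistics.NormalDist().cdf(z)` plays the role of the table value; the function names and numbers are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1

def area_below(y, mu, sigma):
    """Proportion of a N(mu, sigma) distribution below y."""
    z = (y - mu) / sigma
    return STANDARD_NORMAL.cdf(z)          # the "table value"

def area_above(y, mu, sigma):
    """Proportion above y: 1 - (table value)."""
    return 1 - area_below(y, mu, sigma)

def area_between(a, b, mu, sigma):
    """Proportion between a and b (a < b): table value B - table value A."""
    return area_below(b, mu, sigma) - area_below(a, mu, sigma)

print(round(area_below(100, 100, 15), 4))         # 0.5
print(round(area_above(130, 100, 15), 4))         # about 0.0228
print(round(area_between(85, 115, 100, 15), 4))   # about 0.6827
```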
62
Using a normal distribution table, how can you convert a percentile to the corresponding cutoff point?
Find the proportion in the body of the table, then read off the corresponding Z-value | Convert the Z-value back to Y by reversing the standardization equation: Y = μ + Zσ
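Going from a percentile back to a cutoff can be sketched the same way, with `NormalDist().inv_cdf` standing in for the reverse table lookup (Python 3.8+ assumed; the function name and numbers are illustrative).

```python
from statistics import NormalDist

def cutoff_for_percentile(p, mu, sigma):
    """Value of Y below which p percent of a N(mu, sigma) distribution falls."""
    z = NormalDist().inv_cdf(p / 100)   # reverse table lookup: proportion -> Z
    return mu + z * sigma               # undo the standardization

# 97.5th percentile of N(100, 15): Z is about 1.96, so Y is about 129.4
print(round(cutoff_for_percentile(97.5, 100, 15), 1))
```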
63
What are the four ways to assess the normality of data?
Histogram, Normal Curve, Probability Tables, Normality Tests
64
How do you assess normality using a histogram?
Plot the data into a histogram and superimpose a normal curve
65
How do you assess normality using a normal curve?
Compare data with the 68-95-99.7 rule
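One way to make this comparison concrete is to count how much of the data falls within 1, 2, and 3 SDs of the mean. This is a rough sketch; the function name is hypothetical and the "ideal" data set is generated from evenly spaced normal quantiles purely for illustration.

```python
import statistics
from statistics import NormalDist

def empirical_rule_check(values):
    """Map k -> (observed % within k SDs of the mean, expected %)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    n = len(values)
    report = {}
    for k, expected in ((1, 68.0), (2, 95.0), (3, 99.7)):
        within = sum(1 for y in values if abs(y - mean) <= k * sd)
        report[k] = (100 * within / n, expected)
    return report

# Idealized normal data: 1000 evenly spaced quantiles of N(100, 15).
ideal = [NormalDist(100, 15).inv_cdf((i + 0.5) / 1000) for i in range(1000)]
for k, (observed, expected) in empirical_rule_check(ideal).items():
    print(f"within {k} SD: observed {observed:.1f}%, expected {expected}%")
```

Observed percentages far from 68, 95, and 99.7 suggest the data are not well described by a normal curve.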
66
How do you assess normality using probability tables?
Comparison of observed versus expected left tail percentages
67
How do you assess normality using the Shapiro-Wilk test?
Yields a p-value; a p-value above .1 gives no evidence of non-normality
68
Sampling Variability
Variability among random samples from the same population
69
Sampling Distribution
A probability distribution that characterizes some aspect of sampling variability
70
Cutoff for CLT
A sample size over 30 allows for the use of the CLT (Central Limit Theorem)
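A small simulation can illustrate sampling variability and the CLT together. This is a sketch under assumed settings: the population is exponential with mean 1 and SD 1 (clearly skewed), the sample size is 30, and 2000 samples are drawn; none of these choices come from the course.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` random samples of size n and return their sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]  # skewed population
        means.append(statistics.mean(sample))
    return means

means = simulate_sample_means(n=30, reps=2000)
print(statistics.mean(means))    # close to the population mean, 1
print(statistics.stdev(means))   # close to sigma / sqrt(n)
print(1 / math.sqrt(30))         # sigma / sqrt(n) with sigma = 1
```

A histogram of `means` would look roughly normal even though the population itself is right skewed.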
71
Standard Error (defn and formula)
The uncertainty in the sample mean due to sampling variability; equal to the SD of X-bar → σ (or s) over √n
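A minimal sketch of the formula using the sample standard deviation s (function name and data are illustrative):

```python
import math
import statistics

def standard_error(values):
    """Estimated SD of the sample mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # s^2 = 32/7, n = 8
print(standard_error(data))       # sqrt(4/7), about 0.756
```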
72
Bias
Estimates are systematically off-center (away from the true value); reduced by random sampling
73
Variability
Spread of estimates, reduced by increasing sample size
74
Confidence Level
The percentage of samples that will produce confidence intervals containing μ
75
Margin of Error (MOE)
Half the width of the confidence interval, equal to t(alpha/2, n-1) * s/√n
76
Critical Value
Z(alpha/2) → The z-value with upper-tail probability α/2 under the standard normal curve | Marks the cutoffs for the confidence interval; can be converted to Y to find the endpoints of the confidence interval
77
What is the notation for a normal curve created for a sample mean (SD known)?
X-bar ~ Normal(μ, σ/√n)
78
How do you find the confidence interval for a population mean calculated from sample means when SD is known?
Confidence level 100(1-α)% → critical value Z(alpha/2) (+Z gives the upper bound, -Z the lower bound) | X-bar +/- Z(alpha/2) * standard error = confidence interval, where standard error = σ/√n (standard deviation over the square root of the sample size)
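A sketch of the known-SD interval, taking the standard error as σ/√n. The critical value comes from `statistics.NormalDist().inv_cdf` (Python 3.8+) instead of a table; the function name and the numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the population mean when sigma is known."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for 95%
    margin = z_crit * sigma / math.sqrt(n)         # critical value * standard error
    return xbar - margin, xbar + margin

low, high = z_confidence_interval(xbar=100, sigma=15, n=36)
print(round(low, 2), round(high, 2))   # about 95.1 and 104.9
```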
79
How do you find the confidence interval for a population mean calculated from sample means using only estimated components?
X-bar +/- t(alpha/2, n-1) * s/√n | X-bar is the sample mean, s is the sample standard deviation, n is the sample size | t(alpha/2, n-1) is the critical value of Student’s t-distribution with n-1 degrees of freedom for tail probability α/2
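A sketch of the t-based interval. Python's standard library has no t-distribution, so this version assumes the critical value is looked up in a t table and passed in by hand; the function name and data are illustrative.

```python
import math
import statistics

def t_confidence_interval(values, t_crit):
    """X-bar +/- t_crit * s / sqrt(n), with t_crit taken from a t table."""
    n = len(values)
    xbar = statistics.mean(values)
    s = statistics.stdev(values)
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

data = [2, 4, 4, 4, 5, 5, 7, 9]   # n = 8, so 7 degrees of freedom
# Table value for a 95% interval with 7 degrees of freedom: t(0.025, 7) = 2.365
low, high = t_confidence_interval(data, t_crit=2.365)
print(round(low, 2), round(high, 2))   # about 3.21 and 6.79
```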
80
How do you calculate required sample size for a 95% confidence interval using sample standard deviation and desired margin of error?
Margin of error depends on α and n | If α is .05 then t(.025, n-1) ≈ 2, so the sample size (n) is equal to (2s/MOE) squared | Plug in the desired MOE and sample s to get the recommended n, then round up | Or solve t(alpha/2, n-1) * s/√n = MOE for n with an estimated t-value (same thing, different equation)
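The (2s/MOE) squared shortcut can be sketched as follows, using the approximation that the t critical value is about 2 for a 95% interval. The function name and the example numbers are assumptions for illustration.

```python
import math

def required_sample_size(s, moe, t_approx=2.0):
    """Recommended n for a 95% interval: (t * s / MOE)^2, rounded up."""
    return math.ceil((t_approx * s / moe) ** 2)

print(required_sample_size(s=15, moe=3))   # (2 * 15 / 3)^2 = 100
print(required_sample_size(s=10, moe=3))   # 44.4..., rounded up to 45
```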
81
What are the assumptions when creating a one-sample confidence interval?
Data must be regarded as a random sample from a large population | Observations must be independent of each other | If n is small, the population distribution must be approximately normal
82
What measure of spread is resistant to outliers?
IQR