Assignment #1 Flashcards

(121 cards)

1
Q

What are the 7 steps a researcher follows when investigating a statistical association?

A
  1. Define key terms.
  2. Define the population it may apply to.
  3. Identify a sample within which to investigate.
  4. Measure variables for each case in the sample.
  5. Apply statistical techniques to collected data.
  6. Interpret results of statistical techniques.
  7. Attempt to explain the association between the variables.
2
Q

Things that have certain characteristics or properties.

A

Cases

3
Q

Commonality between all cases in a data set

A

Unit of analysis

4
Q

Examples of Unit of Analysis

A

-Ex: all cases are individual people, business firms, hospitals, countries, etc.

5
Q

A characteristic or property of the case, taking on different values for different cases.

A

Variable

6
Q

Which two descriptors are given to variables?

A
  • Variable name
  • Variable label

7
Q

Short description of a variable (usually 1 word)

A

Variable Name

8
Q

Examples of variable names

A

-Ex: Gender, SRSC, hamburgers, etc.

9
Q

A slightly longer description of a variable

A

Variable Label

10
Q

Examples of a Variable Label

A

-Ex: Survey Respondent’s Gender, Survey Respondent’s Self-Rated Social Class, Number of Hamburgers eaten in past 12 months

11
Q

The possible outcomes of a variable

A

Values

12
Q

What are the two options of what a variable’s value might be?

A

named category or number

13
Q

A number that corresponds to a named category value for a variable

A

Value Label

14
Q

Two components of a good variable

A
  • Exhaustive
  • Mutually exclusive

15
Q

When every case can attain a value for the variable

A

Exhaustive

16
Q

When no case can have more than one value of a variable.

A

Mutually Exclusive

17
Q

4 Levels of Measurement for variables.

A
  1. Nominal
  2. Ordinal
  3. Interval
  4. Ratio
18
Q

A variable with unordered named categories

A

Nominal variable

19
Q

A variable with ordered categories and undefined distances between values

A

Ordinal Variable

20
Q

Example of a nominal variable

A
Marital status:
1 = married
2 = divorced
3 = widowed
4 = never married
21
Q

Example of ordinal variable

A
Self-assessed social class
1 = lower
2 = middle
3 = upper
22
Q

A quantitative variable with defined distances between values and an arbitrary zero

A

Interval Variable

23
Q

Example of an interval variable

A

Temperature in Celsius (0 doesn’t mean “no temperature”)

24
Q

A variable with defined distances between values and a non-arbitrary zero

A

Ratio variable

25
Example of ratio variable
Age in years
26
What are the two characteristics of quantitative variables?
Discrete or Continuous
27
A quantitative variable measured in whole numbers (how many?)
Discrete
28
A quantitative variable measured in infinitely divisible units (how much?)
Continuous
29
What type of variable are nominal/ordinal variables?
Categoric/Categorical/Qualitative
30
What type of variable are interval/ratio variables?
Quantitative/Numerical
31
Quantity in which measurements are reported
Unit of Measurement
32
Examples of a unit of measurement
- Metres for the variable height - Degrees for the variable temperature in Celsius - Canadian dollars (CAD) for the variable monetary value
33
Variables with only two possible values.
Dichotomous, binary, dummy
34
Example of binary variables
-Employment status (0 = employed or 1 = not employed)
35
What type of multi-value variable can easily be made into a binary variable
Categorical variables
36
What does converting a quantitative variable into a categorical one do to the information presented?
It reduces the information (detail is lost).
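A minimal sketch of the recoding described in cards 35 and 36. The marital-status codes follow card 20; the ages, the cut points and the function names are hypothetical, chosen only for illustration.

```python
# Hypothetical data: marital-status codes follow card 20; ages are made up.
marital = [1, 2, 4, 1, 3]   # 1 = married, 2 = divorced, 3 = widowed, 4 = never married
ages = [19, 34, 35, 52, 71]

def to_binary(values, target):
    """Recode a categorical variable as 1 when a case has the target value, else 0."""
    return [1 if v == target else 0 for v in values]

def to_age_group(age):
    """Collapse a quantitative age into a category (cut points are arbitrary)."""
    if age < 35:
        return "young"
    if age < 65:
        return "middle"
    return "older"

married = to_binary(marital, target=1)
age_groups = [to_age_group(a) for a in ages]

print(married)      # [1, 0, 0, 1, 0]
print(age_groups)   # ages 19 and 34 become indistinguishable: detail is lost
```

The exact ages cannot be recovered from the groups, which is the sense in which the categorical version carries less information.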
37
Causing variable
Independent or Explanatory Variable
38
Caused variable
Dependent or Response Variable
39
Prior event, condition or state of affairs without which the event in question would not have occurred.
Cause
40
Causes that raise the probabilities of their effects, all else being equal.
Probabilistic.
41
When do we have reason to believe that X causes Y (4)
  1. There is an association between X and Y.
  2. X precedes Y in time.
  3. We have eliminated spurious causal linkages.
  4. We have a plausible explanatory rationale for the causal relationship.
42
What are the 3 types of statistical research designs?
1. Experimental design. 2. Cross-sectional design. 3. Longitudinal design.
43
Research design where half of the participants are subjected to X and half serve as a control group.
Experimental design
44
Research design where you measure both X and Y in participants at same time.
Cross-sectional design.
45
Measure X and Y in a group of study participants at multiple points in time.
Longitudinal design
46
Multivariate causality scenarios (5)
  1. Indirect or chain causal relationships
  2. Spurious associations
  3. Multiple causality
  4. Statistical interactions
  5. Suppressed relationships
47
X1 causally influences X2 which then causally influences Y (X1 → X2 → Y).
Indirect or chain causal relationships
48
What is X2 in an indirect or chain causal relationship?
Mediating or intervening variable
49
What is X1 in indirect or chain causal relationships.
Antecedent variable.
50
A third variable X2 causally influences both X1 and Y, such that an empirical association exists between X1 and Y but the association is not causal.
Spurious associations
51
X1, X2, X3, etc. have distinct effects on Y.
Multiple causality
52
X1 influences Y differently for different values of X2.
Statistical (conditional/moderating) interactions
53
X1 influences Y through distinct processes that cancel each other out (to a degree).
Suppressed relationships
54
The entire group of cases that we want information about
Population
55
A part of the population that we actually examine in order to gather information.
Sample
56
A numerical summary of the population.
Parameter
57
A numerical summary of the sample data.
Statistic
58
4 examples of parameters
- population size - mean of X in population - standard deviation of X in population - proportion in population
59
Standard notation for population size
N
60
Standard notation for sample size
n
61
standard notation for mean of X in population
μ (mu)
62
standard notation for mean of X in sample
x̄ (x-bar)
63
standard notation for standard deviation of X in population
σ (sigma)
64
standard notation for standard deviation of X in sample
s
65
standard notation for proportion in population
p
67
standard notation for proportion in sample
p̂ (p-hat)
68
What summarizes the information in a collection of data?
Descriptive statistical methods
69
what provides predictions about characteristics of a population based on information in a sample from that population. Inferential statistics assume that the sample is a probabilistic sample.
Inferential statistical methods
70
a sample chosen by chance.
A probabilistic sample (or random sample)
71
probabilistic sampling design that gives each member of the population an equal chance to be selected
proportionate sampling designs
72
probabilistic sampling design that assigns different probabilities to different subsets of the population
disproportionate sampling designs
73
Sample of people who choose themselves by responding to a general appeal
Voluntary response sample (or convenience sample)
74
Why are voluntary response samples biased?
People with strong opinions, especially negative opinions, are most likely to respond.
75
Can we apply statistical inference to convenience samples
No
76
n cases from the population chosen in such a way that each case in the population has an equal chance of ending up in the sample.
A simple random sample (or SRS)
77
create a list of all N members of the population. Then calculate k = N/n (rounded up). We will select every kth member of this list, but need a random starting point. Use a random numbers table to randomly select a number c between 1 and k. Select the cth case in the list, case c + k, case c + 2k, and so forth.
systematic random sample
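The procedure in card 77 can be sketched as follows. This is only an illustration: the population of 100 case IDs is hypothetical, and Python's `random` module stands in for the random numbers table.

```python
import math
import random

def systematic_sample(population, n, rng=random):
    """Select every k-th case after a random start c, with k = N/n rounded up."""
    N = len(population)
    k = math.ceil(N / n)
    c = rng.randint(1, k)   # random start between 1 and k, inclusive
    # Positions c, c + k, c + 2k, ... are 1-based, so subtract 1 to index the list.
    return [population[i - 1] for i in range(c, N + 1, k)]

cases = list(range(1, 101))   # hypothetical population of N = 100 case IDs
sample = systematic_sample(cases, n=10, rng=random.Random(0))
print(sample)
```

Because k is rounded up, the sample can come out slightly smaller than n when N is not a multiple of n.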
78
divide the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form a full sample.
a stratified random sample
79
When the proportion of the sample drawn from a given stratum matches the stratum’s proportion of the population, the stratified sample is
proportionate
80
When the proportions of the sample do not match the population it is called a
disproportionate stratified random sample.
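A rough sketch of a proportionate stratified random sample (cards 78 and 79), assuming two hypothetical strata and a 10% sampling fraction. The function name and the data are made up for illustration.

```python
import random

def proportionate_stratified_sample(strata, fraction, rng=random):
    """Draw a separate SRS of the same fraction from each stratum, then combine them."""
    sample = []
    for members in strata.values():
        n_stratum = round(len(members) * fraction)
        sample.extend(rng.sample(members, n_stratum))
    return sample

# Hypothetical population: 60 urban cases and 40 rural cases.
strata = {
    "urban": [f"u{i}" for i in range(60)],
    "rural": [f"r{i}" for i in range(40)],
}
sample = proportionate_stratified_sample(strata, fraction=0.1, rng=random.Random(1))
print(len(sample))   # 10: six urban cases and four rural cases
```

Using a different fraction for each stratum would turn this into a disproportionate stratified sample.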
81
The population is first divided into clusters. A simple random sample of clusters is selected, and all cases within each of the selected clusters are then sampled.
cluster random sampling
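For comparison with stratified sampling, here is a small sketch of cluster random sampling: whole groups are drawn at random and every case inside a chosen group is kept. The schools and case IDs are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng=random):
    """Pick an SRS of whole groups, then keep every case in the chosen groups."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [case for name in chosen for case in clusters[name]]

# Hypothetical population grouped by school.
schools = {
    "school_a": ["a1", "a2", "a3"],
    "school_b": ["b1", "b2", "b3"],
    "school_c": ["c1", "c2", "c3"],
    "school_d": ["d1", "d2", "d3"],
}
sample = cluster_sample(schools, n_clusters=2, rng=random.Random(2))
print(sample)
```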
82
Entails going through several rounds of sampling to produce the sample.
multistage random sample
83
Provides numerical summaries of the distribution of observations
frequency table
84
Provide visual summaries of the distribution of observations (2)
pie charts and bar charts
85
a listing of possible values for a variable along with a tabulation of the number of observations for each value.
Frequency table
86
In a frequency table, the number of observations per value
Frequency count
87
in a frequency table, the percentage of the total number of valid observations per value
percentages
88
In a frequency table, the running total of the percentages as the values of the variable increase
Cumulative percentages
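Cards 85 to 88 can be tied together in a short sketch that builds a frequency table with counts, percentages and cumulative percentages. The observations are hypothetical and use the social class codes from card 21.

```python
from collections import Counter

def frequency_table(observations):
    """Return rows of (value, frequency count, percentage, cumulative percentage)."""
    counts = Counter(observations)
    total = len(observations)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percent = 100 * counts[value] / total
        cumulative += percent
        rows.append((value, counts[value], percent, cumulative))
    return rows

# Hypothetical responses: 1 = lower, 2 = middle, 3 = upper.
social_class = [1, 2, 2, 3, 2, 1, 2, 3, 2, 2]
for row in frequency_table(social_class):
    print(row)
# (1, 2, 20.0, 20.0)
# (2, 6, 60.0, 80.0)
# (3, 2, 20.0, 100.0)
```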
89
a visual representation of a frequency table that contains less information but has more intuitive appeal. Each value gets a slice of the pie, and the percentage of observations for a value determines the relative size of its slice.
A pie chart
90
visually represents a frequency distribution. Each value gets a bar (with spaces between the bars), and the frequency or percentage of responses for a value determines the height of the bar.
A bar chart (or bar plot)
91
What are the three aspects of the distribution of observations for the variable:
1. Central tendency 2. Dispersion 3. Shape
92
These are different ways of describing a ‘typical’ value of a quantitative variable in a set of cases.
Central tendency
93
3 indicators of central tendency
mean, median and mode.
94
The sum of the observations divided by the number of observations.
Mean
95
is the mean resistant to outliers
no
96
the measurement that falls in the middle of the ordered set of observations, such that half the observations are larger and half are smaller.
median
97
what is the median if the sample size is even
Two middle measurements occur, and the median is the midpoint between (the mean of) the two.
98
Is the median resistant to outliers
yes
99
the value that occurs most frequently.
mode
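A quick illustration of cards 94 to 99 using Python's standard `statistics` module. The incomes are hypothetical (in thousands); the point is how one extreme value moves the mean far more than the median.

```python
from statistics import mean, median, mode

incomes = [30, 32, 35, 38, 40]   # hypothetical incomes, in thousands
with_outlier = incomes + [400]   # add one extreme observation

print(mean(incomes), median(incomes))   # 35 35
print(mean(with_outlier))               # about 95.8: the mean is pulled up sharply
print(median(with_outlier))             # 36.5: midpoint of the two middle values, 35 and 38
print(mode([1, 2, 2, 3]))               # 2: the most frequent value
```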
100
How dispersed or spread out the observations for a quantitative variable are.
dispersion
101
3 measures of dispersion
1. Range 2. Interquartile Range 3. Standard deviation
102
The difference between the largest and smallest observations.
Range
103
The difference between the upper and lower quartiles.
Interquartile Range (IQR)
104
Which percentile/quartile is the median
50th percentile, second quartile.
105
Which percentile is the lower quartile
25th percentile
106
Which percentile is the upper quartile
75th percentile
107
How to find quartiles:
  1. Arrange the observations in increasing order and locate the median.
  2. The lower quartile is the median of the observations whose position in the ordered list is to the left of (below) the location of the overall median.
  3. The upper quartile is the median of the observations whose position in the ordered list is to the right of (above) the location of the overall median.
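The three steps above can be sketched in Python. This follows the median-of-each-half rule from the card; other software may use slightly different quartile conventions, so results can differ from library defaults. The function name and data are hypothetical.

```python
from statistics import median

def quartiles(observations):
    """Return (lower quartile, median, upper quartile) using the median-of-halves rule."""
    ordered = sorted(observations)
    if len(ordered) < 2:
        raise ValueError("need at least two observations")
    half = len(ordered) // 2
    lower_half = ordered[:half]    # positions to the left of the overall median
    upper_half = ordered[-half:]   # positions to the right of the overall median
    return median(lower_half), median(ordered), median(upper_half)

q1, q2, q3 = quartiles([13, 1, 9, 5, 3, 11, 7])
print(q1, q2, q3)   # 3 7 11
print(q3 - q1)      # 8, the interquartile range
```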
108
Is IQR resistant to outliers
yes
109
Characteristics of the standard deviation (4)
  1. It essentially represents an ‘average’ or ‘typical’ distance of an observation from the mean.
  2. It is always greater than or equal to zero, and it equals zero only when the observations are all the same.
  3. The greater the variation about the mean, the larger the value of the standard deviation.
  4. It is not resistant to outliers.
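These properties can be checked with a small sketch. The cards do not give a formula, so this assumes the common sample definition that divides the sum of squared deviations by n - 1; the data are hypothetical.

```python
import math

def sample_sd(observations):
    """Sample standard deviation, assuming the n - 1 divisor."""
    n = len(observations)
    m = sum(observations) / n
    return math.sqrt(sum((x - m) ** 2 for x in observations) / (n - 1))

same = [5, 5, 5, 5]
scores = [2, 4, 4, 4, 5, 5, 7, 9]
with_outlier = scores + [100]

print(sample_sd(same))           # 0.0: no variation at all
print(sample_sd(scores))         # larger, because the scores vary about their mean
print(sample_sd(with_outlier))   # much larger: one outlier inflates it
```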
110
tool for looking at shape: divides the range of possible values into intervals of equal width and then presents each interval as a bar with height equal to the number of responses that fall within the interval. There are no spaces between the bars.
Histogram
111
What does it mean if a histogram is skewed to the right?
There’s a skinny tail sticking out towards the right side.
112
What does it mean if a distribution is bell-shaped
It follows (approximately) a normal distribution.
113
What tool to see shape is essentially a histogram that has been smoothed out?
A density plot
114
a visual summary of a set of observations containing the maximum and minimum observations, the lower and upper quartiles, the median and the IQR.
The boxplot (or box-and-whisker plot)
115
What does the central box of a box plot contain
The central 50% of the distribution of values, those from the lower quartile to the upper quartile (the interquartile range).
116
What makes an observation in a box plot an "outlier"
When it falls more than 1.5 IQRs above the upper quartile or more than 1.5 IQRs below the lower quartile.
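The 1.5 × IQR rule can be sketched as below, using the median-of-halves quartiles from card 107. The data and the function name are hypothetical.

```python
from statistics import median

def boxplot_outliers(observations):
    """Return observations more than 1.5 IQRs beyond the lower or upper quartile."""
    ordered = sorted(observations)
    half = len(ordered) // 2
    q1 = median(ordered[:half])    # lower quartile
    q3 = median(ordered[-half:])   # upper quartile
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in ordered if x < low_fence or x > high_fence]

data = [1, 3, 5, 7, 9, 11, 40]
print(boxplot_outliers(data))   # [40]: quartiles are 3 and 11, so the fences are -9 and 23
```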
117
How are outliers represented on box plots
As dots plotted beyond the whiskers (past the min/max ends).
118
How to make a stem plot
  1. Separate each observation into a stem, consisting of all but the final (rightmost) digit, and a leaf, the final digit. Stems may have as many digits as needed, but each leaf contains only a single digit.
  2. Write the stems in a vertical column with the smallest at the top, and draw a vertical line at the right of this column.
  3. Write each leaf in the row to the right of its stem, in increasing order out from the stem.
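A minimal sketch of these steps for non-negative whole numbers. Listing stems that have no leaves is an assumption on my part (a common convention, not stated in the card), and the data are hypothetical.

```python
from collections import defaultdict

def stem_plot(observations):
    """Return stem-and-leaf rows for non-negative integers."""
    leaves = defaultdict(list)
    for x in sorted(observations):
        stem, leaf = divmod(x, 10)   # stem = all but the final digit, leaf = final digit
        leaves[stem].append(leaf)
    rows = []
    # Assumption: stems with no leaves are still listed so gaps stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {row}".rstrip())
    return rows

for line in stem_plot([12, 15, 21, 27, 27, 43, 8]):
    print(line)
#  0 | 8
#  1 | 25
#  2 | 177
#  3 |
#  4 | 3
```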
119
What does adding a constant do to a quantitative variable
Changes the central tendency but not the dispersion or shape
120
What does standardizing a quantitative variable do
changes the central tendency (to mean = 0) and the dispersion (to standard deviation = 1) but not the shape
121
What does square rooting, logging or squaring a quantitative variable do
Typically changes the central tendency, dispersion and shape
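Cards 119 to 121 can be checked numerically. This sketch uses hypothetical data and, as an assumption, standardizes with the population standard deviation (`pstdev`); the same idea holds with the sample version.

```python
import math
from statistics import mean, pstdev

x = [1, 4, 9, 16]   # hypothetical observations

# Adding a constant shifts the mean but leaves the spread unchanged.
shifted = [v + 10 for v in x]
print(mean(x), mean(shifted))       # 7.5 17.5
print(pstdev(x), pstdev(shifted))   # same value twice

# Standardizing gives mean 0 and standard deviation 1 (up to rounding error).
m, s = mean(x), pstdev(x)
z = [(v - m) / s for v in x]
print(mean(z), pstdev(z))

# Square rooting changes the shape: gaps of 3, 5, 7 become equal gaps of 1.
roots = [math.sqrt(v) for v in x]
print(roots)                        # [1.0, 2.0, 3.0, 4.0]
```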