Text Ch.14+15+16 Flashcards

(51 cards)

1
Q

Linking Statistics to Arguments in Political Science Research

Common arguments/claims, Examples, and Some Questions to Ask

A
  1. Descriptive claims
    - E.g., percentages, frequencies, averages
    - Are the sample data representative of the population?
  2. Claims of differences between groups
    - E.g., young versus old; experimental group versus control group
    - How large are the differences?
    - Are the differences due to chance?
  3. Claims of relationships between variables
    - E.g., correlations
    - How strong is the relationship?
    - Is the relationship due to chance?
    - Is it a causal relationship?
2
Q

Descriptive Statistics

A

Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures.
• Measures of central tendency
• Measures of dispersion/variation

3
Q

Central Tendency of each

Nominal

Ordinal

Interval

A

Nominal: Mode

Ordinal: Mode, Median

Interval: Mode, Median, Mean

4
Q

Mode

A
The mode is the value that occurs most frequently.
• Not every sample has a distinct mode. A sample can be bimodal (two modes) or multimodal (three or more modes), or it can have no mode at all.
• The mode is the only measure of central tendency we can use for nominal data.
5
Q

Median

A
The sample median is the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude. If there isn’t one value in the middle, we take the average of the two middle values.
The median is not affected by extreme values.
6
Q

Mean

A

The sample mean is the mathematical average of the data and is the
measure of central tendency we use most often.
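
As a rough illustration of how the mode, median, and mean from the cards above behave on the same data, the sketch below uses Python's standard `statistics` module. The scores are hypothetical and were chosen to include one extreme value, so that the mean is pulled upward while the median and mode stay put.

```python
from statistics import mean, median, multimode

# Hypothetical exam scores (an assumption for illustration); 100 is an extreme value.
scores = [52, 55, 55, 58, 60, 61, 100]

modes = multimode(scores)   # list of the most frequent value(s); may hold several
middle = median(scores)     # middle value once the scores are sorted
average = mean(scores)      # arithmetic average, sensitive to the extreme value

print("mode(s):", modes)
print("median:", middle)
print("mean:", average)
```

Here the mean lands above both the median and the mode because of the single extreme score, which is the pattern the Median card describes.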

7
Q

Standard Deviation

A
• Deviation: how far an individual score is from the mean
• Standard deviation: on average, how far scores are from the mean
• Properties:
  • Never negative (zero only when every score is identical).
  • Always in the same units as the observations in the sample.
  • Affected by outliers (sensitive to extreme scores).
  • Affected by sample size.
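
A minimal sketch of the calculation, assuming the sample formula (dividing by n - 1); the card itself does not say which version is intended. The scores and the helper name `sample_sd` are hypothetical. The second call adds one extreme score to show the sensitivity to outliers.

```python
from math import sqrt
from statistics import stdev

def sample_sd(scores):
    """Sample standard deviation: square root of the summed squared deviations over n - 1."""
    m = sum(scores) / len(scores)
    squared_deviations = [(x - m) ** 2 for x in scores]  # each deviation from the mean, squared
    return sqrt(sum(squared_deviations) / (len(scores) - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
with_outlier = scores + [40]        # same data plus one extreme score

sd = sample_sd(scores)
sd_outlier = sample_sd(with_outlier)

print(round(sd, 2), round(sd_outlier, 2))
print(abs(sd - stdev(scores)) < 1e-9)  # cross-check against the standard library
```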
8
Q

Variation Ratio

A

• Only option available for nominal level data; can also be used for ordinal and interval level data
• Reflects number of cases that are NOT in the modal category
• Higher ratio: suggests data are more dispersed → mode may be less representative of the data
• Lower ratio: suggests data are less dispersed → mode may be more representative of the data
• Calculated as:
Variation ratio = 1 - (number of cases in modal category / number of total cases)
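
The calculation can be sketched in a few lines, taking the variation ratio as 1 minus the share of cases in the modal category. The function name and the category labels are hypothetical.

```python
from collections import Counter

def variation_ratio(values):
    """Share of cases that are NOT in the modal category."""
    counts = Counter(values)
    modal_count = max(counts.values())  # number of cases in the most common category
    return 1 - modal_count / len(values)

# Hypothetical nominal data: 6 cases in category A, 3 in B, 1 in C.
party = ["A"] * 6 + ["B"] * 3 + ["C"]

print(variation_ratio(party))        # 4 of 10 cases fall outside the mode
print(variation_ratio(["A"] * 10))   # every case in the mode, so no dispersion
```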

9
Q

Range

A

Range
• Can be used for ordinal and interval level data
• Represents the difference between the highest and lowest scores (two extreme values)

Hypothetical example A:
Highest grade: 94%
Lowest grade: 10%
Range = 94 - 10 = 84

Hypothetical example B:
Highest grade: 70%
Lowest grade: 50%
Range = 70 - 50 = 20

10
Q

Null and Alternative Hypothesis

A
Null hypothesis (H0)
• Relationship observed in data is due to chance
• “Nothing going on” 

Alternative hypothesis (Ha)
• Relationship observed in data is not due to chance
• “Something going on”
• Burden of proof is on Ha: must gather evidence
• Reject/fail to reject H0

11
Q

Describe type 1 and type 2 errors

A
Type I error: false positive
- The researcher claims a relationship (“reject H0”), but the relationship doesn’t exist
• Lower confidence levels (e.g., 90%, 95%) make it easier to reject the null, thus making Type I errors more likely
Type II error: false negative
- The researcher claims no relationship (“fail to reject H0”), but the relationship exists
• Higher confidence levels (e.g., 99%, 99.5%) make it harder to reject the null, thus making Type II errors more likely
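
The link between confidence level and Type I errors can be sketched with a small simulation. Everything here is an assumption made for illustration: both groups are drawn from the same population (so H0 is true and every rejection is a false positive), the test is a two-sided z-test with a known standard deviation of 1, and the seed, sample sizes, and function names are arbitrary.

```python
import random
from statistics import NormalDist, mean

random.seed(42)  # fixed seed so the sketch is repeatable

def draw(n):
    """One sample of size n from a standard normal population."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]

def z_statistic(a, b, sigma=1.0):
    """Difference in sample means divided by its standard error (sigma assumed known)."""
    standard_error = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    return (mean(a) - mean(b)) / standard_error

# 2000 comparisons of two groups that truly do not differ.
z_values = [abs(z_statistic(draw(30), draw(30))) for _ in range(2000)]

def rejection_rate(confidence_level):
    """Share of comparisons in which H0 is (wrongly) rejected at this confidence level."""
    alpha = 1 - confidence_level
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return sum(z > critical for z in z_values) / len(z_values)

rate_90 = rejection_rate(0.90)
rate_99 = rejection_rate(0.99)
print(rate_90, rate_99)
```

Because the same test statistics are compared against a lower bar at 90% than at 99%, the 90% rate can never be smaller than the 99% rate.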
12
Q

SPSS

A

Statistical software used to calculate statistics and test their significance

- The researcher still chooses the confidence level

13
Q

Measures of Dispersion(3) Vs. Measures of Central Tendency(3)

A

Central tendency: Mode, Median, Mean

Dispersion: Variation Ratio, Range, Standard Deviation

14
Q

Confidence level (1 - alpha level)

A
• Probability that the sample statistic is an accurate estimate of the population parameter, i.e., that the population parameter lies within an estimated range of values (known as the confidence interval)
• E.g., if the sample statistic is 45% and the margin of error is +/- 3%, then the confidence interval is 42% to 48%
- The confidence interval is the range around the sample statistic within which the population parameter is expected to lie at the chosen confidence level
15
Q

higher confidence level=_____ range

lower confidence level=_____ range

A

wider

narrower

Higher confidence level (99%)

  • wider confidence interval
  • more accuracy (more likely to be correct), less precision

Lower confidence level (95%)

  • narrower confidence interval
  • less accuracy (less likely to be correct), more precision
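
The trade-off can be sketched for a sample proportion, assuming the usual normal-approximation interval. The 45% figure echoes the earlier confidence-interval card; the sample size of 1000 and the function name are assumptions made for illustration.

```python
from statistics import NormalDist

def proportion_interval(p, n, confidence_level):
    """Normal-approximation confidence interval for a sample proportion p from n cases."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)  # two-sided critical value
    margin = z * (p * (1 - p) / n) ** 0.5                     # margin of error
    return p - margin, p + margin

low95, high95 = proportion_interval(0.45, 1000, 0.95)
low99, high99 = proportion_interval(0.45, 1000, 0.99)

print("95%:", round(low95, 3), "to", round(high95, 3))
print("99%:", round(low99, 3), "to", round(high99, 3))
```

With the sample held fixed, the 99% interval is wider than the 95% interval: more likely to contain the population parameter, but less precise.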
16
Q

Match Lower and Higher confidence levels with either the type 1 error or type 2

A
Lower confidence levels = Type I error
- Lower confidence levels (e.g., 90%, 95%) make it easier to reject the null, so a false positive (Type I error) is more likely
Higher confidence levels = Type II error
- Higher confidence levels (e.g., 99%, 99.5%) make it harder to reject the null, so a false negative (Type II error) is more likely
17
Q

Confidence level reflects _________

A

sample size

18
Q

Low confidence level and large sample size = type __ error

A

1

19
Q

High confidence level and small sample size= type__ error

A

2

20
Q

Statistical Significance

A
  • Statistical significance is a yes/no question: a finding cannot be ‘more’ or ‘less’ significant
  • The confidence level selected affects how likely a finding is to be statistically significant
21
Q

P-hacking

A

P-hacking: adjusting models until they produce statistically significant findings (e.g., dropping variables to obtain significant results)

22
Q

Harking

A

HARKing: hypothesizing after the results are known

23
Q

p values= ______ values

24
Q

Substantive significance

A
- A relationship or statistic is substantively significant if it is theoretically important: it plays a role in elaborating, modifying, or rejecting your theory

- Findings need to be meaningful

  • Does it have a real impact on theory? It should.
  • If statistics that do not meet the confidence criteria are reported anyway, it is because the researcher thinks the finding is substantively important even though it is not statistically significant
25
True or false: Some substantively significant findings are not statistically significant
True
26
confounding variable
Confounding variable: an extraneous variable that affects both of the correlated variables and makes it seem like there is a relationship between them.
27
Scatterplot (independent and dependent variables, interval-level data): x-axis is _____, y-axis is _____
x-axis: independent variable; y-axis: dependent variable
28
Positive correlation:
IV and DV change in the same direction (e.g., an increase on the IV corresponds to an increase on the DV)
29
Negative correlation:
IV and DV change in opposite directions (e.g., an increase on the IV corresponds to a decrease on the DV)
30
Contingency table (independent and dependent variables, non-interval data): Row: _____ Column: _____
Row: DV; Column: IV
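
A minimal sketch of building such a table by hand, with the IV (age group) in the columns and the DV (a yes/no answer) in the rows. The responses are hypothetical. Percentages are taken within each column so the IV categories can be compared directly.

```python
from collections import Counter

# Hypothetical (IV, DV) pairs: age group and a yes/no answer.
responses = (
    [("young", "yes")] * 6 + [("young", "no")] * 4
    + [("old", "yes")] * 3 + [("old", "no")] * 7
)

iv_values = ["young", "old"]  # columns
dv_values = ["yes", "no"]     # rows
counts = Counter(responses)

# table[dv][iv] = number of cases in that cell
table = {dv: {iv: counts[(iv, dv)] for iv in iv_values} for dv in dv_values}
column_totals = {iv: sum(table[dv][iv] for dv in dv_values) for iv in iv_values}

# Column percentages: each cell as a share of its IV category.
column_pct = {
    dv: {iv: 100 * table[dv][iv] / column_totals[iv] for iv in iv_values}
    for dv in dv_values
}

for dv in dv_values:
    print(dv, [f"{column_pct[dv][iv]:.0f}%" for iv in iv_values])
```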
31
Considerations on Bivariate Relationships(4)
1. Is there a relationship?
2. What is the direction of the relationship? (ordinal and interval level variables only)
3. What is the strength of the relationship?
4. Is the relationship statistically significant? (inferential statistics)
32
Perfect correlation
Perfect Correlation: knowing the value on one variable always lets us know the value on the other
33
Interpreting Measures of Association: Nominal vs. Ordinal and Interval
Nominal level: range 0 to 1
• 0 = no relationship
• 1 = perfect relationship
Ordinal and interval level: range -1 to +1
• -1 = perfect negative relationship
• 0 = no relationship
• +1 = perfect positive relationship
• The sign (-/+) indicates direction, not strength
34
___ or higher is usually a ‘strong’ relationship
0.5 or higher is usually a ‘strong’ relationship
35
Researcher must select the appropriate measure of association based on the lowest level of measurement involved. Nominal: Ordinal: Interval:
  1. Nominal level
    - Pairings: nominal-nominal, nominal-ordinal, nominal-interval
    - Measures: Cramer’s V or Lambda
    - Cramer’s V tends to overestimate strength
    - Lambda can underestimate strength
  2. Ordinal level
    - Pairings: ordinal-ordinal, ordinal-interval
    - Measures: Gamma, Tau-b, or Tau-c
    - Gamma can overestimate strength
    - Tau-b only for square tables (e.g., 3x3)
    - Tau-c only for rectangular tables (e.g., 3x4)
  3. Interval level
    - Pairing: interval-interval
    - Measures: Pearson’s r or Spearman’s rho
    - Pearson’s r used for linear relationships
    - Spearman’s rho used for non-linear relationships
36
Pearson's r used for...
Linear relationships (interval level)
37
Spearman’s rho used for...
Non-linear relationships (interval level)
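
The contrast between the two measures can be sketched on a curved but steadily increasing relationship. The helper functions below are written out by hand for illustration and are not taken from the course material; Spearman's rho is computed here as Pearson's r on the ranks, so it picks up relationships that are monotonic (always rising or always falling) even when they are not straight lines.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r: covariation of x and y scaled by their spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Rank from 1 (lowest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # hypothetical curved, always-increasing relationship

print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Spearman's rho reaches its maximum here because the rank order is identical on both variables, while Pearson's r falls short of 1 because the points do not lie on a straight line.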
38
Gamma can be used for...
Ordinal-level relationships; can overestimate strength
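
One common way to compute Gamma from a contingency table is to count concordant and discordant pairs of cases. The sketch below assumes the rows and columns are already ordered from low to high; the function name and the tables are hypothetical, and pairs tied on either variable are ignored, which is one reason Gamma can run high.

```python
def goodman_kruskal_gamma(table):
    """Gamma = (concordant - discordant) / (concordant + discordant) for an ordered table."""
    concordant = discordant = 0
    rows, cols = len(table), len(table[0])
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # only cells in a higher row
                for l in range(cols):
                    pairs = table[i][j] * table[k][l]
                    if l > j:                 # higher on both variables
                        concordant += pairs
                    elif l < j:               # higher on one, lower on the other
                        discordant += pairs
    return (concordant - discordant) / (concordant + discordant)

print(goodman_kruskal_gamma([[10, 0], [0, 10]]))   # perfect positive
print(goodman_kruskal_gamma([[0, 10], [10, 0]]))   # perfect negative
print(goodman_kruskal_gamma([[10, 5], [5, 10]]))   # moderate positive
```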
39
Tau-b can be used for...
Ordinal-level relationships in square tables only (e.g., 3x3)
40
Tau-c can be used for...
Ordinal-level relationships in rectangular tables only (e.g., 3x4)
41
Cramer's V can be used for...
Nominal-level relationships; tends to overestimate strength
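
Cramer's V is usually derived from the chi-square statistic of a contingency table. The sketch below assumes that standard definition, V = sqrt(chi-square / (n x (k - 1))), where k is the smaller of the number of rows and columns. The function name and the tables are hypothetical, and the code assumes no row or column is empty.

```python
from math import sqrt

def cramers_v(table):
    """Cramer's V from a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected if unrelated
            chi_square += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))
    return sqrt(chi_square / (n * (k - 1)))

print(cramers_v([[10, 0], [0, 10]]))  # perfect relationship
print(cramers_v([[5, 5], [5, 5]]))    # no relationship
```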
42
Lambda can be used for...
Nominal-level relationships; can underestimate strength
43
Practice at determining the lowest level of measurement
1. Gender and feelings about party leader (0-100 feeling thermometer)
2. Age (in years) and feelings about party leader
3. Age (in categories) and feelings about party leader
4. Partisanship and attitudes about oil sands policy (scale)
5. Age (in years) and attitudes about oil sands policy
6. Ideology (left-centre-right) and attitudes about oil sands policy
1. Gender is nominal (categories can’t be ranked), feelings are interval: lowest is nominal
2. Age is interval, feelings are interval or ordinal: lowest is ordinal or interval
3. Age is ordinal, feelings are interval or ordinal: lowest is ordinal
4. Partisanship is nominal (yes or no), attitudes are interval: lowest is nominal
5. Age is interval, attitudes are ordinal or interval: lowest is ordinal or interval
6. Ideology is nominal, attitudes are ordinal or interval: lowest is nominal
44
For each of these questions what would you use to get answers? 1. Is there a relationship? 2. What is the direction of the relationship? (ordinal, interval) 3. What is the strength of the relationship? 4. Is the relationship statistically significant?
1. Is there a relationship?
    - Contingency table, scatterplot
2. What is the direction of the relationship? (ordinal, interval)
    - Contingency table, scatterplot
3. What is the strength of the relationship?
    - Measures of association
4. Is the relationship statistically significant?
    - Inferential statistics
45
How to examine changes with control variables
1. Consider crosstab tables, measures of association, and inferential statistics for each value of the control variable.
2. Look for changes in the measure of association and statistical significance.
The relationship may:
- hold constant
- get stronger
- get weaker or disappear
- vary across categories
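
The comparison can be sketched with made-up counts. As a stand-in for a formal measure of association, the example uses a simple percentage difference: the share of cases with DV = 1 when IV = 1, minus the same share when IV = 0. The data, the control categories ("low", "high"), and the function names are all hypothetical; the counts were chosen so that the relationship is visible overall but disappears within each category of the control variable.

```python
# counts[(control value, IV value)] = (cases with DV = 1, total cases)
counts = {
    ("low", 0): (8, 80),
    ("low", 1): (2, 20),
    ("high", 0): (12, 20),
    ("high", 1): (48, 80),
}

def percentage_difference(cells):
    """Share with DV = 1 among IV = 1 minus the share among IV = 0, for {iv: (dv1, total)}."""
    return cells[1][0] / cells[1][1] - cells[0][0] / cells[0][1]

def pooled(counts):
    """Collapse the control variable: add up the cells for each IV value."""
    totals = {0: [0, 0], 1: [0, 0]}
    for (_, iv), (dv1, n) in counts.items():
        totals[iv][0] += dv1
        totals[iv][1] += n
    return {iv: tuple(values) for iv, values in totals.items()}

overall = percentage_difference(pooled(counts))
by_control = {
    z: percentage_difference({iv: counts[(z, iv)] for iv in (0, 1)})
    for z in ("low", "high")
}

print("overall:", round(overall, 2))
print("within control categories:", {z: round(d, 2) for z, d in by_control.items()})
```

A gap that is present overall but vanishes in every control category is the "gets weaker/disappears" outcome; whether that points to a spurious or an intervening relationship is a question for theory, not for the numbers.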
46
reinforcing variable
Reinforcing variable: strengthens the relationship between the independent and dependent variables
47
If the relationship weakens or disappears after controlling for a variable...
Suspect either a confounding variable (spurious relationship) or an intervening variable
48
If the relationship strengthens after controlling for a variable...
Suspect a reinforcing variable
49
Spurious or Intervening?
• The answer lies in theory and logic, not in statistical tests
• Consider temporal order
• If the control precedes the IV, suspect spurious
• If the control follows the IV, suspect intervening
To decide, you need either the temporal order or the theory behind the relationship
50
Multivariate Analysis
Multivariate analysis brings in and tests all control variables at once. Ask yourself:
• Which variables in the model are statistically significant?
• What is the nature of the relationship between an IV and the DV when all other variables are controlled?
• What percentage of variation is explained by the model as a whole?
51
When we control for a variable, the result may be... (4)
The relationship may:
- hold constant → the control is not relevant to the relationship
- get stronger → reinforcing variable
- get weaker or disappear → spurious or intervening
- vary across categories → the control conditions the relationship