Assignment #2 - cat by cat Flashcards

(71 cards)

1
Q

Which law says that the result obtained from a large number of trials should be close to the expected result, and will tend to get closer to the expected result as more trials are performed?

A

law of large numbers
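
A minimal simulation sketch of this law, assuming a fair coin (probability of heads 0.5) and a fixed random seed. The function name `running_proportion` is illustrative and not part of the deck.

```python
import random

def running_proportion(n_trials, p_heads=0.5, seed=0):
    """Proportion of heads observed in n_trials simulated coin tosses."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    heads = sum(1 for _ in range(n_trials) if rng.random() < p_heads)
    return heads / n_trials

# The observed proportion should tend toward 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, running_proportion(n))
```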

2
Q

If individual outcomes are uncertain (up to chance) but there is nonetheless a regular distribution of outcomes in a large number of trials, what do we call it?

A

A random phenomenon

3
Q

The proportion of times the outcome would occur in an indefinitely long series of trials of a random phenomenon

A

Probability

4
Q

Mathematical descriptions of the long-run regularity of random phenomena

A

Probability models

5
Q

What does a probability model consist of? (3)

A

1 - Sample space (S)
2 - Events
3 - Probability of each event

6
Q

The set of all possible outcomes in a random phenomenon

A

Sample space (S)

7
Q

The probability model for a random phenomenon with a FINITE number of possible outcomes

A

discrete probability model

8
Q

How do you describe a discrete probability model?

A

List the possible outcomes and their associated probabilities.

10
Q

What is the probability model of a coin toss?

A

S = {Heads (0.5), Tails (0.5)}
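
One possible way to write this model down in code, as a sketch: each outcome in S maps to its probability. The helper `is_valid_model` is a hypothetical name; it checks the two standard requirements that every probability lies between 0 and 1 and that the probabilities sum to 1.

```python
# Discrete probability model for a fair coin toss: outcome -> probability.
coin_model = {"Heads": 0.5, "Tails": 0.5}

def is_valid_model(model, tol=1e-9):
    """True if each probability is in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in model.values())
    return in_range and abs(sum(model.values()) - 1.0) < tol

print(is_valid_model(coin_model))
```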

11
Q

In a probability model, what does the value in brackets ( ) represent?

A

the probability of that event

12
Q

The probability model for a random phenomenon with an infinite number of possible outcomes

A

Continuous probability model

13
Q

How do you express a continuous probability model?

A

A continuous curve with the total area under the curve equal to 1; the X-axis shows the possible outcomes

14
Q

What is a continuous curve + axis combo for a continuous probability model called?

A

Probability distribution

15
Q

probability distribution obtained through a large number of samples drawn from the population.

A

Sampling distribution

16
Q

The sampling distribution for p has known characteristics if the sample size is large enough

A

Central Limit Theorem

17
Q

Systematic bias that arises from using nonprobability sampling methods.

A

Sampling bias

18
Q

The error that occurs when using a statistic based on a sample to predict the value of a population parameter.

A

Sampling error

19
Q

the probability distribution that specifies probabilities for the possible values a statistic can take.

A

The sampling distribution

20
Q

The standard deviation of the sampling distribution.

A

Standard error
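
As a worked sketch for a sample proportion: the formula sqrt(p(1 - p) / n) used here is the usual textbook standard error of a proportion and is an addition, not something stated on the card. The function name is illustrative.

```python
import math

def standard_error_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# With p = 0.5 and n = 100 the standard error is 0.05.
print(standard_error_proportion(0.5, 100))
```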

21
Q

A range of values which is computed from sample data and might contain the unknown population parameter.

A

Confidence interval

22
Q

Two components of a confidence interval

A

1. Interval
2. Confidence level C

23
Q

what is the interval component of a confidence interval

A

Calculated from the data, usually in the form ‘estimate ± margin of error.’

24
Q

What is the confidence level C portion of a confidence interval

A

Our level of confidence that the interval includes the population parameter.

25
What is the most common confidence level?
95%
26
How do you narrow a confidence interval?
Make C smaller (e.g., use C = 90% instead of C = 95%) OR make the sample size n larger
27
Why would you want to narrow a confidence interval?
to produce a smaller margin of error.
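
A rough sketch of how the confidence level C and the sample size n change the margin of error, assuming a large-sample interval for a proportion of the form estimate ± z * standard error. That formula and the function names are assumptions added for illustration; the cards do not specify which interval is meant.

```python
import math
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence=0.95):
    """z * sqrt(p_hat * (1 - p_hat) / n) for the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def confidence_interval(p_hat, n, confidence=0.95):
    """Interval in the form estimate ± margin of error."""
    m = margin_of_error(p_hat, n, confidence)
    return p_hat - m, p_hat + m

print(margin_of_error(0.5, 400, 0.95))   # baseline
print(margin_of_error(0.5, 400, 0.90))   # smaller C, narrower interval
print(margin_of_error(0.5, 1600, 0.95))  # larger n, narrower interval
```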
28
Do confidence intervals utilize sample size (n) or population (N)?
sample size (n)
29
6 Elements of a test of significance
1. null and alternative hypotheses 2. test statistic 3. p-value 4. alpha-level 5. conclusion 6. assumptions that must be met
30
a statement about some characteristic of a variable or a collection of variables.
Hypothesis
31
Does a hypothesis pertain to the characteristics of a population or a sample?
Population
32
What does a significance test include?
two specific hypotheses pertaining to the value of a given parameter.
33
the hypothesis that is directly tested. the parameter in question has a value corresponding to ‘no effect’ or ‘no association.’
null hypothesis (Ho)
34
a hypothesis that directly contradicts the null hypothesis. the parameter in question falls in some alternative set of values to that specified by the null hypothesis.
alternative hypothesis (Ha)
35
In a test of significance, which hypothesis do we assume is true?
Null hypothesis (Ho)
36
What does a test of significance analyze?
Strength of evidence AGAINST the null hypothesis
37
Does finding insufficient evidence against Ho mean Ho is true?
No, it just means that we don't have enough evidence that it's not true
38
a statistic calculated from the sample data to test the null hypothesis.
A test statistic.
39
the probability, when Ho is true, of producing a test statistic value that is at least as contradictory to Ho as the value actually observed.
The p-value
40
What does a smaller p-value in a test of significance mean?
The more strongly the data contradicts Ho.
41
a number such that one rejects Ho if the p-value is less than it
α-level (alpha-level)
42
What is another term for the α-level?
The significance level of the test.
43
What are the most common α-levels for tests of significance?
0.05 and 0.01.
44
What do we do if p < alpha (e.g., p < 0.05) ?
We reject Ho and accept Ha.
45
What do we do if p > alpha (e.g., p > 0.05)?
we do not reject Ho and therefore do not accept Ha.
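
The decision rule from the two cards above, as a small sketch. The function name `decide` is illustrative, and treating p exactly equal to alpha as "do not reject" is an assumption, since the cards only cover p < alpha and p > alpha.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject Ho only when the p-value is below alpha."""
    if p_value < alpha:
        return "reject Ho"
    # Assumption: p == alpha is treated the same as p > alpha.
    return "do not reject Ho"

print(decide(0.03))              # below the 0.05 level
print(decide(0.03, alpha=0.01))  # same p-value, stricter level
```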
46
Assumptions of a significance test
1. Randomization 2. Type of variable(s) matches the type of test (quantitative, categorical, etc.) 3. Shape of the distribution of a variable in the population (sometimes MUST be normal) 4. Sample size (some tests have minimum)
47
Why would decisions in tests of hypotheses always have some uncertainty?
Sampling error
48
Sampling error when Ho is rejected even though it is actually true.
Type I error (or false positive)
49
Sampling error when Ho is not rejected even though it is actually false.
Type II error (or false negative)
50
For bivariate associations, visually displays the number of observations observed at all the combinations of possible outcomes for the two variables.
A contingency table
51
In a contingency table, the row totals considered together and the column totals considered together
Marginal distributions
52
What are the marginal distributions of a contingency table?
The frequency distributions of the two variables.
53
The percentage of the row total that pertains to each column.
Row percentages
54
The percentage of the column total that pertains to each row
Column percentages
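
A short sketch of both kinds of percentages for a contingency table stored as a list of rows. The 2 x 2 counts below are made up for illustration only.

```python
def row_percentages(table):
    """Each cell as a percentage of its row total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]

def column_percentages(table):
    """Each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / col_totals[j] for j, cell in enumerate(row)]
            for row in table]

table = [[30, 20],   # hypothetical observed counts
         [10, 40]]
print(row_percentages(table))
print(column_percentages(table))
```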
55
If there is an independent and a dependent variable, which percentages should be reported?
The percentages that follow the location of the independent variable (i.e., if it is in the rows, use row percentages).
56
a measure of the strength of an association between two categorical variables.
Cramer's V
57
What is the range of Cramer's V?
0 (no association) to 1 (perfect association).
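
A sketch of how Cramer's V could be computed, assuming the common definition V = sqrt(chi-squared / (n * min(rows - 1, columns - 1))). That formula is not given on the cards, and the function names and example tables are illustrative.

```python
import math

def chi_squared(table):
    """Sum over cells of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def cramers_v(table):
    """Cramer's V under the assumed definition stated above."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_squared(table) / (n * k))

print(cramers_v([[25, 25], [25, 25]]))  # no association
print(cramers_v([[50, 0], [0, 50]]))    # perfect association
```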
58
A measure of the strength of association between two ordinal categorical variables
Kendall's tau-b
59
Range of kendall's tau b
low of –1 (a perfect negative association) through 0 (no association) to a high of 1 (a perfect positive association)
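
A minimal sketch of tau-b for two ordinal variables coded as numbers, assuming the usual pair-counting definition: (concordant - discordant) divided by sqrt((C + D + ties only in x) * (C + D + ties only in y)). The formula and the function name are additions, not taken from the cards.

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant, discordant and tied pairs."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue              # tied on both variables: not counted
            elif dx == 0:
                ties_x += 1           # tied on x only
            elif dy == 0:
                ties_y += 1           # tied on y only
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    # Undefined (division by zero) if either variable is constant.
    return (concordant - discordant) / math.sqrt(
        (pairs + ties_x) * (pairs + ties_y))

print(kendall_tau_b([1, 2, 3, 4], [1, 2, 3, 4]))  # perfect positive
print(kendall_tau_b([1, 2, 3, 4], [4, 3, 2, 1]))  # perfect negative
```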
60
The Chi-squared test of significance is used to help produce what?
Cramer's V
61
What are the hypotheses of the Chi-squared test of significance?
Ho: X and Y are unassociated in the population; Ha: X and Y are associated in the population
62
How do we know, looking at a contingency table, if two variables are unassociated in the population?
In the population, the percentages for one variable are the same across all categories of the other variable.
63
What counts is the Chi-squared test of significance based upon?
Observed counts vs expected counts.
64
The count for a cell in the contingency table between X and Y is the number of observations in the sample that fall in that particular cell.
Observed count
65
What does the expected count for a cell in a contingency table equal?
the product of its row total and its column total divided by the total sample size.
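
The rule above, sketched for a table stored as a list of rows. The counts are hypothetical and the function name is illustrative.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

observed = [[30, 20],   # hypothetical observed counts
            [10, 40]]
print(expected_counts(observed))
```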
66
assesses the magnitude (sum) of the differences between the observed and expected counts.
The Chi-squared test statistic
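
A sketch of the statistic, assuming the standard form: the sum over all cells of (observed - expected)^2 / expected. The card only says the differences are summed, so the squaring and scaling are stated here as the usual definition rather than taken from the deck. The counts are made up.

```python
def chi_squared_statistic(table):
    """Sum over cells of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

print(chi_squared_statistic([[30, 20], [10, 40]]))
```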
67
What does the area under the curve to the right of the Chi-squared test statistic calculated for our sample represent?
The probability of finding a test statistic as or more contradictory to the null hypothesis as this one, just by chance, when the null hypothesis is true.
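
For a 2 x 2 table only (1 degree of freedom), this right-tail area can be sketched with the standard library, using the fact that a chi-squared variable with 1 degree of freedom is a squared standard normal. Larger tables need the general chi-squared distribution (for example `scipy.stats.chi2.sf`), which is not shown here. The function name is illustrative.

```python
import math

def chi2_p_value_df1(statistic):
    """Area to the right of the statistic under chi-squared with df = 1."""
    # P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(statistic / 2))

print(chi2_p_value_df1(3.841))  # close to 0.05
```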
68
Data assumptions for Chi-squared test of significance (3)
1. No more than 20% of the expected counts are less than 5. 2. All individual expected counts are 1 or greater. 3. All four expected counts in a 2 x 2 table should be 5 or greater.
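
A sketch that checks these three conditions for a table of expected counts. The function name and the dictionary keys are hypothetical.

```python
def check_expected_counts(expected):
    """Check the three data assumptions on a table of expected counts."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(1 for e in cells if e < 5) / len(cells)
    is_2x2 = len(expected) == 2 and all(len(row) == 2 for row in expected)
    return {
        "at_most_20_percent_below_5": share_below_5 <= 0.20,
        "all_at_least_1": all(e >= 1 for e in cells),
        # The third condition only applies to 2 x 2 tables.
        "two_by_two_all_at_least_5": (not is_2x2) or all(e >= 5 for e in cells),
    }

print(check_expected_counts([[20.0, 30.0], [20.0, 30.0]]))
```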
69
Does a small p-value always mean a strong association?
No, there is a big difference between statistical significance and practical significance.
70
What are the hypotheses for the test of significance for Kendall's tau-b?
Ho: tau-b = 0 in the population; Ha: tau-b is not equal to 0 in the population.
71
Which test of significance is more discriminating?
The test of significance for Kendall's tau-b