Final Flashcards

1
Q

What are the things that exist in the center of a normal curve?

A

Mean, median and mode

2
Q

What does an inflection point on a normal curve mark?

A

The point one standard deviation above or below the mean

3
Q

The distributions of most continuous random variables will follow the shape of the ____

A

The distributions of most continuous random variables will follow the shape of the normal curve

4
Q

What does the empirical rule state?

A
  • 68% of all values fall within 1 standard deviation of the mean
  • 95% of all values fall within 2 standard deviations of the mean
  • 99.7% of all values fall within 3 standard deviations of the mean
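The percentages in the empirical rule can be checked numerically. The sketch below assumes a perfectly normal distribution; the function name `share_within` is made up for illustration.

```python
import math

def share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    # For a normal variable, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The rule's 68 / 95 / 99.7 figures are rounded versions of these values.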
5
Q

What are the 3 major types of central tendency?

A

Mean, median, and mode

6
Q

____ refers to the measure used to determine the center of a distribution of data.

A

Central tendency refers to the measure used to determine the center of a distribution of data.

7
Q

What is central tendency used for?

A

It is used to find a single score that is most representative of an entire data set

8
Q

What is a data set with 2 modes called?

A

Bi-modal

9
Q

A data set with more than one mode can be described as ___

A

Multi-modal

10
Q

The ____ is most often used to represent central tendency, but outliers can distort it

A

The mean is most often used to represent central tendency, but outliers can distort it

11
Q

What is an outlier?

A

A value that is very different from the other data in the data set
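A small sketch of why an outlier interferes with the mean but barely moves the median. The scores are hypothetical data chosen only for illustration.

```python
from statistics import mean, median

scores = [2, 3, 3, 4, 5]        # hypothetical data set
with_outlier = scores + [100]   # the same data plus one extreme value

# The mean is pulled toward the outlier; the median hardly changes.
print("without outlier:", mean(scores), median(scores))
print("with outlier:   ", mean(with_outlier), median(with_outlier))
```

This is one reason the median is often preferred for skewed (non-symmetric) data.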

12
Q

What is a variable?

A

A property that can take on many values

13
Q

What are the two kinds of variables?

A

Quantitative variables and qualitative/categorical variables

14
Q

What is a quantitative variable, and what can you do with it?

A

A variable that is measured numerically. With quantitative variables, you can add, subtract, multiply, and divide and get a meaningful result

15
Q

_____ allow for classification based on some characteristic

A

Qualitative/categorical variables allow for classification based on some characteristic

16
Q

What is a discrete variable?

A

A quantitative variable with a finite (countable) number of possible values. Ex: the number rolled on a die

17
Q

What is a continuous variable?

A

A quantitative variable with an infinite number of possible values within a range. Ex: temperature

18
Q

What is an independent variable?

A

Any variable that is being manipulated

19
Q

What is a dependent variable?

A

Any variable that is being measured

20
Q

What are the four data types of measured variables?

A
  • Nominal
  • Ordinal
  • Interval
  • Ratio
21
Q

_____ data (also known as qualitative/categorical data) is data that is split into categories (e.g., dichotomous)

A

Nominal data (also known as qualitative/categorical data) is data that is split into categories (e.g., dichotomous)

22
Q

____ data is data where order matters, but distance between values does not

A

Ordinal data is data where order matters, but distance between values does not

23
Q

____ data is data where order matters, and distances between values are equal and meaningful, but there is no natural zero present

A

Interval data is data where order matters, and distances between values are equal and meaningful, but there is no natural zero present

24
Q

____ data is data where order matters, distances between values are equal and meaningful, and a natural zero is present

A

Ratio data is data where order matters, distances between values are equal and meaningful, and a natural zero is present

25
___ is best for numeric symmetrically distributed data
*Mean* is best for numeric symmetrically distributed data
26
___ is best for numeric non-symmetrically distributed data
*Median* is best for numeric non-symmetrically distributed data
27
What level of measurement is dichotomous?
Nominal
28
Gender is a ___ level of measurement
Gender is a *nominal* level of measurement
29
Time is a ___ level of measurement
Time is a *ratio* level of measurement
30
Age is a ___ level of measurement
Age is a *ratio* level of measurement
31
What is the simple definition of a confidence interval?
A range of values that we are confident contains the population parameter
32
What is a point estimate?
A single value that represents the best estimate of the population value
33
In a confidence interval, the width concerns the ___ of the estimate
In a confidence interval, the width concerns the *precision* of the estimate
34
The point estimate is always in the ___ of the confidence interval
The point estimate is always in the *middle* of the confidence interval
35
What is the formal definition of a confidence interval?
If we repeated the sampling an infinite number of times, 95% of the resulting 95% confidence intervals would contain the true population mean
36
Not every value in a CI is equally ___
Not every value in a CI is equally *probable*
37
A narrower confidence interval means that the estimate is ____ precise
A narrower confidence interval means that the estimate is *more* precise
38
What are the factors that can narrow a confidence interval?
1. Larger sample size 2. Less variance 3. Lower selected level of confidence (90% vs. 95%)
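The three factors can be seen in the width formula for a simple z-based interval around a mean, width = 2 · z · SD / √n. This is a minimal sketch that assumes a known standard deviation (or a large sample); `ci_width` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sd: float, n: int, confidence: float = 0.95) -> float:
    """Full width of a z-based confidence interval for a mean (assumes known SD)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sd / sqrt(n)

print(ci_width(sd=10, n=25))                    # baseline
print(ci_width(sd=10, n=100))                   # larger sample -> narrower
print(ci_width(sd=5, n=25))                     # less variance -> narrower
print(ci_width(sd=10, n=25, confidence=0.90))   # lower confidence -> narrower
```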
39
The null hypothesis attributes an observed difference to ___, and it states that _____
The null hypothesis attributes an observed difference to *sampling error*, and it states that *the population means (not the sample means) are equal, so the difference seen is not real*
40
The alternative hypothesis states that the difference seen, represents __.
The alternative hypothesis states that the difference seen, represents *a real difference*.
41
What is a type 1 error in hypothesis testing? What is its symbol? This is considered a liar
When the null hypothesis is true, and we choose to reject it. Symbol: "Alpha"
42
What is a type 2 error in hypothesis testing? What is its symbol? This is considered to be blind
When the null hypothesis is false, and we do not reject it. (accept it) Symbol: Beta
43
___ is the maximum probability of type 1 error that a researcher is willing to accept
*Alpha* is the maximum probability of type 1 error that a researcher is willing to accept
44
When does the researcher set the alpha?
Set before running statistics
45
What is alpha usually set to?
0.05. (5%)
46
What is the simple definition of a p-value?
The probability of type 1 error if the null hypothesis is true
47
True or false: You can have a probability of type 1 error when the null hypothesis is false
False. You can NOT have a probability of type 1 error when the null hypothesis is false
48
When is the p-value calculated?
After research
49
What is the formal definition of a p-value?
The probability of observing a value at least as extreme as the value actually observed, if the null hypothesis is true
50
If the p-value is less than or equal to alpha, we ___ the null hypothesis
If the p-value is less than or equal to alpha, we *REJECT* the null hypothesis
51
If the p-value is greater than alpha, we ___ the null hypothesis
If the p-value is greater than alpha, we *FAIL TO REJECT* (accept) the null hypothesis
52
If we “fail to reject” (accept) H0, we attribute any observed difference to ____ only
If we “fail to reject” (accept) H0, we attribute any observed difference to *sampling error* only
53
We don’t interpret non-significant differences as “__” (maybe not even as “trends”)
We don’t interpret non-significant differences as *“real”* (maybe not even as “trends”)
54
We understand that a non-significant difference is attributable only to __.
We understand that a non-significant difference is attributable only to *chance.*
55
How do you use confidence intervals for hypothesis testing?
Look at the 95% CI of the mean difference, and evaluate whether or not it includes zero
56
If the confidence interval includes 0, it is ____ in hypothesis testing
If the confidence interval includes 0, it is *nonsignificant* in hypothesis testing
57
If the confidence interval excludes 0, it is ____ in hypothesis testing
If the confidence interval excludes 0, it is *significant* in hypothesis testing
58
What is the benefit of a CI over a p-value when hypothesis testing?
CIs give an estimate of effect size
59
P-values and CIs tell us about ___ not ____
P-values and CIs tell us about *statistical significance* not *clinical significance*
60
What is statistical power?
The probability of finding a statistically significant difference if such a difference exists in the real world
61
What are the main things that affect the statistical power of a study?
- Alpha - Effect size - Variance - Sample size
62
Increasing alpha will ___ power
Increasing alpha will *increase* power
63
An effect size is known as the ____
An effect size is known as the *mean difference*
64
What is standardized effect size?
The mean difference divided by the standard deviation (the spread of scores), e.g. Cohen's d
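One common standardized effect size is Cohen's d, which divides the mean difference by a pooled standard deviation. The sketch below assumes two independent groups; the data and the function name `cohens_d` are illustrative only.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Mean difference divided by the pooled standard deviation of two groups."""
    na, nb = len(a), len(b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

treatment = [5, 6, 7, 8, 9]   # hypothetical scores
control = [3, 4, 5, 6, 7]     # hypothetical scores
print(round(cohens_d(treatment, control), 3))
```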
65
__ is the spread of scores
*Variance* is the spread of scores
66
Increasing the effect size will ___ the power
Increasing the effect size will *increase* the power
67
Increasing the sample size will ___ the power
Increasing the sample size will *increase* the power
68
___ is the best way to increase statistical power
*Sample size* is the best way to increase statistical power
69
Increasing variance will ___ power
Increasing variance will *decrease* power
70
What are the things that will decrease power?
- Decreased alpha - Decreased effect size - Increased variance - Decreased sample size
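How these four factors move power can be sketched with a normal approximation for a two-sided, two-group comparison. This is an approximation rather than an exact t-test power calculation, and it assumes equal group sizes. Variance enters through the standardized effect size `d`, so more variance means a smaller `d`. The name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group test (normal approximation)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)       # critical value set by alpha
    # Expected test statistic under the alternative is d * sqrt(n / 2).
    return norm.cdf(d * sqrt(n_per_group / 2) - z_crit)

print(approx_power(d=0.5, n_per_group=64))              # baseline, close to 0.8
print(approx_power(d=0.5, n_per_group=64, alpha=0.01))  # decreased alpha
print(approx_power(d=0.3, n_per_group=64))              # decreased effect size
print(approx_power(d=0.5, n_per_group=30))              # decreased sample size
```

Each of the last three calls lowers power relative to the baseline, matching the list above.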
71
What are the two types of power analysis?
- Power a priori | - Power post-hoc
72
What is power a priori?
A power analysis done before we collect data, to determine if the design is powerful enough
73
What is power post-hoc?
A power analysis done after the research is complete, typically by consumers of the research, to determine whether the study had enough power when it failed to reject the null hypothesis
74
If a difference is found post-hoc/the null hypothesis was rejected, then the power issue is ___
If a difference is found post-hoc/the null hypothesis was rejected, then the power issue is *moot/not a problem*
75
If a difference is not found post-hoc/the null hypothesis was accepted/fail to reject, then the power issue is ___ and you have to do a ___
If a difference is not found post-hoc/the null hypothesis was accepted/fail to reject, then the power issue is *huge* and you have to do a *post-hoc power analysis*
76
A priori is used to figure out how many subjects to use ___
A priori is used to figure out how many subjects to use *before a study is started*
77
What is the minimal accepted power during power a priori?
0.8
78
What are the 2 ways to determine a post hoc analysis?
1. Compute with traditional cohen approach | 2. Determine with confidence interval analysis of effect size
79
What is involved in computing the post hoc analysis with the traditional approach?
  • Continuous scale result: 0.0 – 1.0 ( > 0.8 is default)
  • Based on:
      • Sample size
      • Alpha
      • Variance (observed)
      • Effect size (use MCID, not observed)
80
____ is the better way to determine the post hoc analysis, while with ____, the answer will probably be the same as a priori
*Determine with confidence interval analysis of effect size* is the better way to determine the post hoc analysis, while with *compute with traditional cohen approach*, the answer will probably be the same as a priori
81
If the MCID is excluded from the CI, then it is definitively negative and ___ powered
If the MCID is excluded from the CI, then it is definitively negative and *adequately* powered
82
If the MCID is included in the CI, then it is not definitive and ___ powered
If the MCID is included in the CI, then it is not definitive and *inadequately* powered/underpowered
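The confidence-interval approach to post hoc power can be written as a small decision rule. This is only a sketch of the logic on these cards. It assumes a CI for the mean difference and a positive MCID, and the function name and return strings are invented for illustration.

```python
def interpret_ci(ci_low: float, ci_high: float, mcid: float) -> str:
    """Classify a result from the CI of the mean difference and the MCID."""
    if not (ci_low <= 0 <= ci_high):
        # CI excludes zero: significant, so power is a moot point.
        return "significant"
    if ci_low <= mcid <= ci_high:
        # A clinically important difference is still plausible.
        return "nonsignificant, not definitive (underpowered)"
    return "nonsignificant, definitively negative (adequately powered)"

print(interpret_ci(1.0, 5.0, mcid=3.0))
print(interpret_ci(-2.0, 4.0, mcid=3.0))
print(interpret_ci(-1.0, 2.0, mcid=3.0))
```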
83
A two tailed test is testing to see ____
A two tailed test is testing to see *if your calculated value is either above or below where it is expected to be*
84
A one tailed test is testing to see if ____ or ___
A one tailed test is testing to see if *your calculated value is above where it's expected to be or below where it is expected to be*
85
___ is the assumption you're beginning with and is opposite of what you're testing
*Null hypothesis(H0)* is the assumption you're beginning with and is opposite of what you're testing
86
___ is the claim you're testing
*Alternative hypothesis (H1)* is the claim you're testing
87
What is a t-statistical test?
Statistical method to decide whether an observed difference in sample scores represents a “real” difference in the population…. vs. just sampling error
88
How many groups are in a t-test?
2 groups
89
2 groups is another way of saying...?
2 levels of 1 IV
90
What does a t-test do?
Finds the difference between group means divided by the variability within the groups (the standard error of the mean difference)
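That ratio can be computed by hand for two independent groups. The sketch below assumes equal variances (a pooled standard error); the data and the name `independent_t` are illustrative.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(a, b):
    """t = difference between group means / standard error of the mean difference."""
    na, nb = len(a), len(b)
    # Within-group variability: pooled sample variance of the two groups.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se_diff = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se_diff

group_a = [5, 6, 7, 8, 9]   # hypothetical scores
group_b = [3, 4, 5, 6, 7]   # hypothetical scores
print(independent_t(group_a, group_b))
```

More within-group variability makes the denominator larger and the t value smaller, so a real difference becomes harder to detect.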
91
The error in a standard error refers to...?
All sources of variability within a set of data that cannot be explained by the independent variable.
92
When there is no within-group variability, the groups are known as being ___
When there is no within-group variability, the groups are known as being *definitely different*
93
When there is a little within-group variability, the groups are known as being ___
When there is a little within-group variability, the groups are known as being *probably different*
94
When there is a larger amount of within-group variability, the groups are known as being ___
When there is a larger amount of within-group variability, the groups are known as being *maybe not different*
95
When the variability of the groups is not necessarily the same, it is called...?
When the variability of the groups is not necessarily the same, it is called *differing (unequal) variance*
96
What are parametric statistics?
A branch of statistics which assumes that sample data comes from a population that follows a probability distribution based on a fixed set of parameters.
97
What are the basic assumptions for all parametric tests?
* Samples are randomly drawn from populations * Population is normally distributed * Homogeneity of variance (roughly) * Data from ratio or interval (i.e. continuous) scales
98
What rarely happens, but one still needs to be careful with when samples are randomly drawn from populations?
Generalization
99
What are the ways to test if the population is normally distributed?
- Statistically - Graphically - Common sense
100
When is the homogeneity of variance especially important?
With unequal group sizes
101
How is the homogeneity of variance tested?
Statistically
102
What statistical test is used to check homogeneity of variance for the t-test?
Levene's test
103
What are the statistical hypotheses for a two-level design?
- Null: the two population means are equal - Alternative, nondirectional format: the means are not equal - Alternative, directional format: one mean is greater than the other
104
A two-tailed test uses a ___ hypothesis
A two-tailed test uses a *nondirectional* hypothesis
105
A one-tailed test uses a ___ hypothesis
A one-tailed test uses a *directional* hypothesis
106
A two tailed test has ___ statistical power compared to the one tailed test
A two tailed test has *less* statistical power compared to the one tailed test
107
What are the two types of t-test?
- Independent/unpaired t-test | - Paired t-test
108
What happens in an unpaired(independent) t-test?
Testing to see if there is a difference between 2 groups
109
What kind of design is found in an unpaired t-test?
- Pretest-posttest design (compare change scores) | - Posttest only design
110
What happens in a paired(dependent) t-test?
Testing to see if there is a difference between conditions in the same person
111
What kind of design is found in a paired t-test?
- Difference scores or pretest-posttest | - Repeated measures design
112
A repeated measures factor is an example of a ___
A repeated measures factor is an example of a *within-subjects factor*
113
A non-repeated measures factor is an example of a ____ factor
A non-repeated measures factor is an example of a *between-subjects* factor
114
What is an ANOVA?
Statistical method to decide whether an observed difference in sample scores represents a “real” difference in the population (vs. just sampling error), but with 3 or more groups/levels of 1 IV and/or 2 or more IVs
115
What is the question asked in an ANOVA?
Are observed differences in whole set of means greater than would be expected by chance alone?
116
What statistic is looked at for ANOVA?
An F-statistic
117
What is an F-statistic?
The between group variability divided by the within group variability
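For a one-way independent-samples design, the ratio can be computed from the between-group and within-group mean squares. This is a minimal sketch with made-up data; `f_statistic` is a hypothetical helper name.

```python
from statistics import mean

def f_statistic(groups):
    """F = between-group mean square / within-group mean square."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k, n_total = len(groups), len(all_scores)
    # Between-group variability: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variability: how far each score sits from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

print(f_statistic([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```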
118
What is the null hypothesis in the ANOVA?
All of the population means are equal
119
What is the alternative hypothesis in the ANOVA?
At least one pair of population means is different, but we don't know which one
120
What are the basic assumptions for ANOVA?
* Samples are randomly drawn from populations * Population is normally distributed * Homogeneity of variance (roughly) * Data from ratio or interval (i.e. continuous) scales
121
What does one need to be careful with when randomly drawing samples from the population?
Generalization
122
How can the normal distribution of a population be tested?
- Statistically - Graphically - Common sense
123
When is the homogeneity of variance especially important?
When there is an unequal group size
124
How is the homogeneity of variance usually tested?
Statistically
125
The types of ANOVA concern what...?
- Whether they are one way (1 IV) or multiple ways | - Whether the IV are between subjects(independent groups) or within subjects (repeated measure) or a mixed model
126
What is a mixed model?
Where there is 1 IV that is between subject and 1 IV that is within subjects
127
What are the types of ANOVA?
- One way ANOVA: independent samples - Two way ANOVA: independent samples - One way ANOVA: Repeated measures samples - Two way ANOVA: Repeated measures samples
128
What is the characteristic of a one way ANOVA: independent samples?
1 IV with 3 or more levels
129
What does the result of an ANOVA show?
Whether or not there is a difference overall, but not where the difference is
130
What is the characteristic of a two way ANOVA: independent samples?
2 IVs
131
What are the things you're interested in when performing a two way ANOVA: independent samples?
- Main effect of IV A - Main effect of IV B - Interaction of IV A and IV B (interaction effect)
132
What is the interaction effect?
Saying that the scores across one of the IV depends on the levels of the other IV
133
It is really helpful to look at ____ when talking about interaction effects
It is really helpful to look at *graphs* when talking about interaction effects
134
What does it mean when the lines of an interaction effect graph are parallel?
There is no interaction
135
What does it mean when the lines of an interaction effect graph are not parallel?
There is an interaction
136
What is a disordinal interaction?
When the lines cross and significant main effects cannot be interpreted
137
What is an ordinal interaction?
When the lines don't cross and significant main effects can be interpreted
138
The one way ANOVA: Repeated measures samples is more powerful than the independent ANOVA because ___
The one way ANOVA: Repeated measures samples is more powerful than the independent ANOVA because *it has less error variance*
139
What is the homogeneity of variance in the one way ANOVA: Repeated measures samples?
Sphericity
140
What is sphericity?
The homogeneity of variance of differences
141
How is sphericity tested?
Test with Mauchly’s Test of Sphericity
142
What does a non-significant finding on the test of sphericity mean?
No significant difference in the variances of the differences (the sphericity assumption is met)
143
If sphericity assumption is failed, what happens?
Use correction/adjusted p-value
144
What is a multiple comparison test used for?
To determine where the difference is
145
Multiple comparison tests are also called ____
Multiple comparison tests are also called *pairwise comparisons*
146
What are the different strategies of performing a multiple comparison test?
1. Post-hoc | 2. Planned comparison
147
When is a post-hoc performed?
Performed after ANOVA
148
___ multiple comparison strategy is the most common
*Post-hoc* multiple comparison strategy is the most common
149
Post hoc tests test ___ and therefore are exploratory
Post hoc tests test *every difference* and therefore are exploratory
150
When is a planned comparison performed?
Performed instead of ANOVA (a priori)
151
What does a planned comparison focus on?
Focused only on specific comparisons
152
How do you calculate the family wise type 1 error rate that is used for the one way ANOVA?
Approximately, add up the alpha values of all the tests performed (the exact rate for k independent tests is 1 - (1 - α)^k)
153
When the family wise type 1 error rate is too high, what do you do?
A Bonferroni Correction can be done
154
How is a Bonferroni Correction done?
Divide alpha by the number of statistical tests to be performed and use that for each post hoc test
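The family-wise error rate and the Bonferroni adjustment are both one-line calculations. The sketch below assumes k independent tests that all use the same alpha; the function names are made up for illustration.

```python
def familywise_error(alpha: float, k: int) -> float:
    """Probability of at least one type 1 error across k independent tests."""
    return 1 - (1 - alpha) ** k

def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-test alpha after a Bonferroni correction."""
    return alpha / k

alpha, k = 0.05, 3   # e.g. three pairwise comparisons after a 3-group ANOVA
print(familywise_error(alpha, k))   # slightly below the simple sum, k * alpha
print(bonferroni_alpha(alpha, k))   # alpha to use for each post hoc test
print(familywise_error(bonferroni_alpha(alpha, k), k))  # back under 0.05
```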
155
What is the downside to the Bonferroni Correction?
It has less power and a higher chance of a type 2 error, so the risks of type 1 and type 2 error must be balanced
156
What are the types of post hoc tests, in order from least conservative (most likely to find a significant difference) to most conservative?
- Fisher's least significant difference - Duncan multiple range test - Newman-Keuls method - Tukey's honestly significant difference - Bonferroni t-test - Scheffe's comparison
157
What are the post-hoc tests that are performed the most?
- Fisher's least significant difference - Tukey's honestly significant difference - Bonferroni t-test
158
What is the Fisher's least significant difference test?
Essentially an unadjusted t-test (LSD)
159
Why is Tukey's honestly significant difference important?
“Middle of the road” in terms of risk and most commonly used
160
What does the Bonferroni t-test do?
Simply divides α by # of comparisons
161
When are Fisher's least significant difference test, Tukey's honestly significant difference test, and the Bonferroni t-test suitable for use?
When an independent groups type test is being performed
162
What are the multiple comparison tests to be used for repeated measures?
- LSD - Sidak - Bonferroni correction
163
LSD is an _____
LSD is an *unadjusted paired t-test*
164
Sidak is ___
Sidak is *adjusted, but good balance of type 1 & type 2 error protection*
165
The LSD test has a ___ risk of type 1 error, meaning it is less conservative
The LSD test has a *high* risk of type 1 error, meaning it is less conservative
166
The Bonferroni correction has a high risk of ___ error and is more conservative
The Bonferroni correction has a high risk of *type 2* error and is more conservative
167
What is an ANCOVA?
(Analysis of covariance) is a statistical technique that is used when you cannot control a variable through research design and sampling
168
What does the ANCOVA do?
It statistically adjusts the dependent variable based on the covariate
169
ANCOVA produces ____
ANCOVA produces *adjusted means*
170
ANCOVA is a combination of ___ and _____
ANCOVA is a combination of *ANOVA and linear regression*
171
What are the assumptions of ANCOVA?
- Usual parametric assumptions - Linear relationship between CoV and DV (with r>.6) - Homogeneity of slopes
172
You can also use ANCOVA to adjust for ____ scores
You can also use ANCOVA to adjust for *baseline* scores
173
When do you do a non-parametric test?
When the basic assumptions for a parametric test are not met
174
Non- parametric statistics are based on...?
* Comparisons of ranks of scores | * Comparisons of counts(yes/no) or “signs” of score
175
Non- parametric statistics are ___ compared to parametric statistics
Non- parametric statistics are *less powerful* compared to parametric statistics
176
What kind of parametric test do you perform when you have 2 independent groups?
Unpaired t-test
177
What kind of parametric test do you perform when you have 2 related scores?
Paired t-test
178
What kind of parametric test do you perform when you have 3 or more independent groups?
One-way analysis of variance (ANOVA) (F)
179
What kind of parametric test do you perform when you have 3 or more related scores?
One-way repeated measures analysis of variance (F)
180
What kind of non-parametric test do you perform when you have 2 independent groups?
Mann-Whitney U test
181
What kind of non-parametric test do you perform when you have 2 related scores?
- Sign test | - Wilcoxon signed ranks test (T)
182
What kind of non-parametric test do you perform when you have 3 or more independent groups?
- Kruskal-Wallis analysis of variance by ranks (H or x^2)
183
What kind of non-parametric test do you perform when you have 3 or more related scores?
Friedman two way analysis of variance by ranks
184
True or false: You're able to perform a non-parametric test on complex designs like a 2 x 3
False. Non-parametric tests cannot be performed on more complex designs (e.g. 2x3)
185
What question is being asked in the comparison based on ranks in a non-parametric test?
Is the difference in ranks larger than would be expected by chance alone?
186
What question is being asked in the comparison based on signs in a non-parametric test?
Is the difference in sign frequencies larger than would be expected by chance alone?
187
What type of test do we use when the IV and DV are both on the nominal level?
Chi- Square
188
What are you looking at in a chi-square?
Are observed frequencies different than expected frequencies
189
What are the 2 types of chi square?
* Goodness of fit | * Tests of independence (association)
190
What do you do in the goodness of fit chi square test?
Compare the observed frequencies of 1 variable to the expected (e.g. uniform) frequencies
191
What is an example of the goodness of fit chi square test?
• Eg: flip coin 50 times. Get 15 heads & 35 tails. Is this difference due to chance or a “real” bias?
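The coin example can be worked through as a goodness-of-fit test against the uniform expectation of 25 heads and 25 tails. This sketch uses only the standard library; the p-value shortcut via `erfc` holds only for 1 degree of freedom, and the function names are illustrative.

```python
from math import erfc, sqrt

def chi_square_gof(observed, expected):
    """Chi-square statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))

observed = [15, 35]   # heads, tails from the 50 flips
expected = [25, 25]   # a fair coin
chi2 = chi_square_gof(observed, expected)
p = p_value_df1(chi2)
print(chi2, p)
```

With alpha at 0.05, a p-value this small would lead to rejecting the null hypothesis that the coin is fair.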
192
____ chi square test is much more common?
Tests of independence (association)
193
What do you do in the tests of independence (association) chi square test?
Compare observed frequencies from 1 variable to observed frequencies of another variable
194
What is an example of the tests of independence (association) chi square test?
Eg: Is owning a mac laptop related to gender?
195
What is the McNemar test?
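A chi-square-type test for correlated (non-independent) samples. It is needed because a requirement of the ordinary chi-square is that variable levels must be independent (e.g. can’t be “healed” and “unhealed”)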
196
___ is the form of a chi square test that is used for 2x2 with correlated sample
*McNemar test* is the form of a chi square test that is used for 2x2 with correlated sample
197
What is a phi coefficient?
A correlation coefficient for 2 nominal variables/ degrees of association for 2x2
198
The phi coefficient is based off the ___
The phi coefficient is based off the *chi-square test*
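For a 2x2 table, the link between phi and chi-square is phi² = χ² / n. The sketch below uses a hypothetical table with cells a, b (first row) and c, d (second row) and no continuity correction; the function names are made up.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the same 2x2 table (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

a, b, c, d = 20, 10, 10, 20   # hypothetical counts
phi = phi_coefficient(a, b, c, d)
chi2 = chi_square_2x2(a, b, c, d)
print(phi, chi2, chi2 / (a + b + c + d))   # last value equals phi squared
```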
199
What is the IV level of measurement for a t- test?
Nominal
200
What is the IV level of measurement for an ANOVA?
Nominal
201
What is the IV level of measurement for a non parametric test?
Nominal
202
What is the DV level of measurement for a t- test?
Continuous
203
What is the DV level of measurement for an ANOVA?
Continuous
204
What is the DV level of measurement for a non parametric test?
Ordinal
205
What is the question asked with a t-test?
Difference between means?
206
What is the question asked with an ANOVA?
Difference between means?
207
What is the question asked with a non parametric test?
Ranks different?
208
What is the IV level of measurement for a correlation?
Continuous
209
What is the IV level of measurement for a regression?
Continuous
210
What is the DV level of measurement for a correlation?
Continuous
211
What is the DV level of measurement for a regression?
Continuous
212
What is the question asked with a correlation?
Strength of association?
213
What is the question asked with a regression?
Strength of prediction?
214
What does a correlation have to do with?
A pair of scores and how much they co-vary
215
What does it mean for something to co-vary?
Directly or inversely proportional. When one is high, so is the other and vice versa
216
What are the things that a correlation looks at?
* Do they vary together (covary)? * How strong is their linear relationship? * What is the nature of the relationship?
217
A correlation has to be ___
A correlation has to be *linear*
218
What is a correlation coefficient?
A number that quantifies the strength of a linear relationship that can range from -1 to 1
219
What does it mean when a correlation coefficient is closer to 1, whether positive or negative?
Closer to |1.00|, higher strength of relationship
220
What does the sign of the correlation coefficient indicate?
The direction
221
The tighter the grouping of the linear relationship, the ___ the correlation coefficient
The tighter the grouping of the linear relationship, the *higher* the correlation coefficient
222
What does a 0.00- 0.25 correlation coefficient mean?
Little or no relationship
223
What does a 0.26- 0.50 correlation coefficient mean?
Fair relationship
224
What does a 0.51- 0.75 correlation coefficient mean?
Moderate to good
225
What does a 0.76- 1.00 correlation coefficient mean?
Good to excellent
226
What is the coefficient of determination?
The square of the correlation coefficient
227
What is the coefficient of determination equal to?
The percent of variance in one variable that is explained (or accounted for) by the other variable
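A short worked example of the correlation coefficient and its square. The paired scores are invented for illustration, and `pearson_r` is a hand-rolled helper rather than a library call.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation between paired scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))   # how the pairs co-vary
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]   # hypothetical paired scores
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r_squared = r ** 2    # coefficient of determination
print(round(r, 3), round(r_squared, 3))
```

Here r is about 0.77, so r² is 0.6: about 60% of the variance in one variable is accounted for by the other.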
228
What is the significance of the correlation coefficient?
To test the null hypothesis
229
What is the null hypothesis as it relates to the correlation coefficient?
The correlation between variable x and variable y is not significantly different from zero.
230
The significance of the correlation coefficient is very sensitive to ___
The significance of the correlation coefficient is very sensitive to *sample size*
231
What is the most common type of correlation coefficient?
Pearson Product-Moment Correlation Coefficient (r)
232
When is the Pearson Product-Moment Correlation Coefficient applicable?
When both variables continuous (Interval or Ratio scale)
233
What is the Spearman Rank (rho) Correlation Coefficient (rs)?
Non-parametric analog of Pearson r
234
When is the Spearman Rank (rho) Correlation Coefficient (rs) applicable?
When 1 continuous, 1 ordinal variable or 2 ordinal variables
235
When do you use a Point Biserial Correlation (rpb)?
When one variable is dichotomous, and the other variable continuous (interval or ratio)
236
When does a Point Biserial Correlation (rpb) not work?
With a non-dichotomous nominal variable (e.g. Age & Race)
237
Computationally, a Point Biserial Correlation (rpb) is the same as a ___
Computationally, a Point Biserial Correlation (rpb) is the same as a *Pearson’s r*
238
The results of a Point Biserial Correlation (rpb) are the same as ___
The results of a Point Biserial Correlation (rpb) are the same as *a t-test*
239
When do you use a Rank Biserial Correlation (rrb)?
When one variable is dichotomous (nominal), and the other variable is ordinal
240
A Rank Biserial Correlation (rrb) is computationally about the same as ___
A Rank Biserial Correlation (rrb) is computationally about the same as *Spearman Rank*
241
When do you use a Phi coefficient (Φ)?
When both variables are dichotomous
242
A Phi coefficient (Φ) is computationally the same as ___ (special case)
A Phi coefficient (Φ) is computationally the same as *Pearson’s r* (special case)
243
A scatterplot is ___ with a Phi coefficient (Φ)
A scatterplot is *worthless* with a Phi coefficient (Φ)
244
Can a Phi coefficient (Φ) work with a non-dichotomous nominal variable?
NO
245
A Phi coefficient (Φ) is similar to a ____, but unlike it, a Phi coefficient (Φ) gives the strength of the relationship, while the ___ only gives statistical significance
A Phi coefficient (Φ) is similar to a *chi-square test*, but unlike it, a Phi coefficient (Φ) gives the strength of the relationship, while the *chi-square test* only gives statistical significance
246
A correlation does not tell you ___
Does NOT assess differences or agreement
247
How can an extreme outlier affect the interpretation of a correlation?
Can create inflated correlation with only a few extreme data points
248
Can correlation data be generalized beyond the range of scores in the sample?
Can’t generalize beyond range of scores in sample
249
Low correlation may be due to ___ range
Low correlation may be due to limited range
250
What is reliability?
Extent to which a measurement is consistent and free from error
251
What can a reliable measurement be expected to do?
A reliable measure can be expected to repeat the same score on two different occasions provided that the characteristic of interest does not change
252
Reliability is closely tied to the concept of ___
Reliability is closely tied to the concept of *measurement error*
253
What are the continuous data reliability coefficients?
* Pearson correlation (r) | * Intraclass correlation coefficient (ICC) (best)
254
What are the discrete/ categorical data reliability coefficients?
* Percent agreement | * Kappa (best)
255
What are the problems with using a Pearson correlation (r) to quantify reliability?
1. Assesses relationship, not agreement | 2. Only two raters or occasions could be compared
256
Why do we prefer to use ICCs and Kappa for quantifying reliability?
Both ICCs and kappa give single indicators of reliability that capture strength of relationship plus agreement in a single value
257
____ is stated in terms of variance
*The reliability coefficient* is stated in terms of variance
258
What is the range of a reliability coefficient and what does it mean?
Range 0–1: 0 = no reliability, 1 = perfect reliability
259
The more error variability you have, the ____ your reliability coefficient will be
The more error variability you have, the *lower* your reliability coefficient will be
260
Reliability coefficient will be bigger, when ___ is larger
Reliability coefficient will be bigger, when *true variance* is larger
261
What is the equation for the reliability/ correlation coefficient?
True score variability divided by true score variability plus error variability
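The ratio on the card above can be written as a one-line function. This is only a sketch with made-up variance values; the function name is an assumption for the example.

```python
def reliability_coefficient(true_variance, error_variance):
    """True score variance divided by (true score variance + error variance)."""
    return true_variance / (true_variance + error_variance)

# Hypothetical variances, for illustration only
baseline = reliability_coefficient(true_variance=9.0, error_variance=1.0)
more_error = reliability_coefficient(true_variance=9.0, error_variance=9.0)
less_true = reliability_coefficient(true_variance=1.0, error_variance=1.0)

print(baseline, more_error, less_true)
```

Raising the error variance or shrinking the true score variance both pull the coefficient down, which matches the cards around this one.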
262
What does a high error variability do to correlation coefficient?
It will reduce it
263
What will not having enough true score variability do to correlation coefficient?
It will reduce it
264
What will happen to the correlation coefficient with a large true variance?
It will be bigger
265
What are the things that an ICC measures?
Measures degree of relationship (association) and agreement simultaneously
266
ICCs give ____ estimate of reliability (can compare different things)
ICCs give *standardized* estimate of reliability
267
ICC is often reported in conjunction with ____
ICC is often reported in conjunction with *standard error of the measurement (SEM)*
268
ICC is designed for____ data but can be used with ___ data
ICC is designed for *interval/ ratio* data but can be used with *ordinal* data
269
When can ICC be used with ordinal data?
If intervals “assumed” to be equivalent
270
SEM gives ____ estimate of reliability (i.e. in units of measurement)
SEM gives “unstandardized” estimate of reliability (i.e. in units of measurement)
271
What do the 6 types of ICC depend on?
* Purpose of study * Design of study * Type of measurements taken
272
ICC type defined by ___
ICC type defined by *two numbers in parentheses*
273
What does each number in the parenthesis of an ICC type mean?
The first number is the model and the second number is the form. For example, in ICC (2, 6), 2 = model and 6 = form
274
How many models of ICC are there?
3
275
What is model 1 of an ICC?
* Each subject measured by a different set of raters; raters “randomly” chosen * Rarely used in clinical research
276
What is model 2 of an ICC?
Each subject measured by same raters; raters “randomly” chosen & representative of rater population; results generalizable
277
What is ICC model 2 commonly used for?
Most common for inter-rater reliability or test-retest reliability
278
What is model 3 of an ICC?
Each subject measured by same rater(s); raters are only ones of interest; results not generalizable
279
What is ICC model 3 commonly used for?
Most common for intra-rater reliability
280
Rank the models of ICC in order from most conservative to least conservative
- Model 1 (most conservative, lowest number) - Model 2 (neutral) - Model 3 (least conservative, highest number)
281
When can ICC model 3 be used for inter-rater reliability?
Can be for inter-rater reliability if study raters only ones of interest
282
What does the form/ 2nd number in parenthesis of an ICC represent?
Second number in parentheses represents number of observations used to obtain reliability estimate
283
When is form = 1?
If only one observation per subject per rater (or rating)
284
When is form a number more than 1?
If multiple observations averaged to get single number for analysis, form = number of observations averaged
285
What ICC is best for clinical measures?
ICC > 0.90
286
What ICC has good reliability?
ICC > 0.75
287
What ICC has poor to moderate reliability?
ICC < 0.75
288
The interpretation of an ICC depends on ____
The interpretation of an ICC depends on *intended use*
289
ICC estimate based on ____ will always be substantially higher than estimate based on ____
ICC estimate based on *average measures* will always be substantially higher than estimate based on *single measure*
290
What are the characteristics of reliability for categorical scales?
* Based on frequency table * Agreements are on the diagonal * Disagreements are all others
291
What is percent agreement?
How often the raters agree
292
How do you calculate percent agreement?
Divide number of agreements by total of all possible agreements
293
What is the problem with a percent agreement?
* Does not account for agreement due to chance | * Tends to overestimate reliability
294
What is the kappa coefficient?
Proportion of agreement between raters after chance agreement has been removed
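To make the difference between percent agreement and kappa concrete, here is a minimal sketch for two raters and a square frequency table. It assumes the unweighted (Cohen's) kappa, with chance agreement estimated from the row and column totals. The table values are made up.

```python
def agreement_stats(table):
    """Return (percent agreement as a proportion, unweighted kappa).

    table[i][j] = number of subjects rater A put in category i
    and rater B put in category j.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    # Agreements sit on the diagonal of the frequency table
    observed = sum(table[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance alone, from the marginal totals
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical 2 x 2 table for two raters
table = [[20, 5],
         [10, 15]]

observed, kappa = agreement_stats(table)
print(f"percent agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

With these counts the raters agree 70% of the time, but half of that is expected by chance, so kappa is only about 0.40. This is the sense in which percent agreement tends to overestimate reliability.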
295
On what kind of data is a kappa coefficient used?
Can be used on both nominal and ordinal data
296
What does a weighted kappa do?
Can choose to make “penalty” worse for larger disagreements
297
What can the weight of a weighted kappa be?
Weights can be arbitrary, and symmetric or asymmetric
298
A weighted kappa is best for what kind of data?
Best for ordinal data
299
The kappa interpretation depends on ____
The kappa interpretation depends on *the weights used*
300
What does a kappa value of <0.4 mean?
Poor to Fair agreement beyond chance
301
What does a kappa value of 0.4–0.6 mean?
Moderate agreement beyond chance
302
What does a kappa value of 0.6–0.8 mean?
Substantial agreement beyond chance
303
What does a kappa value of 0.8–1.0 mean?
Excellent agreement beyond chance
304
Internal consistency is often used to do what?
Often used to construct and evaluate scale / questionnaires
305
What does internal consistency estimate?
Estimate how well the items that reflect the same construct yield similar results. So, do different questions measure same concept or indicator?
306
What does Cronbach's alpha (α) do?
Represents correlation among items and correlation of each individual item with the total score
307
What range is recommended for Cronbach's alpha?
It is recommended that Cronbach’s alpha be between 0.70 and 0.90
308
Cronbach's alpha can have ___ or ____ on test/questionnaire
Cronbach's alpha can have *dichotomous or multiple-choice responses* on test/questionnaire
309
What can Cronbach's alpha (α) help eliminate?
Can help eliminate items from a test/questionnaire that are not homogeneous with the set or are not contributing unique information
310
What is response stability?
A way to quantify stability of repeated measures over time
311
Response stability is basically the same as ___
Response stability is basically the same as *test-retest reliability*
312
What are the different ways to test response stability?
* SEM: standard error of the measurement * MDC: minimal detectable difference/change * CV: coefficient of variation
313
Standard error of measurement is an ___ measure of reliability, while ICC and kappa are ____ measures of reliability
Standard error of measurement is an *absolute* measure of reliability, while ICC and kappa are *relative* measures of reliability
314
SEM is in the same units of _____
SEM is in the same units of *measurement as the variable*
315
What is SEM theoretically?
Standard deviation of the distribution of theoretical multiple measurements
316
An SEM can be used to create a ____
An SEM can be used to create a *95% CI around a measurement*
317
What is the MDC?
Amount of change in a variable that must be achieved to reflect a true change/difference
318
___ is a mathematical multiple of SEM
*MDC* is a mathematical multiple of SEM
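The cards above do not give formulas, so the sketch below uses commonly cited ones as an assumption: SEM = SD × √(1 − ICC), a 95% CI of score ± 1.96 × SEM, and MDC95 = 1.96 × √2 × SEM. The input values are made up.

```python
import math

def sem(sd, icc):
    """Standard error of the measurement (assumed formula: SD * sqrt(1 - ICC))."""
    return sd * math.sqrt(1 - icc)

def ci95(score, sem_value):
    """95% confidence interval around a single observed score."""
    margin = 1.96 * sem_value
    return score - margin, score + margin

def mdc95(sem_value):
    """Minimal detectable change at 95% confidence (assumed: 1.96 * sqrt(2) * SEM)."""
    return 1.96 * math.sqrt(2) * sem_value

# Hypothetical values: SD of 10 units, ICC of 0.91, observed score of 50
sem_value = sem(sd=10.0, icc=0.91)
low, high = ci95(50.0, sem_value)
mdc = mdc95(sem_value)
print(f"SEM = {sem_value:.2f}, 95% CI = ({low:.2f}, {high:.2f}), MDC95 = {mdc:.2f}")
```

All three results are in the units of the original measurement, and the MDC is simply the SEM scaled by a constant.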
319
What is the coefficient of variation (CV)?
A standardized way to measure variability. (SD divided by the mean times 100)
320
What is the coefficient of variation helpful in comparing and why?
Unit-less, so is helpful comparing variability between two distributions on different scales
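A short sketch of the CV calculation, assuming the sample standard deviation is the SD meant on the card. The two data sets are made up and sit on different scales on purpose.

```python
import statistics

def coefficient_of_variation(values):
    """CV as a percentage: sample SD divided by the mean, times 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical measurements on two different scales
grip_strength_kg = [40, 42, 38, 41, 39]
walk_time_s = [10.0, 10.5, 9.5, 10.25, 9.75]

cv_grip = coefficient_of_variation(grip_strength_kg)
cv_walk = coefficient_of_variation(walk_time_s)
print(f"CV grip = {cv_grip:.2f}%, CV walk = {cv_walk:.2f}%")
```

The raw SDs differ because the units differ, but the CVs can be compared directly because the units cancel.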
321
What is an alternate form reliability?
Comparing different methods of testing same phenomenon with different instruments (goniometer vs inclinometer)
322
What analysis or agreement is seen with an alternate form reliability?
- Limits of agreement | - Bland-Altman analysis
323
What is a Bland-Altman plot?
A plot with the mean of the two measures on the x-axis and the difference between the two measures on the y-axis; the center line of the plotted differences is the bias
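The quantities behind a Bland-Altman plot can be sketched as follows. The limits of agreement are taken as bias ± 1.96 × SD of the differences, which is the usual convention but is an assumption here since the cards do not state it. The goniometer and inclinometer readings are made up.

```python
import statistics

def bland_altman(method_a, method_b):
    """Return x values (pair means), y values (pair differences), bias, and limits."""
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)      # center line of the plot
    sd_diff = statistics.stdev(diffs)  # spread of the differences
    limits = (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff)
    return means, diffs, bias, limits

# Hypothetical paired readings in degrees
goniometer = [30, 45, 60, 75, 90]
inclinometer = [28, 44, 57, 74, 87]

means, diffs, bias, limits = bland_altman(goniometer, inclinometer)
print(f"bias = {bias}, limits of agreement = ({limits[0]:.2f}, {limits[1]:.2f})")
```

A bias of 0 would mean neither instrument reads systematically higher, and narrower limits indicate closer agreement between the two methods.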
324
What does a tighter range on the Bland-Altman plot mean?
There is more agreement between the two measures
325
When is there no bias on a Bland-Altman plot?
When the line of bias is at 0
326
When is there a consistent bias on a Bland-Altman plot?
When the points on the plot are on one side of the bias line
327
When is there an asymmetrical bias on a Bland-Altman plot?
When the points are split between the two sides of the bias line
328
What is epidemiology?
The study of the determinants of disease, injury, or dysfunction in populations
329
Epidemiology is another way of saying ____
Epidemiology is another way of saying *risk*
330
Risk in PT can be expressed in terms of _____
• Experiencing an adverse outcome • Patients not improving with treatment • Requiring more invasive or expensive subsequent interventions in spite of treatment
331
Epidemiology generally uses observational designs with ___ variables
Epidemiology generally uses observational designs with *dichotomous* variables
332
What studies are intended to study risk factors?
Case-Control & Cohort Studies
333
Case-Control & Cohort Studies look at the ____ between disease & exposure
Case-Control & Cohort Studies look at the *association (“cause”)* between disease & exposure
334
The IV and DV in case-control & cohort studies are what kind of variables?
Dichotomous
335
In case-control & cohort studies, there is ___ strength in thinking something is causal of the other
In case-control & cohort studies, there is *less* strength in thinking something is causal of the other
336
How are subjects in a cohort study selected?
Subjects selected based on exposure or not
337
Is a cohort study usually prospective or retrospective?
Usually prospective, but can be prospective or retrospective
338
Does a cohort study work for rare conditions?
Doesn’t work well for very rare conditions
339
What does a cohort study examine?
Examine if there is a different incidence of disease
340
How are subjects in a case control study selected?
Subjects selected based on whether or not they have disorder
341
Where should the controls of a case control be selected from?
Controls should be selected from same population as Cases
342
What does a case-control study examine?
Examine if exposure is different between cases and control
343
What condition does a case control work especially well for?
Works especially well for very rare conditions
344
What are the primary ways to quantify risk?
* Relative Risk (RR) | * Odds Ratios (OR)
345
What do the primary ways to quantify risk actually quantify?
Both quantify strength of association between “exposure” and “disease”
346
In what study is RR used and in what study is OR used?
* RR in Cohort studies | * OR in Case-control studies
347
What does it mean when an RR or OR = 1 ?
* = “null value” | * No association between an exposure and a disease
348
What does it mean when an RR or OR > 1?
* A positive association between an exposure and a disease | * The exposure is considered to be harmful
349
What does it mean when an RR or OR < 1?
* A negative association between an exposure and a disease | * The exposure is protective
350
RR is the ratio of ___ compared to ____
Incidence of disease among exposed individuals compared to Incidence of disease among unexposed individuals
351
Since subjects in an OR (case-control) analysis are selected based on whether they have the disease or not, you can’t determine the rate of ___
Since subjects in an OR (case-control) analysis are selected based on whether they have the disease or not, you can’t determine the rate of “incidence”
352
OR is the ratio of ___ compared to ____
Odds of exposure among cases (with disease) compared to Odds of exposure among controls (w/o disease)
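The two ratios defined on the RR and OR cards can be sketched directly from counts. All counts and parameter names below are made up for illustration.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence among the exposed divided by incidence among the unexposed (cohort data)."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (cases_exposed / cases_unexposed) / (controls_exposed / controls_unexposed)

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the disease
rr = relative_risk(30, 100, 10, 100)

# Hypothetical case-control study: 40 of 100 cases and 20 of 100 controls were exposed
odds = odds_ratio(40, 60, 20, 80)

print(f"RR = {rr:.2f}, OR = {odds:.2f}")
```

Both values are above 1 here, which the cards read as a positive association between exposure and disease.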
353
The computation of OR is kinda like ___
The computation of OR is kinda like *kappa*
354
____ uses relationships (correlation) as a basis for prediction
*Regression* uses relationships (correlation) as a basis for prediction
355
What are the characteristics of a linear regression?
• X and Y are correlated • X = independent variable (= predictor variable) • Y = dependent (or criterion) variable • We use X to predict Y • The value of Y depends on X (that's why Y is called the dependent variable)
356
What is the error from line/ residual in a regression line?
The distance between each data point and the line of best fit
357
Residuals are squared to eliminate ___ and penalize for ___
Residuals are squared to eliminate *sign* and penalize for *worse errors*
358
What is the line of best fit?
Line with least squared errors
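A minimal sketch of the idea on the cards above: fit a line by least squares, compute the sum of squared residuals, and confirm that nudging the slope makes that sum larger. The data and function names are made up for the example.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_residuals(x, y, slope, intercept):
    """Sum of squared vertical distances between each point and the line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
best = sum_squared_residuals(x, y, slope, intercept)
worse = sum_squared_residuals(x, y, slope + 0.1, intercept)
print(f"Y = {intercept:.2f} + {slope:.2f} X, squared error = {best:.2f}")
print(f"squared error with a nudged slope = {worse:.2f}")
```

Any other line through the same data gives a larger sum of squared residuals, which is what makes the fitted line the line of best fit.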
359
Is regression a parametric or non parametric statistic?
Parametric
360
What are the assumptions of a linear regression analysis?
1. Linear relationship = approximation of true line in population 2. For every X there is a normal distribution of Y • Sample data include random samplings from these distributions on Y 3. Homogeneity of variance
361
What is a way to test the assumptions of a linear regression?
Analysis of residuals by: Plot Residuals on Y-axis, vs predicted values on x-axis
362
What assumption of linear regression does the analysis of residuals test the most?
Homogeneity of variance
363
What are you looking for in the analysis of residuals to test linear regression assumptions (assumptions are met)?
Looking for the residuals (the distances between the predicted values and the actual values) to be symmetric and consistent throughout
364
What does the analysis of residuals graph look like when the assumptions of linear regression are not met?
- The graph gets wider the further it goes (the data are further away from the line the higher you go) - Data is not symmetric
365
What happens if the linear regressions assumptions are not met?
Use a non linear regression
366
What are the things that help a researcher determine whether to retain or discard data with an outlier?
• Due to peculiar circumstances? • Can discard if error identified • Generally not justified on statistical grounds alone
367
What are the peculiar circumstances that have to be taken into consideration when determining whether to retain or discard data?
* Measurement error * Recording error * Equipment malfunction * Miscalculation * Aberrant subject (should have been excluded)
368
What are the things that look at the accuracy of prediction of the regression equation?
• Correlation coefficient (R) • Coefficient of determination (R²) • ANOVA of regression
369
What are the characteristics of a correlation coefficient as it relates to the accuracy of prediction?
* Rough indicator of goodness of fit for regression line | * Same as correlation coefficient (r)
370
What does the coefficient of determination represent?
Proportion of variance in Y scores that can be explained by X scores
371
What does the ANOVA of regression test?
Tests hypothesis that predictive relationship occurred by chance (Ho: b = 0)
372
What does it mean when b=0 in an ANOVA of regression?
If b (slope) = 0, line is horizontal = no relationship
373
What happens when p < alpha in an ANOVA of regression?
If p < alpha, reject the null and conclude the predictive relationship is significant
374
How many predictors are in a simple linear regression model and how many are in a multiple linear regression model?
There is only 1 predictor in a simple model and there are multiple predictors in a multiple linear regression model
375
What are the assumptions of a multiple linear regression analysis?
1. Linear relationship = approximation of true line in population 2. For every X there is a normal distribution of Y • Sample data include random samplings from these distributions on Y 3. Homogeneity of variance 4. DV = continuous measure
376
Coefficient of determination is the square of ____
Coefficient of determination is the square of *correlation coefficient*
377
What is an adjusted R squared and what do you get punished for?
Chance-corrected R²; you get punished for having more predictor variables
378
What is the goal of a linear regression?
The more you can predict with fewer variables, the better
379
What is a regression coefficient?
* The value/slope in the linear equation | * The rate of change in Y for each unit change of X
380
What is a standardized beta weight helpful for?
Helpful to know relative contribution of each predictor variable
381
Which will always be higher or the same, out of an R square or an adjusted R square?
The R square will always be higher than or equal to the adjusted R square
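The penalty for extra predictors can be seen with the usual adjusted R² formula, 1 − (1 − R²)(n − 1)/(n − p − 1). The formula is not given on the cards, so treat it as an assumption of this sketch. The R² value and sample size are made up.

```python
def adjusted_r_squared(r_squared, n_subjects, n_predictors):
    """Adjusted R^2 (assumed formula): shrinks R^2 as predictors are added."""
    return 1 - (1 - r_squared) * (n_subjects - 1) / (n_subjects - n_predictors - 1)

# Hypothetical model: same R^2 and sample size, different numbers of predictors
r_squared = 0.50
adj_one = adjusted_r_squared(r_squared, n_subjects=30, n_predictors=1)
adj_five = adjusted_r_squared(r_squared, n_subjects=30, n_predictors=5)

print(f"R^2 = {r_squared:.3f}, adjusted (1 predictor) = {adj_one:.3f}, "
      f"adjusted (5 predictors) = {adj_five:.3f}")
```

With the same R², the five-predictor model ends up with the lower adjusted value, and neither adjusted value exceeds the unadjusted R².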
382
What is multicollinearity?
When the Xs in the model are substantially correlated with each other
383
What does multicollinearity create a problem with?
Creates problems with interpretations of b weights
384
What is the risk of the forced entry of all possible predictors in a multiple regression method?
* Risk of multicollinearity (correlation between predictors) * Risk of retaining non-contributing predictors * Risk of more predictors than justified by sample size
385
How are the criteria in a stepwise procedure set?
Criteria set to retain or reject predictors
386
Which predictor is entered first in a stepwise procedure?
Predictor with highest partial correlation entered first
387
What does a stepwise procedure result in?
Should result in model with greatest parsimony and least multicollinearity
388
What is a parsimonious model?
A model that is the most predictive, with the fewest variables
389
What is a simple correlation?
The overlap between 2 variables
390
What is a partial correlation?
The unique correlation between 2 variables
391
What is a forward stepwise regression method?
A method that starts with no predictors, then adds them, starting with the strongest
392
What is a backward stepwise regression method?
A method that starts with all predictors, then removes them, starting with the weakest
393
What is a stepwise stepwise regression method?
A method that starts with no predictors, then adds them, but can also remove them
394
What is the level of measurement for predictors/ IV in a stepwise multiple linear regression model?
* Most predictors are continuous scales * Can also use dichotomous or ordinal scale predictors * But not multicategory nominal (e.g. race)
395
A large number of predictors in a regression requires ___
A large number of predictors in a regression requires *a very large sample size*
396
What is the rule of thumb for the predictors of a stepwise multiple linear regression model?
At least 10-15 subjects per predictor in model
397
What happens if there are too many or too few predictors in a stepwise multiple linear regression model?
The model becomes susceptible to “overfit” (chance associations, i.e. Type I error).
398
What is a logistic regression?
When you are trying to predict a dichotomous variable
399
What is the DV level of measurement of a logistic regression?
Dichotomous
400
What is the predictor/ IV level of measurement of a logistic regression?
Continuous, ordinal, or dichotomous
401
What are the pros of MANOVA?
• MANOVA gets around multiplicity problem (familywise alpha: increased Type I error risk) • MANOVA can be more powerful if DVs related
402
What are the cons of MANOVA?
• “Combo DV” is not directly interpretable • If statistically significant, then must follow up with post-hoc ANOVAs
403
What is a factor analysis?
Method of simplifying & organizing large sets of variables into fewer abstract components
404
What is a path analysis?
Visual modeling of both direct & indirect relationships
405
Path analysis is an extension of ____
Path analysis is an extension of *multiple regression*
406
Compared to a multiple regression, a path analysis is more __ and ____
Compared to a multiple regression, a path analysis is more *flexible and comprehensive*
407
What can a path analysis analyze?
Can analyze both direct and indirect relationships between 1 or more exogenous variables (IVs) and 1 or more endogenous variables (DVs)
408
What is hierarchical linear modeling also known as?
* Multilevel linear modeling | * Linear mixed modeling
409
Hierarchical linear modeling comes from what type of analysis?
The type of analysis where you have some variables nested within other variables (students nested in a classroom when studying schools)
410
Hierarchical linear modeling has far fewer __ and is highly ___
Hierarchical linear modeling has far fewer *assumptions* and is highly *flexible*
411
What is the Number Needed to Treat (NNT)?
How many patients you have to provide treatment to in order to prevent one bad outcome
412
What is Control Event Rate (CER)?
Percent of patients in control group with bad outcome
413
What is Experimental Event Rate (EER)?
Percent of patients in experimental group with bad outcome
414
What is the equation for RR?
EER/CER
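The last few cards can be tied together in one small sketch. RR = EER/CER comes from the card above; the NNT formula used here, 1 / (CER − EER), is the standard one but is not stated in the cards, so treat it as an assumption. The event rates are made up, and the function assumes CER and EER differ.

```python
def risk_summary(cer, eer):
    """Return (relative risk, absolute risk reduction, number needed to treat).

    cer, eer: proportions (0 to 1) of bad outcomes in the control and
    experimental groups. Assumes cer != eer, otherwise NNT is undefined.
    """
    rr = eer / cer    # relative risk, as on the card
    arr = cer - eer   # absolute risk reduction
    nnt = 1 / arr     # assumed formula for NNT
    return rr, arr, nnt

# Hypothetical trial: 20% bad outcomes in controls, 10% with treatment
rr, arr, nnt = risk_summary(cer=0.20, eer=0.10)
print(f"RR = {rr:.2f}, ARR = {arr:.2f}, NNT = {nnt:.1f}")
```

In this made-up trial the treatment halves the risk (RR of 0.5), and about 10 patients would need to be treated to prevent one bad outcome. In practice NNT is usually rounded up to a whole number of patients.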