stats final mcq Flashcards
Covariation
an unstandardized statistical measure summarizing the general pattern of association (or lack thereof) between two continuous variables
Covariation is a measure of the degree to which
two variables change together.
Positive covariation occurs when
two variables tend to increase or decrease together.
Negative covariation occurs when
one variable tends to increase while the other decreases
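A minimal sketch of how the sign of covariation shows up in a calculation, assuming the usual sample covariance formula (sum of cross-products of deviations divided by n - 1). The data and the variable names are made up for illustration.

```python
def sample_covariance(x, y):
    """Sample covariance: cross-products of deviations from the means, over n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

# Hypothetical data for five students
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]   # tends to rise with hours studied
exam_errors = [10, 8, 7, 4, 1]      # tends to fall with hours studied

positive_cov = sample_covariance(hours_studied, exam_score)
negative_cov = sample_covariance(hours_studied, exam_errors)

print(positive_cov)  # positive: the variables increase together
print(negative_cov)  # negative: one increases while the other decreases
```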
Covariation doesn’t necessarily mean
causation
Covariation can be influenced by
other variables
Covariation can be used to make
predictions!
Sometimes, two variables might appear to be positively or negatively correlated, but the relationship is actually
being influenced by a third variable.
Although covariation is a useful measure for understanding whether a relationship between two variables is positive, negative, or does not exist, it has some drawbacks:
- Covariation does not measure the strength of the relationship between two variables.
- It is not standardized, meaning its value can vary widely based on the units of the variables involved.
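The unit problem can be seen in a small sketch: the same heights recorded in meters and in centimeters give covariances that differ by a factor of 100, even though the relationship itself has not changed. The data are hypothetical, and the sample covariance formula (dividing by n - 1) is assumed.

```python
def sample_covariance(x, y):
    """Sample covariance: cross-products of deviations from the means, over n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

# Hypothetical heights and weights for five people
heights_m = [1.5, 1.6, 1.7, 1.8, 1.9]
weights_kg = [50, 58, 63, 72, 80]

# Same heights, different unit
heights_cm = [h * 100 for h in heights_m]

cov_m = sample_covariance(heights_m, weights_kg)
cov_cm = sample_covariance(heights_cm, weights_kg)

# The covariance is 100 times larger in centimeters than in meters
print(cov_m, cov_cm)
```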
Correlation coefficient
a measure that quantifies the strength and direction of the linear association between two continuous variables
the most commonly employed correlation coefficient
Pearson’s r
Pearson’s r:
a statistical measure that quantifies the strength and direction of the linear relationship between two continuous variables, ranging from -1 (perfect negative correlation) to 1 (perfect positive correlation).
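A minimal sketch of Pearson's r, computed as the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The function name and the data are illustrative only.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviations around the two means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # perfectly linear, increasing
y_down = [10, 8, 6, 4, 2]   # perfectly linear, decreasing

r_up = pearson_r(x, y_up)      # perfect positive correlation
r_down = pearson_r(x, y_down)  # perfect negative correlation
print(r_up, r_down)
```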
Correlation specifically assesses linear relationships; nonlinear relationships
may not be adequately represented by the correlation coefficient
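As an illustration, y = x² over a range symmetric around zero is a perfect (but curved) relationship, yet Pearson's r comes out as zero. This is a sketch with made-up data; `pearson_r` is a helper defined here, not a library function.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviations around the two means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]   # y is fully determined by x, but not linearly

r_curved = pearson_r(x, y)
print(r_curved)  # zero: no linear association despite a perfect curved one
```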
Correlation can be heavily influenced by
outliers, which may skew the results
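A small sketch of how much one outlier can matter: five points with a perfect negative correlation, then the same points plus a single extreme observation. The data are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviations around the two means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]

r_without = pearson_r(x, y)                # perfect negative correlation

# Add one extreme point far from the rest of the data
r_with = pearson_r(x + [20], y + [20])     # now strongly positive

print(r_without, r_with)
```

One added point is enough to flip the sign of r here, which is why it helps to plot the data before interpreting a correlation coefficient.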
Outlier
a data point that significantly deviates from the other observations in a dataset, often appearing as an unusually high or low value.
We can perform hypothesis testing to determine the
statistical significance of our correlation coefficient.
In hypothesis testing, reject H0 if
the absolute value of the test statistic |t| is greater than the critical value
Fail to reject H0 if
|t| is less than or equal to the critical value
If you rejected H0, conclude that there is
a statistically significant correlation between the variables
To report the results of your test, include the
correlation coefficient r, the test statistic t, the degrees of freedom, and the p-value associated with the test.
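A sketch of the test for a correlation coefficient, assuming the standard statistic t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. The data are hypothetical, and the critical value 3.182 is taken from a standard t table (two-tailed, α = 0.05, df = 3) rather than computed. The p-value is not computed here; it needs the t distribution, which a library such as SciPy provides.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviations around the two means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for five students
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]

n = len(hours_studied)
r = pearson_r(hours_studied, exam_score)
df = n - 2                                        # degrees of freedom for this test
t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)     # test statistic

critical_value = 3.182   # assumed: t table, two-tailed, alpha = 0.05, df = 3
reject_h0 = abs(t) > critical_value

print(f"r = {r:.3f}, t({df}) = {t:.2f}, reject H0: {reject_h0}")
```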
Choose a difference in means test
when testing 2 variables, if
- the independent variable is categorical and the dependent variable is numeric,
- the numeric dependent variable is normally distributed, and
- you are interested in the difference in the average values of the dependent variable across the categories of the independent variable.
Steps for hypothesis testing:
- State the null hypothesis.
- Set a critical value.
- Calculate a test statistic.
- Compare the test statistic to the critical value.
- Find the p-value.
- Compare the p-value of your data to the significance level (α) that was used to set the critical value.
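The steps above can be sketched for a difference in means test with two groups. This is only an illustration under several assumptions: the data are made up, a pooled-variance (equal variances) t test is used, the critical value 2.306 comes from a standard t table (two-tailed, α = 0.05, df = 8), and the p-value is approximated by numerically integrating the t density instead of calling a statistics library.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value using Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total   # area between -|t| and +|t|
    return max(0.0, 1 - central_area)

def mean(values):
    return sum(values) / len(values)

def sample_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Hypothetical scores for two categories of the independent variable
group_a = [70, 72, 68, 75, 71]
group_b = [80, 78, 85, 82, 79]

# Step 1: H0 says the two group means are equal.
# Step 2: set the critical value (assumed from a t table for df = 8).
alpha = 0.05
critical_value = 2.306

# Step 3: calculate the test statistic (pooled-variance t).
n_a, n_b = len(group_a), len(group_b)
df = n_a + n_b - 2
pooled_var = ((n_a - 1) * sample_variance(group_a)
              + (n_b - 1) * sample_variance(group_b)) / df
std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (mean(group_a) - mean(group_b)) / std_error

# Step 4: compare the test statistic to the critical value.
reject_by_t = abs(t_stat) > critical_value

# Steps 5 and 6: find the p-value and compare it to alpha.
p_value = two_tailed_p(t_stat, df)
reject_by_p = p_value < alpha

print(f"t({df}) = {t_stat:.2f}, p = {p_value:.4f}, reject H0: {reject_by_t and reject_by_p}")
```

Both comparisons lead to the same decision, since the critical value and the significance level describe the same cutoff.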
Identify the critical value.
To identify the critical value for this test, we need to know the sample size (N) and the number of categories in our categorical variable, which we can then use to calculate the degrees of freedom (df)
We calculate the sample size (N) as
the number of observations in our dataset for our independent and dependent variables
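A short sketch of counting N and the number of categories from a dataset. The cards do not give the degrees-of-freedom formula, so the line computing `df` assumes the common convention df = N − (number of categories), which gives N − 2 for two groups. The records are hypothetical.

```python
# Hypothetical dataset: (category of independent variable, value of dependent variable)
records = [
    ("control", 70), ("control", 72), ("control", 68), ("control", 75), ("control", 71),
    ("treated", 80), ("treated", 78), ("treated", 85), ("treated", 82), ("treated", 79),
]

# N: number of observations with both variables recorded
N = len(records)

# Number of categories in the categorical (independent) variable
num_categories = len({category for category, _ in records})

# Assumed convention: df = N minus the number of categories
df = N - num_categories

print(N, num_categories, df)
```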