quiz 5 (pt. 2) Flashcards

(20 cards)

1
Q

What is the simplest measure of the relationship between two scalar variables?

A

correlation

2
Q

Why would we want to compare two or more scalar variables in research?

A
  • Examine potential patterns or relationships
  • Identify shared variability
3
Q

What does Pearson’s coefficient of correlation (r) measure?

A

Strength and direction of the linear relationship between two variables.
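As a rough illustration of this definition, the sketch below computes r directly from deviations about the means. The function name `pearson_r` and the height/weight numbers are made up for the example; they do not come from the course material.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: heights (cm) and weights (kg)
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 77, 88]
r = pearson_r(heights, weights)
print(round(r, 3))  # close to +1: strong positive linear relationship
```

The sign of r gives the direction of the relationship and its magnitude (between 0 and 1) gives the strength.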

4
Q

What are the main limitations of Pearson’s correlation coefficient?

A
  1. It is a parametric measure
  2. It assumes normally distributed data
  3. It is sensitive to outliers
  4. It assumes a linear relationship
5
Q

What is the difference between correlation and causation?

A
  1. Correlation: refers to a statistical relationship between two variables, but it does not imply that one variable causes the other.
  2. Causation: means that a change in one variable directly produces a change in the other.
6
Q

What is an example of a situation where correlation does not imply causation?

A

Number of pirates and global temperatures. As the number of pirates decreased over time, global temperatures increased, but this does not mean the decline in pirates caused global warming; it is a spurious correlation.

7
Q

What type of correlation would you expect between height and weight in a population?

A

Positive correlation; height and weight tend to increase together in a population.

8
Q

What is Spearman’s rank correlation, and when is it used?

A

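Non-parametric version of correlation that works by ranking the data and then computing the correlation from the ranks (equivalently, when there are no ties, from the differences between paired ranks). It is used when the data do not meet the assumptions of Pearson’s correlation, for example when the relationship is monotonic but not linear.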
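One way to see the ranking step in action is the sketch below, which ranks each variable (averaging tied ranks) and then applies Pearson's formula to the ranks. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` and the data are illustrative assumptions, not part of the original cards.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's r computed from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y grows with x, but along a curve (y = x cubed)
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
print(rho, pearson_r(x, y))
```

Because the ranks of x and y agree perfectly here, Spearman's rho reaches its maximum even though the raw relationship is curved, while Pearson's r on the raw values falls short of 1.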

9
Q

What happens to the value of r if the axes are switched in a correlation plot?

A

If the axes are switched in a correlation plot, the value of r remains the same because correlation is symmetrical. The relationship between the variables is the same, regardless of which variable is placed on the x-axis or y-axis.

10
Q

What is an important consideration when interpreting significant correlations in observational studies?

A

In observational studies, even if a correlation is significant, we cannot assume that one variable causes the other. It’s important to consider potential confounding variables and the possibility of reverse causality or other underlying factors that may influence the observed relationship.

11
Q

What is linear regression used for in research?

A

Linear regression is used to build a mathematical model that describes the relationship between one or more predictor variables (independent variables) and a response variable (dependent variable). It helps to predict the value of the response variable based on the predictors.

12
Q

What is bivariate linear regression?

A

Bivariate linear regression involves using a single predictor variable and an intercept to explain or predict the variation in a response variable. For example, height might be used to predict weight using a straight-line model.

13
Q

How does multiple linear regression differ from bivariate linear regression?

A

Multiple linear regression involves two or more predictor variables. It accounts for the combined effect of multiple predictors, whereas bivariate linear regression uses only one predictor to explain the response variable.

14
Q

What does the formula y = mx + b represent in linear regression?

A

The formula y = mx + b is the equation of a straight line, where y is the predicted response, m is the slope (the change in y for a one-unit increase in x), x is the predictor variable, and b is the y-intercept (the value of y when x is zero).

15
Q

What is the purpose of using Ordinary Least Squares (OLS) in regression?

A

Ordinary Least Squares (OLS) is used to minimize the sum of the squared differences between the observed data points and the values predicted by the model. This helps to find the line that best fits the data.
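For a single predictor, the least-squares line y = mx + b has a closed-form solution. The sketch below is a minimal version of it; the function name `ols_fit` and the data points are assumptions made for illustration.

```python
def ols_fit(x, y):
    """Return (slope, intercept) of the least-squares line for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    m = sxy / sxx               # slope that minimizes squared residuals
    b = mean_y - m * mean_x     # the fitted line passes through the two means
    return m, b

# Hypothetical observations
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
m, b = ols_fit(x, y)

# Sum of squared residuals: the quantity OLS minimizes
sse = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
print(m, b, sse)
```

Any other choice of slope and intercept for these points would give a larger sum of squared residuals than `sse`.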

16
Q

What is R², and why is it important in linear regression?

A

R² is the coefficient of determination and measures the proportion of variation in the response variable that can be explained by the predictors. For a least-squares model with an intercept it lies between 0 and 1, and a higher R² indicates a better fit of the model to the data.
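A small sketch of how R² can be computed from observed and predicted values, using the usual definition 1 - SS_residual / SS_total. The function name `r_squared` and the numbers are hypothetical.

```python
def r_squared(y, y_hat):
    """Proportion of the variation in y explained by the predictions y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - pred) ** 2 for obs, pred in zip(y, y_hat))  # unexplained
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)                  # total
    return 1 - ss_res / ss_tot

# Hypothetical observed values and the fitted values from a regression line
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_hat = [2.04, 4.03, 6.02, 8.01, 10.00]
r2 = r_squared(y, y_hat)
print(round(r2, 4))
```

Predicting the mean of y for every observation gives R² = 0, and reproducing every observation exactly gives R² = 1.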

17
Q

What is the role of ANOVA in regression analysis?

A

ANOVA (Analysis of Variance) partitions the total variance of the data around the mean into the part explained by the regression model and the residual part that remains unexplained. It helps to determine whether the model significantly improves prediction compared to a simple null model that predicts only the mean.
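The comparison can be sketched as an F statistic: explained variance per predictor divided by residual variance per remaining degree of freedom. This sketch assumes the fitted values come from a least-squares model with an intercept, so that total variation splits cleanly into explained plus residual; the function name and data are hypothetical.

```python
def regression_f_statistic(y, y_hat, n_predictors):
    """F statistic comparing a fitted regression with the mean-only null model."""
    n = len(y)
    mean_y = sum(y) / n
    ss_total = sum((obs - mean_y) ** 2 for obs in y)
    ss_resid = sum((obs - pred) ** 2 for obs, pred in zip(y, y_hat))
    ss_model = ss_total - ss_resid          # valid for OLS with an intercept
    ms_model = ss_model / n_predictors      # mean square explained by the model
    ms_resid = ss_resid / (n - n_predictors - 1)  # mean square left unexplained
    return ms_model / ms_resid

# Hypothetical data and its least-squares fitted values (one predictor)
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_hat = [2.04, 4.03, 6.02, 8.01, 10.00]
f_stat = regression_f_statistic(y, y_hat, n_predictors=1)
print(round(f_stat, 1))
```

A large F suggests the model explains far more variation than the null model leaves unexplained. Converting F to a p-value requires the F distribution, which Python's standard library does not provide.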

18
Q

What is the Akaike Information Criterion (AIC), and how is it used in model selection?

A

The Akaike Information Criterion (AIC) is a measure used to compare different regression models fitted to the same data. It considers both the goodness of fit and the number of parameters in the model. A lower AIC indicates a better model, balancing fit and simplicity.
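To make the trade-off concrete, the sketch below uses a common least-squares form of AIC, n·ln(RSS/n) + 2k, which assumes normally distributed errors and drops a constant that is the same for every model fitted to the same data. The function name, the residual sums of squares, and the parameter counts are all made up for illustration, and conventions for counting k vary between textbooks and software.

```python
import math

def aic_least_squares(rss, n, k):
    """AIC (up to an additive constant) for a least-squares model.

    rss: residual sum of squares, n: number of observations,
    k: number of estimated parameters (counting convention is an assumption).
    """
    return n * math.log(rss / n) + 2 * k

n = 50  # hypothetical sample size

# Model A: fewer parameters, slightly worse fit (hypothetical numbers)
aic_a = aic_least_squares(rss=120.0, n=n, k=3)
# Model B: three extra parameters, only slightly better fit
aic_b = aic_least_squares(rss=118.5, n=n, k=6)

best = "A" if aic_a < aic_b else "B"
print(round(aic_a, 2), round(aic_b, 2), best)
```

Here the small improvement in fit from the extra predictors does not offset the penalty for the added parameters, so the simpler model has the lower AIC.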

19
Q

Why is it important to avoid over-parameterization in regression models?

A

Over-parameterization occurs when too many predictor variables are included in a model, leading to overfitting. This can result in a model that fits the sample data well but performs poorly on new or untested data. It’s important to include only significant predictors to improve generalizability.

20
Q

How can linear regression be applied to predict heart disease-related outcomes, like hypertension?

A

Linear regression can be used to predict heart disease-related outcomes, such as hypertension, by modeling the relationship between predictors like BMI, age, and cholesterol levels and a continuous outcome such as systolic blood pressure. For example, a model might use BMI as a predictor of systolic blood pressure, allowing clinicians to estimate a patient’s hypertension risk.