Lecture 6, 7 and 8 Flashcards

Question 1

Q

What does correlation coefficient ’r’ explain? Positive and negative correlation?

Answer

A

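When r is close to 1 or -1, there is a strong linear correlation between two variables. A positive r means the variables tend to increase together; a negative r means one tends to decrease as the other increases. An r close to 0 means there is little or no linear relationship.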
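
As a rough illustration of how r behaves, the sketch below computes the Pearson correlation by hand. The data and the names `hours`, `score_up`, and `score_down` are made up for this example and are not from the lectures.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]          # hypothetical data
score_up = [2, 4, 6, 8, 10]      # rises as hours rise
score_down = [10, 8, 6, 4, 2]    # falls as hours rise

r_pos = pearson_r(hours, score_up)    # perfectly positive relationship
r_neg = pearson_r(hours, score_down)  # perfectly negative relationship
print(r_pos, r_neg)
```

Real data would almost never give exactly 1 or -1; these values were chosen so the two extremes are easy to see.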

Question 2

Q

Why do we use OLS linear regression models?

Answer

A

To model the linear relationship between a dependent variable and one or more independent variables. OLS chooses the line that minimizes the sum of squared residuals.

Question 3

Q

What is the difference between a correlation matrix and a linear regression model?

Answer

A

A correlation matrix quantifies associations between pairs of variables but does not provide a predictive model, while a linear regression model aims to model and quantify the relationship between independent and dependent variables to predict outcomes and understand variable associations more deeply.

Question 4

Q

In the linear regression equation 𝑌 = 𝛼 + 𝛽 𝑋 + 𝜖, what are variables X and Y?

Answer

A

Y is the dependent variable (the outcome) and X is the independent variable (the predictor) that the equation uses to explain changes in Y.

Question 5

Q

In the linear regression equation 𝑌 = 𝛼 + 𝛽 𝑋 + 𝜖, what are parameters alpha and beta?

Answer

A

Alpha is the intercept: the value of Y when X is 0.
Beta is the slope: how much Y changes every time X increases by one unit.
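
A minimal sketch of how alpha and beta could be estimated by OLS in the one-predictor case, using the standard closed-form formulas. The function name `ols_fit` and the data are assumptions made for this example.

```python
def ols_fit(xs, ys):
    """Return (alpha, beta) for the line Y = alpha + beta * X fitted by OLS."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                  # slope: change in Y per one-unit change in X
    alpha = mean_y - beta * mean_x    # intercept: value of Y when X is 0
    return alpha, beta

xs = [0, 1, 2, 3, 4]   # hypothetical data
ys = [1, 3, 5, 7, 9]   # generated from y = 1 + 2x with no noise

alpha, beta = ols_fit(xs, ys)
print(alpha, beta)
```

Because the made-up data lie exactly on a line, the fit recovers an intercept of 1 and a slope of 2; with noisy data the estimates would only be close to the true values.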

Question 6

Q

In the linear regression model, why do we require variation in values of X? What happens
if there is no variation in values of X?

Answer

A

If there is no variation in the values of X, the slope coefficient cannot be estimated, so the model cannot tell us the strength or direction of the relationship between X and Y.
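
The reason can be seen in the one-predictor slope formula, which divides by the sum of squared deviations of X. A small sketch, with a hypothetical function name `slope` and made-up data:

```python
def slope(xs, ys):
    """OLS slope for one predictor; fails when X has no variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        # Every X equals the mean, so the formula would divide by zero
        raise ValueError("X has no variation; the slope is undefined")
    return sxy / sxx

try:
    slope([3, 3, 3, 3], [10, 12, 9, 11])   # hypothetical data, X is constant
    result = "estimated"
except ValueError as err:
    result = str(err)

print(result)
```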

Question 7

Q

What are dummy or indicator variables? Provide examples.

Answer

A

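Dummy (indicator) variables take the value 0 or 1 and are used to represent categorical data, converting it into a format that can be used in quantitative models.
Example: a variable that is 1 if a customer is a member and 0 otherwise. Also use an example from your own project.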
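
One way to build dummy variables by hand is sketched below. The helper `make_dummies`, the `fuel` column, and the choice of reference category are all assumptions for illustration, not taken from the project.

```python
def make_dummies(values, reference):
    """Return one 0/1 list per category, leaving out the reference category."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

fuel = ["petrol", "diesel", "electric", "petrol"]   # hypothetical nominal variable
dummies = make_dummies(fuel, reference="petrol")
print(dummies)
```

Leaving one category out as the reference avoids a perfect linear relationship among the dummies; each remaining coefficient is then read relative to that reference category.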

Question 8

Q

How do we include nominal independent variables in regression analysis?

Answer

A

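We convert them into dummy variables: one 0/1 variable for each category except one, which serves as the reference category. We have done that in our project, so refer to that.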

Question 9

Q

What is an outlier and why can they be a problem in your analysis? Provide 1 or 2 ways
in dealing with outliers.

Answer

A

An outlier is an observation or data point that is significantly different from the rest of the dataset. Outliers can be a problem because they can distort results such as the mean or the regression line.
- Identify and examine: make a visualization to identify the outliers.
- You can replace or remove the outlier, but it will affect your results.
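
Besides plotting, outliers can be flagged numerically. The sketch below uses the common 1.5 x IQR rule, which is an assumption here and not a method named on the card; the data and the name `iqr_outliers` are also made up.

```python
from statistics import mean, quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)   # first, second and third quartile
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

prices = [10, 12, 11, 13, 12, 11, 95]    # hypothetical data with one extreme value
outliers = iqr_outliers(prices)
cleaned = [v for v in prices if v not in outliers]

# Compare the mean with and without the flagged value
print(outliers, mean(prices), mean(cleaned))
```

The comparison of the two means shows why a single extreme value matters: it pulls the average far away from where most of the data sit.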

Question 10

Q

In our linear regression model output, an R-squared (R²) is reported. What does it mean and
what do we use it for?

Answer

A

R² tells us how much of the variability in the dependent variable is explained by the model. If it is close to 0, the model explains almost none of the variability; if it is close to 1, the model explains almost all of it. We use it to judge how well the model fits the data.
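
A short sketch of the usual definition, R² = 1 - SS_res / SS_tot, with made-up observations and two hypothetical sets of predictions to show the two extremes:

```python
def r_squared(ys, predicted):
    """Share of the variability in ys that the predictions account for."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))   # unexplained variation
    ss_tot = sum((y - mean_y) ** 2 for y in ys)                 # total variation
    return 1 - ss_res / ss_tot

ys = [2, 4, 6, 8]                          # hypothetical observations
perfect = r_squared(ys, [2, 4, 6, 8])      # predictions match every point
mean_only = r_squared(ys, [5, 5, 5, 5])    # predictions are just the mean of ys

print(perfect, mean_only)
```

A model that predicts every point exactly has R² of 1, while one that only ever predicts the mean explains none of the variability and has R² of 0.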

Question 11

Q

Can we assume that the regression model with the highest number of predictors is the best
model? Why or why not.

Answer

A

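No. More predictors can just add noise to the model. R² is a useful indicator of fit, but it never decreases when a predictor is added, so the model with the most predictors is not automatically the best; adjusted R² penalizes extra predictors and is a fairer comparison.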
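
To see why the number of predictors matters, the sketch below applies the standard adjusted R² formula to two hypothetical models. The R² values, sample size, and predictor counts are invented for illustration.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

n = 30   # hypothetical sample size

# Hypothetical models: the larger one has a slightly higher plain R-squared
adj_small = adjusted_r_squared(r2=0.80, n=n, p=2)
adj_big = adjusted_r_squared(r2=0.81, n=n, p=10)

print(round(adj_small, 3), round(adj_big, 3))
```

In this made-up case the model with ten predictors has the higher R² but the lower adjusted R², because the small gain in fit does not make up for the eight extra predictors.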

Question 12

Q

The slope parameter β̂ of a simple linear regression is +2. Interpret this number.

Answer

A

A beta of +2 means the regression line slopes upward: on average, Y increases by 2 units for every one-unit increase in X.

Question 13

Q

The slope parameter β̂ of a simple linear regression is -2. Interpret this number.

Answer

A

A beta of -2 means the regression line slopes downward: on average, Y decreases by 2 units for every one-unit increase in X.

Question 14

Q

An analysis relates the age of used cars (in years) to their price (in USD), using data on a
specific type of car. In a linear regression of the price on age, the slope parameter β̂ is -700.
Interpret the coefficient.

Answer

A

It means that, on average, the car loses 700 USD in value for every additional year of age.
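
The interpretation can be checked with a quick calculation. The slope of -700 comes from the question; the intercept of 20000 USD is a made-up value needed only to produce concrete predictions.

```python
def predicted_price(age_years, intercept=20000.0, slope=-700.0):
    """Predicted price in USD; the intercept is a hypothetical value."""
    return intercept + slope * age_years

price_3 = predicted_price(3)
price_4 = predicted_price(4)
difference = price_4 - price_3   # effect of one additional year of age

print(price_3, price_4, difference)
```

Whatever intercept is chosen, the predicted price for two cars that differ by one year of age always differs by the slope.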

Question 15

Q

An analysis relates the size of apartments (in square meters) to their price (in USD), using
data from one city. In a linear regression of the price on size, the slope parameter is +600.
Interpret the coefficient.

Answer

A

It means that, on average, the price grows by 600 USD for every additional square meter of size.

Question 16

Q

What is the advantage of using multiple regression model as compared to a simple
regression model?

Answer

A

One advantage is that multiple regression can include several independent variables for one dependent variable, whereas simple regression can include only one independent variable and one dependent variable. This lets us estimate the effect of each predictor while holding the others constant.

Question 17

Q

In regression diagnostics, what does the following figure tell us: Histogram of residuals.

Answer

A

It tells us the shape of the distribution of the residuals. Is it approximately normal? Is it skewed? Is it unimodal (1 peak) or bimodal (2 peaks)?

Question 18

Q

In regression diagnostics, what does the following figure tell us: Plot of residuals versus
predicted values of y.

Answer

A

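In this scatterplot the residuals should be scattered randomly around zero. Patterns in the plot reveal problems with the model: a curve suggests the relationship between the dependent and independent variables is not linear, and a funnel shape suggests the spread of the residuals is not constant.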

Question 19

Q

Explain Heteroskedasticity.

Answer

A

Heteroskedasticity means that the variance of the residuals is not constant across different levels of the independent variables. In simpler terms, the spread or dispersion of the residuals varies systematically as you move along the range of the predictors.
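
A rough, informal check (not a formal statistical test) is to compare the spread of the residuals for low and high fitted values. The numbers below are invented so that the spread clearly grows.

```python
from statistics import pvariance

# Hypothetical fitted values and residuals from some regression
fitted = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.1, -0.1, 0.2, -0.2, 1.0, -1.5, 2.0, -2.5]

# Order the residuals by fitted value and split them into two halves
pairs = sorted(zip(fitted, residuals))
half = len(pairs) // 2
low = [r for _, r in pairs[:half]]
high = [r for _, r in pairs[half:]]

var_low = pvariance(low)     # spread of residuals at low fitted values
var_high = pvariance(high)   # spread of residuals at high fitted values
ratio = var_high / var_low

print(var_low, var_high, ratio)
```

A ratio far from 1 hints that the spread is not constant; with homoskedastic residuals the two variances would be of similar size.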

Question 20

Q

What is multicollinearity? How can we address this issue?

Answer

A

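Multicollinearity occurs when there is a strong linear relationship between two or more predictors.
- It makes the individual coefficients difficult to interpret.
- To address it, remove or combine the highly correlated predictors.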
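
One common way to detect the problem is the variance inflation factor (VIF), which is not mentioned on the card and is used here as an assumption. With only two predictors it reduces to 1 / (1 - r²), where r is their correlation. The data are hypothetical.

```python
def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    r_sq = s12 ** 2 / (s11 * s22)   # squared correlation between the predictors
    return 1 / (1 - r_sq)

size_m2 = [1, 2, 3, 4, 5]               # hypothetical predictor
rooms = [2.1, 3.9, 6.2, 7.8, 10.1]      # hypothetical predictor, almost 2 * size_m2

vif = vif_two_predictors(size_m2, rooms)
print(vif)
```

A frequently quoted rule of thumb treats a VIF above roughly 10 as a warning sign, although the exact cutoff is a convention rather than a strict rule.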

Question 21

Q

How do you interpret regression coefficients in a logistic regression?

Answer

A

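It is different from linear regression because the outcome is binary (success/failure or yes/no). A coefficient is the change in the log-odds of the outcome for a one-unit increase in the predictor; exponentiating it gives the odds ratio.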
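
In a logistic regression, a coefficient is usually read on the odds scale by exponentiating it. A small sketch with made-up values for `alpha` and `beta`:

```python
from math import exp

def logistic(z):
    """Convert log-odds z into a probability between 0 and 1."""
    return 1 / (1 + exp(-z))

alpha, beta = -1.0, 0.7      # hypothetical intercept and coefficient

odds_ratio = exp(beta)       # multiplicative change in the odds per unit of X

# Check: compare the odds at X = 0 and X = 1
p0 = logistic(alpha)
p1 = logistic(alpha + beta)
odds0 = p0 / (1 - p0)
odds1 = p1 / (1 - p1)

print(odds_ratio, odds1 / odds0)
```

Both printed numbers agree: a one-unit increase in X multiplies the odds of the outcome by exp(beta), which is about 2 for this made-up coefficient.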

Question 22

Q

Provide an example of experimental design – what is the outcome of interest, explain
treatment and control groups in the study.

Answer

A

Experimental Design Example: Investigating the Impact of a New Drug on Blood Pressure

Outcome of Interest: The outcome of interest in this study is the blood pressure of individuals. Specifically, researchers want to determine if a new drug has an effect on blood pressure.

Treatment Group: In the experimental group (treatment group), participants are administered the new drug. They receive a specific dosage of the drug under controlled conditions.

Control Group: The control group consists of individuals who do not receive the new drug. Instead, they might receive a placebo (an inactive substance) or no treatment at all. The purpose of the control group is to provide a baseline for comparison. By comparing the blood pressure changes in the treatment group with those in the control group, researchers can assess whether any observed differences are due to the new drug or other factors.
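
The comparison between the two groups can be sketched as a difference in average blood pressure change. All numbers below are invented for illustration and do not come from any real study.

```python
from statistics import mean

# Hypothetical change in blood pressure (mmHg) per participant
treatment_change = [-12, -8, -10, -9, -11]   # received the new drug
control_change = [-2, 0, -3, -1, -4]         # received a placebo

# Estimated effect: how much more the treatment group changed than the control group
effect = mean(treatment_change) - mean(control_change)

print(mean(treatment_change), mean(control_change), effect)
```

In practice the difference would also be tested for statistical significance before concluding that the drug works.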

(22 cards)