Logistic Regression Flashcards

1
Q

Describe the DV in logistic regression

A

It is BINARY! It can take the value 0 or 1.

2
Q

What does logit stand for?

A

Log of odds
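As a small math sketch (the symbol p for the probability of y = 1 is generic notation, not from the card), the logit is:

```latex
\operatorname{logit}(p) = \ln\!\left(\frac{p}{1-p}\right), \qquad p = P(y = 1)
```

So a logit of 0 corresponds to odds of 1, i.e. a 50/50 probability.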

3
Q

What do you get if you plot the values of x against the probability of y becoming 1?

A

You will get the sigmoid function (s-curve)
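A minimal formula sketch of that curve (β₀ and β₁ are generic coefficient names, not taken from the card): the inverse of the logit is the sigmoid, so plotting the predicted probability against x gives the s-curve.

```latex
P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
```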

4
Q

Explain what an odds ratio above 1 and an odds ratio below 1 mean for a coefficient.

A

A one-unit increase in x multiplies the odds of y = 1 by the odds ratio: an odds ratio above 1 means the odds increase, an odds ratio below 1 means the odds decrease.
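A small worked example with made-up numbers, to make the above/below 1 reading concrete (OR denotes the odds ratio, i.e. the exponentiated coefficient):

```latex
\text{odds}(x + 1) = \text{OR} \times \text{odds}(x), \qquad \text{OR} = e^{\beta}
```

For example, OR = 1.5 means a one-unit increase in x multiplies the odds of y = 1 by 1.5 (a 50% increase in the odds); OR = 0.8 multiplies the odds by 0.8 (a 20% decrease).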

5
Q

Explain what maximum likelihood estimation is

A

Model estimation method:
- The betas are calculated from the observed values of x and y so that the predicted values of y get as close as possible to those actually occurring in the sample.
- Maximum likelihood estimation is used to fit and optimize our sigmoid curve (linear regression would instead use OLS, which tries to minimize the residuals).
- What SPSS does when it uses maximum likelihood: it draws a lot of different possible s-curves based on the observed values of X and Y, calculates the likelihood value for each, and chooses the curve with the highest (maximum) likelihood, so that the predicted values of Y get as close as possible to those in the sample. That chosen curve is our fitted model.
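A minimal sketch of the same idea outside SPSS, assuming Python with statsmodels and made-up example data (all variable names are hypothetical): the fit() call iteratively searches for the betas that maximize the log-likelihood, which corresponds to the "best s-curve" described above.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
x = rng.normal(size=200)                # one continuous predictor (made up)
p_true = 1 / (1 + np.exp(-(0.5 + 1.2 * x)))
y = rng.binomial(1, p_true)             # binary DV (0/1)

X = sm.add_constant(x)                  # intercept + predictor
model = sm.Logit(y, X).fit(disp=0)      # maximum likelihood estimation

print(model.params)                     # estimated betas (log-odds scale)
print(model.llf)                        # maximized log-likelihood of the fitted model
print(np.exp(model.params))             # odds ratios = exp(beta)
```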

6
Q

What do the differences between the observed and predicted values tell us?

A

They tell us about the misfit in our model (tested with the chi-square likelihood ratio test, the omnibus test).
- The log-likelihood expresses the difference between the observed and the predicted outcomes: the lower the value, the better the fit.
- Significance is checked with the chi-square value (which follows a chi-square distribution) together with the degrees of freedom.
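A minimal sketch in Python of how the log-likelihood is built from observed outcomes and predicted probabilities (the numbers are made up purely for illustration):

```python
import numpy as np

y_obs = np.array([1, 0, 1, 1, 0])            # observed outcomes
p_hat = np.array([0.8, 0.3, 0.6, 0.9, 0.2])  # predicted P(y = 1) from the model

# Sum of the log-probabilities the model assigns to the outcomes that actually occurred
log_lik = np.sum(y_obs * np.log(p_hat) + (1 - y_obs) * np.log(1 - p_hat))
deviance = -2 * log_lik                      # -2LL, used in the chi-square comparison of models

print(log_lik, deviance)                     # lower deviance = better fit
```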

7
Q

What are the assumptions of logistic regression?

A
1. Linearity in the logit → since we have an s-curve, this assumption is different from linear regression: logistic regression assumes a linear relationship between the continuous predictors and the logit transform of the DV.
   - We check for linearity between the logit of the outcome variable and each continuous predictor.
   - In SPSS we compute a new variable: we take the LN of the continuous variable and enter its interaction with the untransformed variable (x × ln(x)) into the logistic regression.
   - We then look at the significance level of the interaction effect: if it is significant, we cannot assume linearity, and we would need to eliminate the variable, include it in squared form, or use some other transformation. (A sketch of this check, and of the VIF check below, follows after this list.)
2. Random sampling → depends on how the data were gathered.
3. No perfect collinearity (absence of multicollinearity) → none of the IVs may be an exact linear combination of the others. Logistic regression is sensitive to extremely high correlations among predictor variables, signalled by exceedingly large standard errors for the parameter estimates. This is checked through a multiple linear regression where we look at the collinearity diagnostics:
   - Eigenvalues: if several eigenvalues are close to zero, the predictors are highly intercorrelated.
   - Condition index: values larger than 15 imply problems; values larger than 30 imply serious issues.
   - VIF (in the coefficients table): VIF expresses how much of the variance in an IV can be explained by the other IVs. Values between 1 and 10 indicate no multicollinearity problem; if we did have it, we should consider removing the affected variables.
   - From the lecture: "No perfect collinearity" is one of the regression assumptions and essentially means that you cannot add variables that are perfectly correlated with each other, i.e. there is an exact linear relationship between them (so no age in years and age in months in the same regression). Multicollinearity is a level below perfect collinearity, meaning there is a very high (but not perfect) correlation between variables. Multicollinearity is not ideal for getting reliable estimates of the coefficients and significance statistics for the affected variables. Therefore, if you notice a very high correlation between independent variables that are supposed to go into the same regression, you should be alerted to the possibility of multicollinearity and do further checks, such as the variance inflation factor.
   - Multicollinearity is not a violation of a core assumption, and it causes problems only for the affected variables: if two IVs are affected (the correlation between them is too high), nothing happens to the rest of the variables. If those two variables are control variables, and you just want to make sure you control for all systematic variance in the DV (remember max, min, con), you could simply leave them in the regression and note that you cannot interpret their coefficients, because they are not reliable. It is a much bigger problem if they are IVs of interest with hypotheses around them: you essentially cannot test your hypotheses, because you cannot separate the two variables. Multicollinearity could also be a sign of a deeper underlying problem: if two scales are highly correlated, they may have been poorly developed in the first place, and there may be problems with discriminant validity (check the CFA lecture).
   - To summarize: perfect collinearity is a deadly problem for a regression; multicollinearity is a problem but not a deadly one, depending on which variables are affected.
4 and 5. Independence of errors → no relation between the IVs and the error terms (an error term arises when the model does not fully represent the relationship between the DV and the IVs). If there were a relation between the IVs and the error term, we would have overdispersion, where the results would seem better/more significant than they really are.
   - Make sure to include all the IVs that are relevant for the model.
   - Symptoms: test statistics that are too big (everything becomes significant) and confidence intervals that are too small.
   - If there are problems: rescale the Wald standard errors for each parameter by a variance inflation factor, i.e. multiply the calculated standard error by (χ²/df), where χ² and df come from the deviance.
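Below is a minimal sketch (in Python with statsmodels rather than SPSS) of two of the checks described above: the x × ln(x) interaction check for linearity in the logit, and the VIF check for multicollinearity. The DataFrame, its column names, and the data are all made up for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Made-up example data: binary DV `y`, continuous predictors `x1`, `x2`
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "y":  rng.binomial(1, 0.5, 200),
    "x1": rng.uniform(1, 10, 200),   # must be > 0 to take the log
    "x2": rng.normal(5, 2, 200),
})

# 1) Linearity in the logit: add the x * ln(x) interaction term.
#    A significant interaction suggests the assumption is violated for that predictor.
df["x1_ln_x1"] = df["x1"] * np.log(df["x1"])
fit = sm.Logit(df["y"], sm.add_constant(df[["x1", "x2", "x1_ln_x1"]])).fit(disp=0)
print(fit.pvalues["x1_ln_x1"])

# 2) Multicollinearity: VIF for each predictor (computed with the constant included)
X = sm.add_constant(df[["x1", "x2"]])
for i, name in enumerate(X.columns):
    print(name, variance_inflation_factor(X.values, i))
```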
8
Q

Hosmer-Lemeshow test (judging the fit of the model)

A

It divides the cases into groups (usually ten, based on the deciles of the predicted probability) and compares the observed and expected frequencies in each group with a chi-square statistic.

If it is insignificant (p-value larger than 0.05), it indicates a good fit.

Besides Hosmer-Lemeshow we also have the Cox & Snell and Nagelkerke pseudo-R² values, which compare the deviance of the new model with the deviance of the baseline model: we would like both of these values to increase.
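As a sketch of what the Hosmer-Lemeshow test computes (a hand-rolled version, not SPSS output; `y` and `p_hat` stand for assumed arrays of observed outcomes and predicted probabilities):

```python
import pandas as pd
from scipy.stats import chi2

def hosmer_lemeshow(y, p_hat, groups=10):
    """Group cases by deciles of predicted probability and compare
    observed vs. expected events with a chi-square statistic."""
    d = pd.DataFrame({"y": y, "p": p_hat})
    d["decile"] = pd.qcut(d["p"], q=groups, duplicates="drop")
    g = d.groupby("decile", observed=True)
    obs_events = g["y"].sum()            # observed 1s per group
    n = g["y"].count()                   # cases per group
    exp_events = g["p"].sum()            # expected 1s per group
    stat = ((obs_events - exp_events) ** 2 / exp_events
            + ((n - obs_events) - (n - exp_events)) ** 2 / (n - exp_events)).sum()
    p_value = chi2.sf(stat, df=len(n) - 2)
    return stat, p_value                 # non-significant (p > 0.05) suggests acceptable fit
```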

9
Q

Explain what is going on here

A
10
Q

Explain the logic behind log-likelihood, deviance and omnibus test

A
  • Log-likelihood and deviance
    • The fit of the model is judged by a chi-square likelihood ratio test.
    • The log-likelihood is similar to the sum of squares in a normal regression: it expresses the difference between the probabilities of the observed and the predicted outcomes. The lower the value, the better the fit.
    • Deviance = -2 × log-likelihood; it follows a chi-square distribution.
    • Delta chi-square = (-2LL(old)) - (-2LL(new))
    • Delta df = k(new) - k(old), i.e. the number of added parameters.
    • We use the deviance to judge the fit; it is a measure of misfit (how much of the variance is unexplained). Higher deviance means less accuracy.
  • Omnibus test
    • A chi-square test of the difference in deviance between the old and the new model.
    • It tests whether the variables added last into the model improve its predictive power, and so helps us compare between models; we hope to see significant improvements.
    • We need to choose a significance level: with a 95% CI, the p-value needs to be below 0.05 to be significant. Essentially: is our chi-square value big enough to label the change as significant? If so, we can reject H0 and say that we have added a significant improvement to the model. (See the sketch after this list.)
  • Interpretation of the model summary: the real value of R² lies between the Cox & Snell and Nagelkerke values → when adding new variables we want these values to get larger = better model fit.
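A minimal sketch of the omnibus-style comparison in Python with statsmodels (made-up data and names; "old" is the baseline model, "new" adds one predictor):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=300), rng.normal(size=300)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.3 + 0.9 * x1 + 0.6 * x2))))

old = sm.Logit(y, sm.add_constant(x1)).fit(disp=0)                        # baseline model
new = sm.Logit(y, sm.add_constant(np.column_stack([x1, x2]))).fit(disp=0) # x2 added last

delta_chi2 = (-2 * old.llf) - (-2 * new.llf)   # (-2LL(old)) - (-2LL(new))
delta_df = new.df_model - old.df_model         # number of parameters added
p_value = chi2.sf(delta_chi2, delta_df)        # p < 0.05 => significant improvement
print(delta_chi2, delta_df, p_value)
```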