Logistic Regression Flashcards

1
Q

What is logistic regression used for?

A

Logistic regression can be used to analyze binary response as well as ordinal response data.

Binary:
- The response, Y, of a subject can take one of two possible values, denoted by 1 and 2 (for example, Y=1 if a disease is present; otherwise, Y=2).

Ordinal:
- The response, Y, of a subject can take one of m ordinal values, denoted by 1, 2, …, m.

2
Q

What does "infinite parameters" mean?

A

The term "infinite parameters" refers to the situation in which the likelihood equation does not have a finite solution.

3
Q

What is Complete Separation?

A

There is a complete separation of data points if there exists a vector b that correctly allocates all observations to their response groups.

The maximum likelihood estimate does not exist.
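A minimal sketch of why the estimate fails to exist under complete separation. It assumes a single predictor, no intercept, and a 0/1 coding of the response (the cards above code the groups as 1 and 2); the data and the function name `log_likelihood` are hypothetical. On separated data the log-likelihood keeps rising toward 0 as the coefficient grows, so no finite value maximizes it.

```python
import math

def log_likelihood(b, xs, ys):
    """Log-likelihood of the model P(y=1 | x) = 1 / (1 + exp(-b * x))."""
    total = 0.0
    for x, y in zip(xs, ys):
        # log P(observed y) = -log(1 + exp(-s)), with s = b*x for y=1 and -b*x for y=0
        s = b * x if y == 1 else -b * x
        total += -math.log1p(math.exp(-s))
    return total

# Hypothetical, completely separated data: x < 0 always has y = 0, x > 0 always has y = 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

# The log-likelihood increases with b and approaches, but never reaches, 0.
values = [log_likelihood(b, xs, ys) for b in (1, 5, 10, 20)]
for b, value in zip((1, 5, 10, 20), values):
    print(b, value)
```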

4
Q

What is Quasi-complete Separation?

A

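There is quasi-complete separation if the data are not completely separable but a vector b exists such that b'x ≥ 0 for every subject in one response group and b'x ≤ 0 for every subject in the other, with equality holding for at least one subject in each response group.

The maximum likelihood estimate does not exist.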

5
Q

What is Overlap?

A

If neither complete nor quasi-complete separation exists in the sample points, there is an overlap of sample points.

The maximum likelihood estimate exists and is unique.

6
Q

When are complete separation and quasi-complete separation typically a problem?

A

Complete separation and quasi-complete separation are problems typical of small samples.

7
Q

What is logistic regression?

A

Logistic regression allows one to predict a discrete outcome such as group membership from a set of variables that may be continuous, discrete, dichotomous, or a mix.
Logistic regression emphasizes the probability of a particular outcome for each case.
The procedure for estimating coefficients is maximum likelihood, and the goal is to find the best linear combination of predictors to maximize the likelihood of obtaining the observed outcome frequencies.
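As a rough illustration of maximum likelihood estimation, the sketch below fits an intercept and one slope by plain gradient ascent on the log-likelihood. The data, the 0/1 coding, the learning rate, and the names `sigmoid`, `score`, and `fit_logistic` are all assumptions made for this example; statistical packages usually use Newton-type iterations instead.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(a, b, xs, ys):
    """Gradient of the log-likelihood with respect to intercept a and slope b."""
    grad_a = sum(y - sigmoid(a + b * x) for x, y in zip(xs, ys))
    grad_b = sum((y - sigmoid(a + b * x)) * x for x, y in zip(xs, ys))
    return grad_a, grad_b

def fit_logistic(xs, ys, lr=0.1, steps=20000):
    """Climb the log-likelihood until the gradient is (numerically) zero."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a, grad_b = score(a, b, xs, ys)
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

# Hypothetical data with overlap between the groups, so a finite estimate exists.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

a_hat, b_hat = fit_logistic(xs, ys)
print(a_hat, b_hat)
```

At the fitted values both components of the gradient are close to zero, which is the likelihood equation being satisfied.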

8
Q

What assumptions are not relevant in logistic regression?

A

In logistic regression, the predictors do not have to be normally distributed, linearly related to the dependent variable (DV), or of equal variance within each group.

9
Q

Mention the different types of logistic regression.

A

There are three major types of logistic regression: direct (standard), sequential, and statistical.

10
Q

Explain direct (standard) logistic regression

A

All predictors enter the equation simultaneously (as long as tolerance is not violated).

This is the method of choice if there are no specific hypotheses about the order or importance of predictor variables.

The method allows evaluation of the contribution made by each predictor over and above that of the other predictors.

This method can be difficult to interpret when predictors are correlated. A predictor that is highly correlated with the outcome by itself may show little predictive capability in the presence of the other predictors.

11
Q

Explain sequential logistic regression.

A

The researcher specifies the order of entry of predictors into the model.

12
Q

Explain statistical logistic regression.

A

Inclusion and removal of predictors from the equation are based solely on statistical criteria.
When statistical analyses are used, it is very easy to misinterpret the exclusion of a predictor; the predictor may be very highly correlated with the outcome but not included in the equation because it was “bumped” out by another predictor or by a combination of predictors.

13
Q

How do you interpret coefficients using odds?

A

The odds ratio is the change in odds of being in one of the categories of outcome when the value of a predictor increases by one unit.

The coefficients, B, for the predictors are the natural logs of the odds ratios; odds ratio = e^B.
Therefore, a change of one unit on the part of a predictor multiplies the odds by e^B.
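The relationship between B and the odds ratio can be checked numerically. The sketch below assumes a model with one predictor; the intercept and coefficient values and the helper names are hypothetical, chosen only for illustration.

```python
import math

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

def predicted_probability(intercept, coef, x):
    """P(outcome) under a one-predictor logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + coef * x)))

# Hypothetical fitted values, not taken from any real analysis.
intercept, coef = -1.5, 0.7

odds_ratio = math.exp(coef)

# Ratio of the odds at x = 4 to the odds at x = 3 (a one-unit increase).
odds_before = odds(predicted_probability(intercept, coef, 3.0))
odds_after = odds(predicted_probability(intercept, coef, 4.0))
ratio_for_one_unit = odds_after / odds_before

print(odds_ratio, ratio_for_one_unit)  # the two values agree up to rounding
```

The same ratio is obtained for any starting value of x, which is why a single odds ratio summarizes the effect of the predictor.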

14
Q

What is the dependent variable expressed as in logistic regression?

A

The natural log of the odds (the logit): the probability of being in one group (Y=1) divided by the probability of being in the other group (Y=0).
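Written out as a formula, with p denoting the probability of being in the group whose odds are modeled and 1 − p the probability of being in the other group. The symbols B_0 (intercept), X_1, …, X_k (predictors), and k are notation assumed here for illustration:

```latex
\ln\!\left(\frac{p}{1-p}\right) = B_0 + B_1 X_1 + \dots + B_k X_k,
\qquad
p = \frac{1}{1 + e^{-(B_0 + B_1 X_1 + \dots + B_k X_k)}}
```

The left-hand equation is linear in the predictors; solving it for p gives the right-hand equation, which keeps the predicted probability between 0 and 1.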

15
Q

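What does the coefficient stand for in logistic regression?

A

The change in the logit (log of the odds) of the outcome variable for a one-unit increase in the predictor, with the other predictors held constant.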

16
Q

How many observations should at least be in each group?

A

In each "group" (cell), the expected frequency should be at least 1, and no more than 20% of cells should have an expected frequency below 5.

If there are too few observations in the groups, the statistical tests have little power.

17
Q

How is significance assessed in logistic regression?

A

Use the Wald test: each coefficient is divided by its standard error to give a z-statistic.
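A minimal sketch of the Wald test for a single coefficient, assuming the estimate and its standard error are already available from a fitted model. The function name `wald_test` and the numbers are hypothetical; the two-sided p-value uses the standard normal distribution.

```python
import math

def wald_test(coef, std_err):
    """Return the Wald z-statistic and its two-sided p-value."""
    z = coef / std_err
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical coefficient and standard error.
z, p_value = wald_test(coef=0.7, std_err=0.25)
print(z, p_value)
```

Some packages report the squared statistic, z², and refer it to a chi-square distribution with one degree of freedom; the p-value is the same.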

18
Q

How is beta interpreted in logistic regression?

A

The exponential of the beta coefficient is the odds ratio: the change in odds resulting from a one-unit change in the predictor.

So we interpret how the odds of Y=1 change after a one-unit increase in the independent variable (IV).

19
Q

How is the fit of the model evaluated in logistic regression?

A

The fit of the model is judged by the chi-square likelihood ratio test, which is based on the -2 log-likelihood (-2LL). The -2LL expresses the difference between the observed outcomes and the predicted probabilities: the lower the value, the better the fit.

Pseudo-R^2 measures are also used; higher values are better:
- Hosmer and Lemeshow
- Cox and Snell
- Nagelkerke
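The three measures can be computed from the log-likelihoods of the intercept-only (null) model and the fitted model. The sketch below uses the commonly cited formulas; the function name `pseudo_r2`, its arguments, and the example log-likelihoods are assumptions for illustration, not output from a real analysis.

```python
import math

def pseudo_r2(ll_null, ll_model, n):
    """Likelihood ratio chi-square and three pseudo-R^2 measures.

    ll_null  : log-likelihood of the intercept-only model
    ll_model : log-likelihood of the fitted model
    n        : number of observations
    """
    dev_null = -2.0 * ll_null    # -2LL of the null model
    dev_model = -2.0 * ll_model  # -2LL of the fitted model
    lr_chi2 = dev_null - dev_model
    cox_snell = 1.0 - math.exp(-lr_chi2 / n)
    # Nagelkerke rescales Cox and Snell so that its maximum is 1.
    nagelkerke = cox_snell / (1.0 - math.exp(-dev_null / n))
    return {
        "lr_chi2": lr_chi2,
        "hosmer_lemeshow": lr_chi2 / dev_null,
        "cox_snell": cox_snell,
        "nagelkerke": nagelkerke,
    }

# Hypothetical log-likelihoods for a sample of 100 cases.
result = pseudo_r2(ll_null=-68.0, ll_model=-55.0, n=100)
print(result)
```

The likelihood ratio chi-square is compared with a chi-square distribution whose degrees of freedom equal the number of predictors added to the null model.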

20
Q

What are the assumptions in logistic regression?

A
  • Independence of errors
  • Linearity in the logit
  • Random sampling
  • No perfect collinearity/multicollinearity
  • Adequate expected frequencies
21
Q

What are some potential problems in logistic regression?

A
  • Incomplete information from the predictors (not enough observations in each subgroup)
  • Complete separation
22
Q

What does the odds ratio really stand for?

A

Odds: the probability of an event occurring divided by the probability of the event not occurring.

Odds ratio: the odds after the predictor has changed by one unit divided by the odds before the change.