Regression Flashcards

1
Q

What defines the prediction interval?

A

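An interval in which a new observation is expected to fall with a stated level of confidence (for instance 95%). It covers both the uncertainty in the estimated mean and the scatter of individual observations around that mean.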

2
Q

What defines the confidence interval?

A

An interval around the mean response predicted by our model in which the true mean is expected to lie with a stated level of confidence (for instance 95%).
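
As a rough illustration of how the two intervals differ, the sketch below fits a simple linear regression and computes both at one point. The data, the function names `fit_line` and `intervals`, and the use of a normal quantile in place of the exact t quantile are all assumptions made for this example.

```python
import math
from statistics import NormalDist, mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b * x_bar, b


def intervals(x, y, x0, level=0.95):
    """Return (confidence interval, prediction interval) at x0."""
    n = len(x)
    a, b = fit_line(x, y)
    x_bar = mean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))  # residual standard error
    # Assumption: normal quantile instead of the t quantile with n - 2 df.
    q = NormalDist().inv_cdf(0.5 + level / 2)
    y0 = a + b * x0
    # Uncertainty of the estimated mean at x0.
    se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    # Same, plus the scatter of a single new observation (the extra "1").
    se_new = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    return (y0 - q * se_mean, y0 + q * se_mean), (y0 - q * se_new, y0 + q * se_new)


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
ci, pi = intervals(x, y, x0=4.5)
print("confidence interval:", ci)
print("prediction interval:", pi)
```

Both intervals are centred on the same fitted value, but the prediction interval is always the wider one because of the extra variance term for a new observation.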

3
Q

What is the difference between paired and unpaired group means?

A

With paired data, each observation in one group is matched to a specific observation in the other, usually because the same subjects are measured twice (for example, testing the same students on their scores at the beginning and end of the year). With unpaired data, the two groups may look similar but have no such overlap (for example, when researching recovery time, different patients are tested on different treatments, in different countries).
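
A small sketch of why the distinction matters for the test statistic. The scores are made up, and `paired_t` and `welch_t` are names chosen for this example; the unpaired version uses Welch's form, which is one common choice.

```python
import math
from statistics import mean, stdev, variance


def paired_t(before, after):
    """t statistic for paired data: a one-sample test on the differences."""
    d = [a - b for a, b in zip(after, before)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


def welch_t(group_a, group_b):
    """t statistic for unpaired data (Welch's form, unequal variances)."""
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_b) - mean(group_a)) / se


# Made-up scores of the same six students at the start and end of the year.
before = [62, 70, 55, 81, 66, 74]
after = [65, 74, 57, 85, 70, 77]

print("paired t:  ", paired_t(before, after))
print("unpaired t:", welch_t(before, after))
```

On these numbers the paired statistic is much larger than the unpaired one: pairing removes the large differences between students, leaving only the small and consistent change within each student.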

4
Q

What is Analysis of Variance? (ANOVA)

A

When comparing multiple groups in a study, ANOVA splits the total variance of the response into parts and tells us how much of it is explained by each factor and how much is left unexplained.

5
Q

What value of ANOVA indicates a high influence of a factor on the response?

A

In the ANOVA table, a Sum Sq value that is large relative to the residual sum of squares, and with it a large F-value (and a small p-value), indicates this.
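
To make the table entries concrete, here is a minimal sketch of a one-way ANOVA computed by hand. The groups are made-up numbers and `one_way_anova` is a name chosen for this example; it returns the between-group and within-group sums of squares and the F-value, without a p-value.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (ss_between, ss_within, f_value) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation explained by the factor: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Unexplained variation: observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_value


# Made-up responses for three levels of one factor.
groups = [
    [4.1, 3.9, 4.3, 4.0],
    [5.2, 5.0, 5.4, 5.1],
    [6.8, 7.1, 6.9, 7.2],
]
ss_b, ss_w, f = one_way_anova(groups)
print("Sum Sq (factor):   ", ss_b)
print("Sum Sq (residuals):", ss_w)
print("F value:           ", f)
```

The two sums of squares add up to the total sum of squares of the response, which is the sense in which ANOVA partitions the variance.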

6
Q

When is ANOVA useful?

A

When investigating how strongly each factor influences the response in a prediction model with multiple variables.

7
Q

In model selection, what is forward selection?

A

First, fit the null model containing only the intercept.

Then fit p separate models by adding each of the predictors individually.

Keep the model with the lowest RSS (or, equivalently, the highest R²).

Repeat, each time adding one of the remaining predictors to the current model, until some stopping requirement is met.
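
The steps above can be sketched as follows. This is only an illustration: the data are made up, the helper names `solve`, `rss` and `forward_selection` are hypothetical, and the stopping requirement (stop when the best candidate lowers the RSS by less than 5% of the null-model RSS) is one assumed choice among many.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(columns, y):
    """RSS of a least-squares fit on an intercept plus the given columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def forward_selection(predictors, y, min_gain=0.05):
    """Greedy forward selection on RSS; returns the chosen predictor names."""
    selected, remaining = [], list(predictors)
    null_rss = rss([], y)  # null model: intercept only
    current = null_rss
    while remaining:
        # Fit one model per remaining predictor, added to the current model.
        scores = {name: rss([predictors[n] for n in selected + [name]], y)
                  for name in remaining}
        best = min(scores, key=scores.get)
        # Assumed stopping requirement: the gain is too small to matter.
        if current - scores[best] < min_gain * null_rss:
            break
        selected.append(best)
        remaining.remove(best)
        current = scores[best]
    return selected


# Made-up data: y depends on x1 and x2 but not on x3.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [5, 1, 4, 2, 8, 3, 7, 6],
    "x3": [1, 0, 1, 0, 0, 1, 1, 0],
}
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05]
y = [3 + 2 * a - 1.5 * b + e
     for a, b, e in zip(predictors["x1"], predictors["x2"], noise)]

print(forward_selection(predictors, y))
```

The RSS can only go down when a predictor is added, so some stopping requirement is always needed; otherwise the procedure ends at the full model.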

8
Q

In model selection, what is backward selection?

A

Fit the maximal model with all p predictors.

Remove the predictors that meet a removal requirement (for instance, a p-value higher than the significance level), typically one at a time, starting with the least significant.

Fit the new (reduced) model and continue until some stopping condition is met.

9
Q

What is the generalized linear model (GLM)?

A

It is a model type that does not assume the response to be Gaussian. Categorical responses (binary or ordinal, for instance) are not Gaussian.

10
Q

What two components does the GLM introduce and how are they written in formula form?

A

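The link function g and the response distribution D. In formula form: Y ~ D(mu), with g(mu) = b0 + b1*x1 + ... + bp*xp, where mu is the expected value of Y.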

11
Q

What does the link function look like, and between which domains does it map?

A

It maps the response domain [0, 1] to the domain [-inf, inf]. In logistic regression the link is the logit, g(p) = log(p / (1 - p)).
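
A minimal sketch of this mapping for the logit link, with its inverse. The function names `logit` and `inv_logit` are chosen for this example.

```python
import math


def logit(p):
    """Link function: maps a probability strictly between 0 and 1 to the real line."""
    return math.log(p / (1 - p))


def inv_logit(eta):
    """Inverse link (logistic function): maps a real number back to a probability."""
    return 1 / (1 + math.exp(-eta))


# Probabilities near 0 map to large negative values, near 1 to large positive ones.
for p in (0.01, 0.25, 0.5, 0.75, 0.99):
    eta = logit(p)
    print(p, "->", round(eta, 3), "->", round(inv_logit(eta), 3))
```

Applying `inv_logit` after `logit` returns the original probability, which is what lets a model fitted in the link space be read back in the response space.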

12
Q

What is the link space?

A

In logistic regression (which is a GLM), it is the [-inf, inf] domain in which the linear predictor lives.

13
Q

What is the response space?

A

In logistic regression (which is a GLM), it is the [0, 1] domain in which the predicted probabilities live.

14
Q

What is the Logistic Model?

A

It is derived from the GLM by choosing g(Y) = logit(Y). So basically a logit (log-odds) transformation on the response of a (generalized) linear model.

15
Q

In logistic regression, in what space do we estimate our parameters?

A

We estimate the parameters in the link space: we build a linear model there, and then transform its predictions back into the response space with the inverse of the logit function.
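
A rough sketch of this idea with one predictor. The data are made up, the names `fit_logistic` and `inv_logit` are chosen for this example, and plain gradient ascent on the log-likelihood stands in for the iterative fitting routines that statistical software normally uses.

```python
import math


def inv_logit(eta):
    """Map a value from the link space back to a probability."""
    return 1 / (1 + math.exp(-eta))


def fit_logistic(x, y, lr=0.1, steps=5000):
    """Estimate b0, b1 in logit(p) = b0 + b1*x by gradient ascent."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        # Linear model in the link space, mapped back to probabilities.
        p = [inv_logit(b0 + b1 * xi) for xi in x]
        # Gradient of the average log-likelihood with respect to b0 and b1.
        g0 = sum(yi - pi for yi, pi in zip(y, p)) / n
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p)) / n
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1


# Made-up data: hours studied and whether the exam was passed (1) or not (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(hours, passed)
for h in (1.0, 3.0, 5.0):
    eta = b0 + b1 * h  # prediction in the link space
    print(h, "link:", round(eta, 2), "response:", round(inv_logit(eta), 2))
```

The coefficients `b0` and `b1` describe a straight line in the link space; only after applying `inv_logit` do the predictions become probabilities between 0 and 1.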

16
Q

What are Generalized Additive Models?

A

They are similar to GLMs, except that each predictor can now enter the model nonlinearly: the linear term for each predictor is replaced by a smooth function of that predictor.