Week 5 Flashcards

(26 cards)

1
Q

What kinds of questions can regression answer?

A

How do systems work? Value of a home run, effect of economic factors on presidential elections, impact of education on income, key factors in car purchasing
What will happen in the future? How tall a child will be, oil price prediction, remaining lifetime of a person purchasing life insurance

2
Q

Simple regression equation

A

y = a0 + a1x1
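A minimal sketch of fitting this line by least squares in Python; the data here are made-up values for illustration, not from the cards.

```python
import numpy as np

# made-up example data (illustration only)
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.0, 9.9])

# least-squares fit of y = a0 + a1*x1
a1, a0 = np.polyfit(x1, y, deg=1)   # polyfit returns the highest-degree coefficient first
print(f"a0 = {a0:.3f}, a1 = {a1:.3f}")
```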

3
Q

How to measure the quality of a line’s fit

A

Minimize sum of squared errors
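A quick sketch of the quantity being minimized, on made-up data and a candidate line (values are assumptions for illustration):

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # made-up data
y = np.array([2.1, 4.3, 6.2, 8.0, 9.9])
a0, a1 = 0.1, 2.0                           # candidate line y = a0 + a1*x1

# sum of squared errors between the observations and the line's predictions
sse = np.sum((y - (a0 + a1 * x1)) ** 2)
print(f"SSE = {sse:.3f}")
```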

4
Q

Most basic measure of quality

A

maximum likelihood

5
Q

MLE

A

the set of parameters most likely to have generated the data; for regression with normally distributed errors, this is the set that minimizes the sum of squared errors

6
Q

How can likelihood be used?

A

to compare two models' performance

7
Q

Commonly used functions for comparing models

A

AIC (Akaike Information Criterion)
BIC (Bayesian Information Criterion)
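A hedged sketch of the standard formulas, AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L); the log-likelihood, k, and n below are made-up placeholders.

```python
import math

def aic(log_likelihood: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2*ln(L)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian Information Criterion: k*ln(n) - 2*ln(L)."""
    return k * math.log(n) - 2 * log_likelihood

# made-up example values
ll, k, n = -120.5, 4, 200
print(f"AIC = {aic(ll, k):.1f}, BIC = {bic(ll, k, n):.1f}")
```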

8
Q

What values to prefer for AIC and BIC

A

AIC: prefer smaller values
BIC: prefer smaller values

9
Q

Making AIC smaller encourages

A

fewer parameters k and higher likelihood

10
Q

AIC has nice properties if

A

there are infinitely many data points

11
Q

How to deal with AIC on smaller datasets

A

AICc (corrected AIC for small samples)
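A sketch of the small-sample correction, AICc = AIC + 2k(k+1)/(n - k - 1); the inputs below are placeholders.

```python
def aic_c(aic: float, k: int, n: int) -> float:
    """Corrected AIC for small samples: AIC + 2k(k+1)/(n - k - 1)."""
    return aic + (2 * k * (k + 1)) / (n - k - 1)

print(aic_c(250.0, 4, 30))   # made-up AIC, k, n
```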

12
Q

Use BIC when there is a lot more data than parameters. (T/F)

A

T

13
Q

Rule of thumb for BIC

A

|BIC1 - BIC2| > 10: smaller BIC model is very likely better
6 < |BIC1 - BIC2| < 10: smaller BIC model is likely better
2 < |BIC1 - BIC2| < 6: smaller BIC model is somewhat likely better
0 < |BIC1 - BIC2| < 2: smaller BIC model is slightly likely better
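A small sketch encoding this rule of thumb (thresholds as listed above; the example BIC values are made up):

```python
def compare_bic(bic1: float, bic2: float) -> str:
    """Interpret the gap between two models' BIC values."""
    diff = abs(bic1 - bic2)
    better = "model 1" if bic1 < bic2 else "model 2"
    if diff > 10:
        strength = "very likely"
    elif diff > 6:
        strength = "likely"
    elif diff > 2:
        strength = "somewhat likely"
    else:
        strength = "slightly likely"
    return f"{better} (smaller BIC) is {strength} better"

print(compare_bic(512.3, 498.7))   # made-up BIC values
```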

14
Q

AIC takes what kind of point of view?

A

Frequentist

15
Q

Regression can’t answer what kind of problem

A

prescriptive

16
Q

causation

A

one thing causes another thing

17
Q

correlation

A

two things tend to happen (or not happen) together; it may be that neither causes the other.

18
Q

p value

A

estimates the probability that the coefficient might really be 0.

If the p value > 0.05, remove the corresponding attribute from the model.
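A hedged sketch of reading coefficient p values from a fitted model; the use of statsmodels OLS and the simulated data are assumptions for illustration, not part of the cards.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)                    # attribute related to the response (made-up)
x2 = rng.normal(size=n)                    # unrelated attribute (made-up)
y = 3.0 + 2.0 * x1 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

# flag attributes whose p value exceeds 0.05 as candidates for removal
for name, p in zip(["const", "x1", "x2"], fit.pvalues):
    flag = "keep" if p <= 0.05 else "consider removing"
    print(f"{name}: p = {p:.3f} -> {flag}")
```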

19
Q

Other p value thresholds

A

Higher thresholds: more factors can be included, with the possibility of including an irrelevant factor.

Lower thresholds: fewer factors can be included, with the possibility of leaving out a relevant factor.

20
Q

Warnings about p values

A

With large amounts of data, p values can get small even when attributes are not related to the response.

P values are only probabilities, even when they are meaningful.

21
Q

Confidence interval

A

Where the coefficient probably lies and how close it is to 0
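A sketch of a 95% confidence interval for a coefficient, using coef ± t-critical × standard error; the coefficient, standard error, and degrees of freedom are placeholders.

```python
from scipy import stats

coef, std_err, dof = 1.8, 0.6, 96              # made-up estimate, standard error, df
t_crit = stats.t.ppf(0.975, dof)               # two-sided 95% critical value
lower, upper = coef - t_crit * std_err, coef + t_crit * std_err
print(f"95% CI: ({lower:.2f}, {upper:.2f})")   # is the interval close to 0?
```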

22
Q

T statistic

A

the coefficient divided by its standard error
related to p value
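A minimal sketch of the relationship, t = coefficient / standard error, converted to a two-sided p value from the t distribution (the inputs are placeholders):

```python
from scipy import stats

coef, std_err, dof = 1.8, 0.6, 96            # made-up coefficient, standard error, df
t_stat = coef / std_err
p_value = 2 * stats.t.sf(abs(t_stat), dof)   # two-sided p value
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```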

23
Q

Coefficient

A

Even with a very low p value, a coefficient may be so small that, multiplied by the attribute value, it makes little practical difference.

24
Q

R squared

A

estimates how much variability your model accounts for.

An R squared value of 59% means the model accounts for 59% of the variability in the data; the remaining 41% is randomness or other factors.
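A sketch of the calculation, R squared = 1 - SS_res / SS_tot, on made-up observations and predictions:

```python
import numpy as np

y = np.array([3.1, 4.0, 5.2, 6.1, 6.9])        # observed values (made-up)
y_hat = np.array([3.0, 4.2, 5.0, 6.0, 7.1])    # model predictions (made-up)

ss_res = np.sum((y - y_hat) ** 2)              # variability the model leaves unexplained
ss_tot = np.sum((y - y.mean()) ** 2)           # total variability around the mean
r_squared = 1 - ss_res / ss_tot
print(f"R^2 = {r_squared:.3f}")
```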

25
Q

Adjusted R squared

A

adjusts R squared for the number of attributes used
26