Module 3 - Data, Learning, Systematic Relationships Flashcards by Avery G-

2 approaches to finding a systematic relationship

Graphical
Quantitative

Use engineering judgement to ask:

Should there be a relationship?

Scatterplot plotting process

Plot values of one variable against another
Look for a trend in the data (nature of trend: linear? exponential? quadratic?; degree of scatter: scatter that changes over the range indicates that the variance isn’t constant)

In scatterplots, look for…

Trend between “independent” variables and dependent variables
Trend between supposedly independent variables (indicates these quantities may be correlated: codependencies, important when there are multiple x variables)
Correlation among the x variables can produce poor model estimation results

T/F: If scatterplot data are arbitrarily placed on the graph, the experiment was designed and thought out

F: trends and patterns indicate designed experiment

T/F: Use square graphs to put model on comparable basis

Covariance

Expected value, over the joint distribution of X and Y, of the product of their deviations from the means:

Cov(X,Y) = E{(X - muX)(Y - muY)}

Sign of covariance

Indicates sign of slope of systematic LINEAR relationship

T/F: Correlation and covariance measure non-linear relationships

F: they measure linear relationships

Correlation

Dimensionless covariance:

Corr(X,Y) = rho(X,Y) = Cov(X,Y)/(sigmaX*sigmaY)

Properties of correlation

Dimensionless (no units)
Range: -1 <= rho(X,Y) <= 1
Close to -1 = strong linear relationship with negative slope
Close to 1 = strong linear relationship with positive slope

T/F: Correlation gives NO info abt actual numerical value of slope

If we have N pairs of observations of X and Y values… (covariance & correlation)

Sample covariance:

R = 1/(N-1) * SUM (Xi - Xbar)(Yi - Ybar)

Sample correlation:

r = R/(sX*sY), where sX and sY are the sample standard deviations of X and Y
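
The sample covariance and sample correlation formulas can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `sample_cov` and `sample_corr` and the data values are hypothetical, chosen only to show the arithmetic.

```python
import math

def sample_cov(xs, ys):
    """Sample covariance: R = 1/(N-1) * SUM (Xi - Xbar)(Yi - Ybar)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with N >= 2")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Sample correlation: r = R / (sX * sY)."""
    # The sample variance of X is the sample covariance of X with itself.
    s_x = math.sqrt(sample_cov(xs, xs))
    s_y = math.sqrt(sample_cov(ys, ys))
    return sample_cov(xs, ys) / (s_x * s_y)

# Hypothetical data: two exactly linear relationships with different slopes.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_shallow = [2.0 * xi for xi in x]   # slope 2
y_steep = [10.0 * xi for xi in x]    # slope 10

print(sample_cov(x, y_shallow), sample_cov(x, y_steep))
print(sample_corr(x, y_shallow), sample_corr(x, y_steep))
```

In this made-up example the two covariances differ because they carry the units and scale of the data, while both correlations come out at essentially 1, since each relationship is perfectly linear with positive slope.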

Sample correlation and covariance characteristics

Random fluctuations in data will produce random fluctuations in computed values
Computed values are themselves random variables
They are estimates of the true covariance and correlation
Work with the values as guides, without computing confidence intervals

Don’t assume ________ equals _________.

Correlation

Causality

Rule of thumb for standard normal random variable

About 95% of values of a Normal histogram occur within +/- 2 st. devs. of the mean
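
The rule of thumb can be checked numerically. The sketch below uses the standard relationship between the Normal distribution and the error function; the helper name `prob_within` is hypothetical.

```python
import math

def prob_within(k):
    """P(|Z| < k) for a standard Normal Z, via the error function."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within +/- {k} st. devs.: {prob_within(k):.4f}")
```

For k = 2 this gives roughly 0.9545, which is why "95%" is a rule of thumb rather than an exact figure.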

Another name for bivariate distribution

Joint distribution (betw 2 variables)

T/F: Joint distributions will have covariance matrix, and diagonals are equal to variance of X (top left) and Y (bottom right)

What happens to bivariate distribution as correlation increases?

Distribution is stretched along the X = Y line; contours become more elliptical

T/F: If X and Y are not strongly correlated, the distribution will be stretched

F: more circular, less of a trend

T/F: The multivariate Normal distribution describes the frequency with which vectors of values X1, Y1, X2, Y2,… Xn, Yn occur

F: X1, X2, X3… Xn

Bivariate UNIFORM probability distribution

Takes non-zero values over a certain interval: chance of getting a value in the interval is the same everywhere (contour is a single square)

Linear model

Linear in parameter(s)

Distinguish lin’r from nonlin’r regression models

Take the first derivative with respect to each parameter: if any derivative still depends on the parameters, the model is nonlinear; if none does, the model is linear
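
As a worked illustration of the derivative test, consider two hypothetical models. The parameter symbols beta_0, beta_1, beta_2 and both model forms are assumptions chosen for the example, not taken from the cards.

```latex
% Linear in the parameters: no derivative contains a parameter
y = \beta_0 + \beta_1 x + \beta_2 x^2, \qquad
\frac{\partial y}{\partial \beta_0} = 1, \quad
\frac{\partial y}{\partial \beta_1} = x, \quad
\frac{\partial y}{\partial \beta_2} = x^2

% Nonlinear in the parameters: the derivatives still contain \beta_0 or \beta_1
y = \beta_0 e^{\beta_1 x}, \qquad
\frac{\partial y}{\partial \beta_0} = e^{\beta_1 x}, \quad
\frac{\partial y}{\partial \beta_1} = \beta_0 x e^{\beta_1 x}
```

The first model is quadratic in x yet still a linear model, because linearity refers to the parameters, not to the explanatory variable.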

Fundamental framework

There is always:
Fundamental behaviour (deterministic)
Little bit of random noise

For the linear model, the observations vector/matrix/table form rows and columns are:

ROWS: # runs
COLUMNS (in matrix): variables by which we're evaluating response (i.e. T, V, P)

Least Squares Estimation

Minimize the sum of squared prediction errors (i.e. the squared vertical distances between the model prediction (line) and the observed values)

Residual eqn

epsilon = y - y(hat)
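
Least squares estimation and the residual equation can be sketched for a straight-line model. This is a minimal example assuming a model of the form y = b0 + b1*x; the function name `fit_line`, the symbols b0 and b1, and the data values are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares estimates (intercept, slope) for y = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with N >= 2")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # minimizes the sum of squared errors
    intercept = ybar - slope * xbar    # fitted line passes through (Xbar, Ybar)
    return intercept, slope

# Hypothetical observations (made-up numbers for illustration only).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 7.1, 8.8]

b0, b1 = fit_line(x, y)
y_hat = [b0 + b1 * xi for xi in x]                 # model predictions
residuals = [yi - yh for yi, yh in zip(y, y_hat)]  # y - y(hat)
sse = sum(e ** 2 for e in residuals)               # quantity being minimized

print(b0, b1, sse)
```

Note that the intercept is computed from the slope, so the two estimates are tied together whenever Xbar is not zero.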

T/F: For lin'r model, values of slope and intercept estimate will NEVER depend on each other

F: they will often

Assumptions of LSE

1. Values of explanatory variables (x's) are known exactly
2. Model eqn form provides an adequate representation of the data ("model is correct")
3. Noise variance is constant over the range of data collected
4. Noise in each observation is statistically independent from noise in other observations
5. Typically assume noise is Normally distributed

Module 3 - Data, Learning, Systematic Relationships Flashcards

(30 cards)