Lecture 8 Flashcards

1
Q

What is correlation?

A

A measure of the strength of the relationship or association between two variables.

2
Q

How do we know if a statistical association between X and Y exists?

A

Variability in one variable is related to (affects, causes, or overlaps with) variability in the second variable.
Therefore, one variable can be used to explain SOME portion of the variability of the other variable.
In other words, having information on one variable decreases the unexplained variability of the other variable.

3
Q

What is a less efficient way of predicting scores?

A

The predicted score for the ith person is simply the mean of Y, ignoring any other information.

There is a lot of uncertainty in this estimate.

4
Q

If a relationship between X and Y exists…

A

we can use the information about X to decrease the uncertainty in our prediction, Yhat_i.

5
Q

How do we use the information about X to decrease the uncertainty in our prediction, Yhat_i?

A

First, we group the Y scores according to the X values.

Next, we could use the mean of the raw scores (Y) in each treatment group (X) to predict Y_ia.

i.e. use Ybar_a (the group mean) as an estimate of the raw score of a person in that group.
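
A minimal Python sketch of this grouping idea, using made-up scores (the group labels a1 to a3 and all numbers are hypothetical, not from the lecture). It compares the squared prediction error when everyone is predicted at the grand mean with the error when each person is predicted at their own group mean.

```python
from statistics import mean

# Hypothetical Y scores grouped by treatment level X (made-up numbers).
groups = {
    "a1": [2, 4, 6],
    "a2": [8, 10, 12],
    "a3": [14, 16, 18],
}

all_scores = [y for ys in groups.values() for y in ys]
grand_mean = mean(all_scores)

# Prediction 1: ignore X and predict the grand mean for everyone.
ss_total = sum((y - grand_mean) ** 2 for y in all_scores)

# Prediction 2: use X and predict each person's own group mean.
ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)

print(grand_mean, ss_total, ss_within)  # 10 240 24
```

With these numbers, knowing the group shrinks the squared error from 240 to 24, which is the sense in which information about X reduces uncertainty about Y.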

6
Q

Example
If the total range (variability) in Y is 16 and the average range in Y given X equals 6, how many points of Y’s variability are NOT attributable to X? Why?

A

6!

i.e. it’s residual variability
- if 6 of the 16 points of variability in Y are not due to X, then 10 points of the variability in Y MUST be due to X.

i.e. 16 - 6 = 10
i.e. variability in Y that is attributable to X = total variability in Y - variability in Y not attributable to X.
Thus, when trying to predict Yi (an individual raw score), if we use our knowledge of X we reduce variability by 10 points.
i.e. we reduce our uncertainty

7
Q

Give the correlation ratio

Relate this to the example

A

η² (η is the Greek letter eta) = variability in Y common to X / total variability in Y
“Variability in Y common to X” corresponds to the sum of squares between, and “common to” means attributable to.

In our example:
10/16 = .63 with range as the measure of variability –> using variance instead of range as our measure of variability, η² = .74

8
Q

Actually, η² = SSbetween/SStotal

Describe!

A
  • Is a measure of the strength of the relationship between X & Y.
  • Often used after F tests to determine practical significance –> i.e. it is a measure of effect size
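
As an illustration, η² = SSbetween/SStotal can be computed directly from grouped scores. This is only a sketch on made-up data (the groups and values are hypothetical, not the lecture's example).

```python
from statistics import mean

# Hypothetical Y scores grouped by level of X (made-up numbers).
groups = {"a1": [2, 4, 6], "a2": [8, 10, 12], "a3": [14, 16, 18]}

all_scores = [y for ys in groups.values() for y in ys]
grand_mean = mean(all_scores)

# Total variability in Y: squared deviations of every score from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_scores)

# Variability attributable to X: squared deviations of group means from the
# grand mean, weighted by group size.
ss_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())

eta_squared = ss_between / ss_total
print(ss_between, ss_total, round(eta_squared, 2))  # 216 240 0.9
```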
9
Q

Give the limitations of η² as a measure of effect size:

A
  1. Reliability of variables restricts the magnitude of η²;
  2. The more homogeneous the population, the smaller η² –> restriction of range;
  3. The magnitude of η² is affected by the number of levels of X;
  4. Does not indicate the form of the relationship;
  5. It is UNSTABLE –> it varies a lot from sample to sample –> therefore descriptive stat only, not inferential.
10
Q

Due to the limitations of η², what do we focus on?

A

The Pearson Product-Moment Correlation Coefficient (r).

11
Q

What is the Pearson Product-Moment Correlation Coefficient (r)?

A

Measures the degree and direction of a linear relationship or association between two variables.

12
Q

Give the conceptual formula for r

A

r = degree to which X & Y vary together / degree to which X & Y vary separately
i.e.
r = covariability of X & Y / variability of X & Y separately

13
Q

What does it mean if r (correlation) is big?

A

It means that most of the variability of X & Y is due to how X & Y co-vary.

  • If we have a perfect linear relationship, every change in X is accompanied by a corresponding change in Y (and vice versa)
  • If r = 0 (i.e. no linear relationship), a change in X does not correspond to a predictable change in Y.
    i.e. they don’t co-vary

Again: rxy = COVxy / (σhat_x σhat_y)
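
A short Python sketch of rxy = COVxy / (σhat_x σhat_y), worked on made-up paired scores (the data are hypothetical). Both the covariance and the standard deviations use the n - 1 divisor.

```python
from statistics import mean, stdev

# Hypothetical paired scores (made-up numbers for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = mean(x), mean(y)

# Covariance: sum of cross-products of deviation scores, divided by n - 1.
sp = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
cov_xy = sp / (n - 1)

# stdev() also uses the n - 1 divisor, so it plays the role of σhat here.
r = cov_xy / (stdev(x) * stdev(y))
print(sp, cov_xy, round(r, 3))  # 6 1.5 0.775
```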

14
Q

Close-up of the numerator of r, COVxy (σhat_xy):

Give the equation in words.

A

Sum of the cross-products of the deviation scores (SP, the sum of products of deviations) divided by n - 1

In other words, COVxy = the average cross-product of the deviation scores of the 2 variables (averaging over n - 1).

15
Q

Give the conceptual formula for Sum of products

A

SP = Sum(Xi - Xbar)(Yi - Ybar) = SumXY - (SumX)(SumY)/N

16
Q
Note: conceptual vs. computational formulas for SS and SP.

A
Note: Conceptual:
SS = Sum(Y - Ybar)(Y - Ybar)
SP = Sum(X - Xbar)(Y - Ybar)
Computational:
SS = SumYY - (SumY)(SumY)/N
SP = SumXY - (SumX)(SumY)/N

Sum of squares (SS) uses squares and sum of products (SP) uses cross-products
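
To check that the conceptual and computational forms agree, here is a small Python sketch on made-up scores (the data are hypothetical).

```python
# Hypothetical paired scores (made-up numbers for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Conceptual forms: built from deviation scores.
sp_conceptual = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_conceptual = sum((yi - y_bar) ** 2 for yi in y)

# Computational forms: built from raw sums only.
sp_computational = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_computational = sum(yi * yi for yi in y) - sum(y) ** 2 / n

print(sp_conceptual, sp_computational)  # 6.0 6.0
print(ss_conceptual, ss_computational)  # 6.0 6.0
```

SS is just the special case of SP in which a variable is paired with itself.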

17
Q

When will we get large positive values for SP?

A
  • Largest positive (Xi - Xbar) is paired with largest positive (Yi - Ybar)
  • Second largest positive (Xi - Xbar) is paired with second largest positive (Yi - Ybar)
  • Largest negative (Xi - Xbar) is paired with largest negative (Yi - Ybar)
  • Second largest negative (Xi - Xbar) is paired with second largest negative (Yi - Ybar)
    etc.
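
The pairing pattern above can be sketched in Python by keeping the same Y values and changing only how they are paired with X. The data and the helper name sum_of_products are made up for illustration.

```python
def sum_of_products(x, y):
    """SP: sum of cross-products of deviations from each variable's mean."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

x = [1, 2, 3, 4, 5]
y_matched = [10, 20, 30, 40, 50]   # largest X paired with largest Y, and so on
y_shuffled = [30, 10, 50, 20, 40]  # same Y values, pairing scrambled
y_reversed = [50, 40, 30, 20, 10]  # largest X paired with smallest Y

sp_matched = sum_of_products(x, y_matched)
sp_shuffled = sum_of_products(x, y_shuffled)
sp_reversed = sum_of_products(x, y_reversed)

print(sp_matched, sp_shuffled, sp_reversed)  # 100.0 30.0 -100.0
```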
18
Q

Judging from when SP takes its largest values, what does the covariance of X and Y (COVxy) measure?

A

The strength and direction of the relationship, independent of the number of scores (because we divide by N-1 which is our degrees of freedom).

19
Q

Why can’t we compare covariances from different scales?

A

The covariance of X and Y depends on the variability of X and Y, which depends on the measurement scales, so we can’t compare covariances from different scales. Also, differences in the covariance of X and Y may be due to differences in σhat. Therefore, COVxy must be corrected for differences in σhat.
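
A small Python sketch of this scale dependence: the same hypothetical study times are recorded in hours and then in minutes, and the covariance with Y changes even though the relationship itself does not. The variable names and data are made up for illustration.

```python
def covariance(x, y):
    """Sum of cross-products of deviation scores divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Hypothetical data: study time and a test score (made-up numbers).
hours = [1, 2, 3, 4, 5]
minutes = [60 * h for h in hours]  # identical information, different unit
score = [2, 4, 5, 4, 5]

cov_hours = covariance(hours, score)
cov_minutes = covariance(minutes, score)

print(cov_hours, cov_minutes)  # 1.5 90.0
```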

20
Q

How do we correct for differences in σhat?

A

By dividing COVxy (the covariance of X and Y) by the product of σhat_X and σhat_Y
i.e. σhat_X σhat_Y (the denominator of r)

21
Q

Why does dividing COVxy by σhat_X σhat_Y correct for differences in σhat for r?

A

Because we divide by σhat_X and σhat_Y, r is independent of the dispersions of the two variables and is a dimensionless index of a linear relationship –> i.e. it is not dependent on the unit of measurement or variability of either X or Y.

22
Q

Note: r expressed in terms of z scores.

A

z = (X - Xbar)/σhat, so r = Σ[((X - Xbar)/σhat_X)((Y - Ybar)/σhat_Y)] / (n - 1)
i.e. r = Σ(z_X z_Y) / (n - 1)
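
The z-score form can be sketched in Python as follows. The data are made up, and z_scores is a hypothetical helper that standardizes with the n - 1 standard deviation.

```python
from statistics import mean, stdev

# Hypothetical paired scores (made-up numbers for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

def z_scores(values):
    """Convert raw scores to z scores using the n - 1 standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

z_x, z_y = z_scores(x), z_scores(y)

# r is the sum of z-score cross-products divided by n - 1.
r_from_z = sum(zx * zy for zx, zy in zip(z_x, z_y)) / (n - 1)
print(round(r_from_z, 3))  # 0.775
```

Because z scores have a standard deviation of 1, this sum divided by n - 1 is also the covariance of the z scores.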

23
Q

What does r equal if X and Y are z scores?

A

r = σhat_XY (the covariance of the z scores)

24
Q

What does the equation r = Σ[((X - Xbar)/σhat_X)((Y - Ybar)/σhat_Y)] / (n - 1) mean?

A

That r is independent of the number of scores and of the variability of X and Y.

25
Q

Describe how transformations affect r

A

r is unaffected by linear transformations.
Thus, r is identical if computed between:
- raw scores
- z scores
- raw scores with a constant added to or subtracted from each
- raw scores each multiplied by a constant
- raw scores each divided by a constant
(These are all linear transformations; multiplying or dividing by a negative constant reverses the sign of r but not its magnitude.)
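
A quick Python check of these invariances on made-up scores. The helper pearson_r is a hypothetical name; it uses SP / sqrt(SSx * SSy), which equals COVxy / (σhat_x σhat_y) because the n - 1 terms cancel. The last case shows that a negative multiplier flips the sign of r.

```python
from math import sqrt

def pearson_r(x, y):
    """r = SP / sqrt(SSx * SSy); the n - 1 divisors cancel out."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sp = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical paired scores (made-up numbers for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_raw = pearson_r(x, y)
r_shifted = pearson_r([xi + 10 for xi in x], y)                   # add a constant
r_scaled = pearson_r([xi * 3 for xi in x], [yi / 2 for yi in y])  # multiply, divide
r_negated = pearson_r([-2 * xi for xi in x], y)                   # negative multiplier

print(round(r_raw, 3), round(r_shifted, 3), round(r_scaled, 3), round(r_negated, 3))
```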

26
Q

What happens when we square r?

A

It results in a ratio of variances
i.e. r²xy = shared variance/total variance, aka common variance/total variance, and many other equivalent forms (review briefly in notes)

27
Q

What do we get when we multiply r²xy by 100?

A

The percentage of σ² in Y accounted for by (knowing) X.

28
Q

What do we get when we subtract r²xy from 1?

A

1 - r²xy = proportion of σ² in Y NOT accounted for by X.

29
Q

What is r² another measure of?

A

effect size.

30
Q

Why is r² another measure of effect size?

A

It is the proportion of variance in Y attributable to X.
E.g. say r = .2, so r² = .04. Who cares? This is too small. Say Y is memory, and X is coffee consumption. So we account for only 4% of the variance in memory by knowing someone’s coffee consumption.
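
The arithmetic behind r², the percentage of variance accounted for, and 1 - r² can be sketched in a few lines of Python, using the r = .2 value from this card (the variable names are illustrative).

```python
r = 0.2  # correlation from the coffee and memory example

r_squared = r ** 2                      # proportion of variance in Y accounted for by X
percent_explained = 100 * r_squared     # the same quantity as a percentage
proportion_unexplained = 1 - r_squared  # proportion of variance NOT accounted for by X

print(round(r_squared, 2), round(percent_explained, 1), round(proportion_unexplained, 2))
# 0.04 4.0 0.96
```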

31
Q

What can we do if there is a statistical association between two variables?

A

We can use this information to make predictions about one variable, or test hypotheses about group differences.
This brings us to Regression.

32
Q

Assumption: Y_ia = Y.. + alpha_a + e_ia is true.

A

i.e. this linear model is correct.

33
Q

Is it an assumption of ANOVA that X and Y are linearly related?

A

“NO!” underliiiiined (Two points on final).