fuck Flashcards
(17 cards)
What are correlation and regression
Look for relationships
Look at similarities between samples instead of differences
See whether one sample varies alongside the variations of the other sample (covariance - how 2 variables shift together)
What is covariance
Strong covariance = greater similarity in movement
Sum of the products of the paired deviations of the two variables from their means
Measurement of how the two variables deviate together
Does not take error into consideration - to do so, find the ratio of the covariance to the individual deviations of the variables
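A minimal sketch of the "sum of products of deviations" idea, using made-up scores. The function names are illustrative, and dividing by n - 1 to get a sample covariance is an assumption about which version of the formula the course uses.

```python
def sum_of_products(x, y):
    """Sum of the products of paired deviations from each variable's mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))


def sample_covariance(x, y):
    """Sum of products divided by n - 1 (assumed sample formula)."""
    return sum_of_products(x, y) / (len(x) - 1)


# Made-up paired scores, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

sp = sum_of_products(x, y)
cov = sample_covariance(x, y)
print(sp, cov)
```

A positive result means the two variables tend to deviate from their means in the same direction; a negative result means they tend to deviate in opposite directions.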
How to control for error
Coefficient called Pearson’s r
What is correlation
Statistical technique used to measure and describe a relationship between 2 variables
2 variables are observed as they exist naturally; no attempt to control or manipulate variables
Ex. Height and Intelligence, socioeconomic status and length of marriage
What can correlation be used for
Prediction - based on trends, not on causality
Validity - measurement and testing; are scales/tools able to measure the right concepts for a study
Reliability - are the scales/tools used for measuring able to measure consistently
Theory verification - are theories able to correctly provide explanations
What are the characteristics of a correlation
- Shape - Straight line, since correlation measures linear relationships only
- Direction - either positive (X increases Y also increases) or negative (X increases Y decreases)
- Strength - measures the degree to which the points fit the straight line; if all points fall exactly on the line, a perfect relationship exists. The more the scatterplot resembles a line, the stronger the correlation.
1.00 = perfect relationship while 0.00 = no relationship
- Significance - does the observed correlation between X and Y really exist in the population, or is it due to chance or error
Use Table F, Pearson r for alpha at .05 or .01 with df = 1-90
Interpret the correlation only when the result is significant; if it is not significant, do not interpret it
What are the values for strength in correlation
.00 - .29 - weak
.30 - .69 - moderate
.70 - 1.00 - strong
What are the strengths of correlation
Describes the relationship between only 2 variables
Natural observation - no interference or manipulation
accurately reflects the natural events being examined
What are the weaknesses of correlation
A third variable may interfere with the two variables and can be responsible for the observed relationship
Does not determine cause or effect - non-directional relationship = neither variable can claim precedence
Does not produce a clear and unambiguous explanation for the relationship
What is Pearson's r
Pearson Product-Moment Correlation Coefficient
The most commonly used correlation coefficient
Measures the degree of straight-line relationship between two variables at a time
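A hedged sketch of how Pearson's r can be computed by hand: the sum of products of deviations divided by the square root of the product of the two sums of squared deviations. The function name and the scores are made up for illustration; this is one standard form of the formula, not necessarily the exact layout used in the course.

```python
import math


def pearson_r(x, y):
    """Pearson's r = SP / sqrt(SSx * SSy), using deviations from the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sp / math.sqrt(ss_x * ss_y)


# Made-up paired scores, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3))
```

Dividing by the individual deviations is what turns the raw covariance into a value bounded between -1.00 and +1.00.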
What are the assumptions and requirements of Pearson's r
A straight-line (linear) relationship
Both X and Y must be measured at least at the interval level
Sample members must be randomly drawn from the population
Both X and Y variables must be normally distributed
Sample size must be at least 30 to disregard normality violations
How do we test for correlation
- State the hypotheses
Ho: There is no correlation between X and Y
Ha: There is a correlation between X and Y
- Set the level of significance at .05
- Compute Pearson's r
- Interpret - use Table F
df = N - 2, where N is the number of paired scores
Reject Ho when the absolute value of r obtained is greater than r critical (r obt > r crit)
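The decision step above can be sketched as a small helper. The function name is hypothetical, and the critical value must be looked up by hand: the 0.878 used here is the commonly tabled two-tailed value for df = 3 at alpha = .05, and should be checked against your own copy of Table F.

```python
def correlation_decision(r_obtained, n_pairs, r_critical):
    """Return (df, reject_null) for a two-tailed test of Ho: no correlation.

    r_critical is assumed to come from a critical-values table for the
    chosen alpha and df = n_pairs - 2.
    """
    df = n_pairs - 2
    # Compare magnitudes so that negative correlations are handled too
    reject_null = abs(r_obtained) > r_critical
    return df, reject_null


# Illustrative values: r = 0.7746 from 5 paired scores,
# r_critical = 0.878 (assumed table value for df = 3, alpha = .05, two-tailed)
df, reject_null = correlation_decision(0.7746, 5, 0.878)
print(df, reject_null)
```

With only 5 pairs, even a fairly large r does not exceed the critical value, so Ho is retained and the correlation is not interpreted.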
How do we interpret the score that we computed using Pearson’s R
.00 - .29 = Weak
.30 - .69 = Moderate
.70 - 1.00 = Strong
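As a quick check on the cut-offs above, here is a minimal sketch that labels a correlation by its strength. The function name is made up, and applying the bands to the absolute value of r (so that -.50 counts as moderate) is an assumption, since the card lists only positive values.

```python
def strength_label(r):
    """Label a correlation using the .00-.29 / .30-.69 / .70-1.00 bands."""
    size = abs(r)  # direction is read from the sign, strength from the size
    if size < 0.30:
        return "weak"
    if size < 0.70:
        return "moderate"
    return "strong"


for r in (0.12, -0.50, 0.85):
    print(r, strength_label(r))
```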
What are the final notes on correlation
Correlation score should not be interpreted as a proportion
Look at the strength and direction of the correlation value
Does not imply causation - the existence of a correlation does not imply the existence of a causal link between two variables
Describes relationship between two variables and does not explain why they are related
What are the other final notes on correlation (Factors that can affect correlation)
Possible that the correlation is due to a common third variable (one that causes both variables)
Correlation is affected by restriction of the range - if only a restricted, more homogeneous and selective subset of the entire range is included, expect the correlation to be weaker
Must ensure that the entire range (or the widest range possible) of X and Y values is sampled
Getting only a subset of the entire range of X and Y values will weaken the correlation
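A small illustration of restriction of range, using made-up scores chosen so the effect is easy to see. The helper name and the cut-off (keeping only X from 4 to 7) are assumptions for the demo; real data will not always behave this neatly.

```python
import math


def pearson_r(x, y):
    """Pearson's r = SP / sqrt(SSx * SSy), using deviations from the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sp / math.sqrt(ss_x * ss_y)


# Made-up data: Y follows X with a small up-and-down wobble
x_full = list(range(1, 11))
y_full = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

# Keep only the middle of the X range (a more homogeneous subset)
restricted = [(xi, yi) for xi, yi in zip(x_full, y_full) if 4 <= xi <= 7]
x_restricted = [xi for xi, _ in restricted]
y_restricted = [yi for _, yi in restricted]

r_full = pearson_r(x_full, y_full)
r_restricted = pearson_r(x_restricted, y_restricted)
print(round(r_full, 3), round(r_restricted, 3))
```

The same wobble takes up a larger share of the variability once the range of X is narrowed, so the restricted r comes out smaller than the full-range r.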
How to determine the accuracy of our prediction
By using r squared which is the coefficient of determination
What does the coefficient of determination do
Squaring r gives the proportion (or percentage) of variability in the DV that is determined by the IV
The value obtained tells the proportion of the DV that is predicted by the IV
Ex. r = -0.50 means the variables are partially associated, but the shared variability is only r squared = 0.25, or 25% of the total variability
There are other (extraneous) variables that are affecting the relationship
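The arithmetic in the example above, as a minimal sketch (the function name is illustrative):

```python
def coefficient_of_determination(r):
    """Proportion of variability in the DV accounted for by the IV."""
    return r ** 2


r = -0.50
r_squared = coefficient_of_determination(r)
unexplained = 1 - r_squared  # share left to other (extraneous) variables
print(r_squared, unexplained)
```

The sign of r disappears when it is squared, so r = -0.50 and r = +0.50 account for the same 25% of the variability.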