Psych Stats Exam #4 Flashcards

(82 cards)

1
Q

How is correlational research different from experimental?

A

1) no manipulation of the IV
2) no random assignment
3) at least 2 DVs measured

2
Q

Purpose of correlational research

A

to explore association between variables

3
Q

Correlation definition

A

the linear association between variables

4
Q

What does the correlation coefficient provide?

A

an indicator of the strength and direction of a linear relationship

5
Q

Visualizing correlation

A

scatterplots: each point represents two measurements of the same person
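Below is a minimal matplotlib sketch (not part of the original deck; the scores are hypothetical) showing how such a scatterplot is built, with one point per person:

```python
import matplotlib.pyplot as plt

# Hypothetical paired measurements: two scores per person
hours_studied = [1, 2, 3, 4, 5, 6, 7, 8]
exam_score = [52, 60, 57, 68, 70, 75, 74, 83]

# Each point represents one person's pair of measurements
plt.scatter(hours_studied, exam_score)
plt.xlabel("Hours studied (X)")
plt.ylabel("Exam score (Y)")
plt.title("Scatterplot of paired scores")
plt.show()
```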

6
Q

Things to look for in a scatterplot

A
  • direction
  • scatter/dispersal
  • shape
    -outliers
7
Q

Negative correlation

A

subjects with high scores on one variable tend to have low scores on the other variable

“when a score of X is above the mean of X, scores of Y will tend to be below the mean of Y” (and vice versa)

8
Q

Positive Correlation

A

subjects with high scores on one variable tend to have high scores on the other variable (or low/low)

“when a score of X is above the mean of X, scores of Y will tend to be above the mean of Y” (and below/below)

9
Q

Correlation coefficient definition (r)

A

statistic that quantifies the linear relationship between two variables
“a measure of the tendency for paired scores to vary systematically”

10
Q

What does the sign of r tell us?

A
  • direction NOT magnitude
11
Q

R value ranges

A

-1 to +1
- the absolute value tells us magnitude

12
Q

Perfect linear relationship

A

+1 or -1 (usually don’t exist in nature)

13
Q

R effect size guidelines

A

small: 0.1
medium: 0.3
large: 0.5

14
Q

R as a descriptive statistic

A

describes effect size

15
Q

R as an inferential statistic

A

you can compare it to a critical value to see whether it falls in the rejection region

16
Q

Null hypothesis of correlation

A

there is no linear relationship between A and B (ρ = 0)

17
Q

r for a population

A

rho (ρ)

18
Q

degrees of freedom for correlation

A

df(r) = N - 2
- N = number of pairs of observations (20 data points = 10 pairs)

19
Q

Example of correlation write-up

A

“there is a statistically significant negative correlation - a negative linear relationship - between number of absences and exam score, r(8) = -0.85, p < 0.05. The more classes students miss, the worse they tend to perform on the exam.”
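As a sketch of where numbers like these could come from (hypothetical data, using scipy; 10 pairs, so df = 8):

```python
from scipy import stats

# Hypothetical data: N = 10 pairs of scores, so df = N - 2 = 8
absences = [0, 1, 2, 2, 3, 4, 5, 6, 7, 8]
exam_score = [95, 88, 92, 80, 85, 78, 72, 60, 68, 55]

r, p = stats.pearsonr(absences, exam_score)
df = len(absences) - 2  # df(r) = N - 2

# Report in the write-up's format: r(df) = value, p
print(f"r({df}) = {r:.2f}, p = {p:.4f}")
```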

20
Q

Correlation…

A

does not equal causation

21
Q

Factors that influence r

A

1) truncated range
2) outliers
3) non-linear relationships

22
Q

Truncated range

A

zooming in on one group of people (ex: just high or low scores)
- can alter the correlation: misrepresents the true strength of the existing relationship by restricting the range of scores

23
Q

Outliers and small sample sizes

A

can mask or exaggerate a relationship between variables
- with a small sample size, outliers heavily affect results
- extremity of outlier: very extreme outliers have larger influences

24
Q

Pearson’s correlation coefficient

A

for linear relationships only
used for parametric tests (scale DV)

25
Q

Examples of nonparametric inferential tests

A

- chi-squared tests
- Mann-Whitney U test

26
Q

Spearman’s correlation

A

used in nonparametric tests

27
Q

When do we use nonparametric tests?

A

1) when assumptions of parametric tests are not met (population skewed or relationship non-linear)
2) small sample sizes (usually under 30)
3) DV is not scale (ordinal or nominal)

28
Q

Disadvantages of nonparametric tests

A

1) tend to have low statistical power (higher probability of Type II error)

29
Q

Chi-squared

A

used when we only have a nominal variable: “how different are the observed values from the expected values under the null hypothesis?”
- χ2 = Σ (O - E)2 / E

30
Q

What is “O”?

A

observed value

31
Q

What is “E”?

A

expected value (under the null hypothesis)

32
Q

What is Σ?

A

sigma: summation

33
Q

What is χ2?

A

chi-squared: the test statistic

34
Q

Types of chi-squared tests

A

1) chi-squared test for goodness of fit: one nominal variable, 2+ categories
- df = number of categories - 1
2) chi-squared test for independence: 2 nominal variables
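A minimal scipy sketch of both tests (the counts and the 2x2 table are hypothetical):

```python
from scipy import stats

# Goodness of fit: one nominal variable with 4 categories
observed = [30, 25, 25, 20]  # O: observed frequencies
expected = [25, 25, 25, 25]  # E: expected frequencies under the null
chi2, p = stats.chisquare(observed, f_exp=expected)
print(f"goodness of fit: chi2({len(observed) - 1}) = {chi2:.2f}, p = {p:.3f}")

# Independence: two nominal variables in a 2x2 contingency table
table = [[20, 30],
         [25, 25]]
chi2, p, df, _ = stats.chi2_contingency(table)
print(f"independence: chi2({df}) = {chi2:.2f}, p = {p:.3f}")
```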
35
Q

Misuse of NHST parts

A

1) failure to control for bias
2) low statistical power
3) poor quality control
4) p-hacking
5) publication bias

36
Q

What is the replication crisis?

A

an ongoing methodological crisis in which many psychological findings are difficult to replicate or reproduce

37
Q

Reproducibility

A

obtaining consistent results using the same original data, methodology, and analysis

38
Q

Replicability

A

obtaining consistent results across several studies that aim to answer the same question with different data

39
Q

Open Science Collaboration

A

- attempted to reproduce the findings of 100 journal articles
- 270 scientists
- only 39% replicated

40
Q

Power posing

A

- only self-reported feelings replicated; no physiological impact

41
Q

Smiling makes you happier

A

did not hold up

42
Q

P-value definition

A

the probability of your observed results (or results more extreme) occurring if the null is true

43
Q

Why reliance on p-values can be misleading

A

1) can result in binary thinking: 0.049 is significant but 0.051 is not
2) statistical significance is not necessarily meaningful (need to look at effect size)

44
Q

Tools to use besides p-values

A

1) confidence intervals: a more precise and accurate measure of the sample mean as an estimate of the true population mean
- smaller interval = better precision
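A minimal sketch of a 95% confidence interval around a sample mean (hypothetical scores, using scipy's t distribution):

```python
import numpy as np
from scipy import stats

# Hypothetical sample of scores
scores = np.array([72, 75, 68, 80, 77, 74, 69, 78, 73, 76])

mean = scores.mean()
sem = stats.sem(scores)  # standard error of the mean
df = len(scores) - 1

# 95% CI for the population mean; a smaller interval = better precision
low, high = stats.t.interval(0.95, df, loc=mean, scale=sem)
print(f"M = {mean:.1f}, 95% CI [{low:.1f}, {high:.1f}]")
```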
45
Q

Significant result but small effect size

A

something may be there, but it is not meaningful

46
Q

Non-significant result but large effect size

A

might indicate you missed something (Type II error)
- might indicate low power

47
Q

P-hacking (ways to increase power)

A

1) use a higher alpha
2) use a one-tailed hypothesis instead of a two-tailed one
3) increase sample size
4) somehow reduce variability
5) somehow make the difference between population means bigger

48
Q

P-hacking (definition)

A

the misuse of data analysis to find and report statistically significant effects
- data dredging, data snooping, significance chasing

49
Q

Ways to p-hack

A

1) trimming data sets (getting rid of outliers, zooming in)
2) adjusting values in the data set (to what you think participants “mean”)
3) significance chasing: adding a few more participants at a time until the result becomes statistically significant
4) selective reporting: running many analyses but only reporting the ones that showed the desired effect

50
Q

Debunking published research

A

very hard
- once we see reported evidence, it is hard to change our perceptions

51
Q

Publication bias

A

journals tend to publish significant results
- may lead researchers to engage in shady research practices
- results in a biased, incomplete understanding: it is important to know what is NOT different as well
- the “file drawer problem”

52
Q

Best practices

A

- publish what you plan to collect and analyze so you don’t adjust later
- people are held accountable

53
Q

Simple regression

A

use data to produce an equation for a straight line that captures the trend of the data
- used to make predictions about Y given a particular X score

54
Q

Multiple regression

A

use data to produce an equation for a line including MANY variables
- multiple predictor variables
- can compare the strength of different variables in how they jointly affect Y

55
Q

IV in regression

A

predictor variable

56
Q

DV in regression

A

outcome/criterion variable

57
Q

Line of best fit

A

captures the best trend of the data

58
Q

Simple linear regression equation

A

ŷ = a + bX
- ŷ = predicted score on the outcome
- a = intercept
- b = slope of the regression line (predicted change in Y for an increase of 1 unit in X); the unstandardized regression coefficient
- you cannot flip the variables and get the same regression
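A minimal sketch of fitting and using this equation (hypothetical X and Y scores; scipy's linregress performs the OLS fit described in the next card):

```python
from scipy import stats

# Hypothetical predictor (X) and outcome (Y) scores
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3, 5, 4, 7, 8, 8, 10, 11]

# OLS fit: a = intercept, b = slope in y-hat = a + bX
fit = stats.linregress(x, y)
a, b = fit.intercept, fit.slope

new_x = 9  # a hypothetical new X score
y_hat = a + b * new_x  # predicted Y for that X
print(f"y-hat = {a:.2f} + {b:.2f}X; prediction at X = {new_x}: {y_hat:.2f}")
```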
59
Q

Ordinary least squares (OLS) estimation

A

used to draw the line that minimizes error/residuals

60
Q

Standardized beta

A

a 1 standard deviation increase in (IV) is related to a (beta value) standard deviation increase in (DV)
- used in multiple regression
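One way to see what “standardized” means, sketched for the simple one-predictor case with hypothetical data: z-score both variables and refit; the resulting slope is the standardized beta, and in simple regression it equals Pearson’s r:

```python
import numpy as np
from scipy import stats

# Hypothetical raw scores
x = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 6.0, 7.0])

# z-score both variables, then refit: the slope is now the standardized beta
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
beta = stats.linregress(zx, zy).slope

# In simple regression, the standardized beta equals Pearson's r
r, _ = stats.pearsonr(x, y)
print(f"beta = {beta:.3f}, r = {r:.3f}")
```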
61
Q

Write-up for multiple regression (beta)

A

“Controlling for all other measured variables (TV exposure, age, lower grades, parent education, and education aspirations), exposure to sexual content on TV is still a significant predictor of pregnancy”

62
Q

Intercorrelated

A

all variables relate to one another

63
Q

Regression cannot:

A

1) establish temporal precedence: we do not know what came first (cannot determine cause and effect)
2) control for variables that aren’t measured (we cannot measure all the variables in the world)

64
Q

How is regression different from correlation?

A

Correlation: association between 2 variables
Regression: prediction of the DV using the IV

65
Q

When to use Mann-Whitney U

A

test for a significant difference between two independent samples (two levels of IV, ordinal/nominal DV)
- parametric partner: independent-samples t-test

66
Q

When to use Wilcoxon signed-rank T test

A

test for a significant difference between two paired samples (two levels of IV, nominal/ordinal DV)
- parametric partner: paired-samples t-test

67
Q

When to use Wilcoxon-Wilcox comparison test

A

test for significant differences among all pairs of independent samples (three levels of IV, ordinal/nominal DV)
- parametric partner: one-way ANOVA, Tukey HSD tests

68
Q

When to use Spearman correlation coefficient

A

describe the degree of correlation between two variables (nominal/ordinal DV)
- parametric partner: Pearson coefficient (r)
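A minimal scipy sketch of three of these tests on hypothetical ratings (the Wilcoxon-Wilcox comparison test has no direct scipy equivalent, so it is omitted):

```python
from scipy import stats

# Hypothetical ordinal ratings from two independent groups
group_a = [3, 4, 2, 5, 4, 3, 5, 4]
group_b = [2, 1, 3, 2, 4, 2, 1, 3]

# Mann-Whitney U: two independent samples
u, p = stats.mannwhitneyu(group_a, group_b)
print(f"Mann-Whitney U = {u:.1f}, p = {p:.3f}")

# Wilcoxon signed-rank: two paired samples (e.g., pre/post ratings)
pre = [5, 8, 3, 9, 7, 2, 6, 4]
post = [6, 10, 7, 14, 13, 9, 14, 13]
t_stat, p = stats.wilcoxon(pre, post)
print(f"Wilcoxon T = {t_stat:.1f}, p = {p:.3f}")

# Spearman correlation: degree of association between two variables
rho, p = stats.spearmanr(group_a, group_b)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```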
69
Q

When is the mean larger than the median?

A

positively skewed data (the mean is pulled toward the high tail)

70
Q

When is the mean smaller than the median?

A

negatively skewed data (the mean is pulled toward the low tail)
71
Q

Descriptive statistics

A

summarizing a distribution of data with a single number
- the conclusions you draw from the numbers themselves

72
Q

Sample size and rejecting the null

A

as sample size increases, it becomes easier to reject the null

73
Q

Parameters

A

numbers describing the population
- μ (mu) = mean
- σ (sigma) = standard deviation

74
Q

Statistics

A

numbers describing a sample
- M = mean
- s = standard deviation

75
Q

Practical use of power

A

- can be used to determine the sample size required to detect a given effect size
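A sketch of that calculation, assuming the statsmodels library (not mentioned in the deck): solve for the per-group n needed to detect a medium effect (d = 0.5) at alpha = .05 with power = .80 in an independent-samples t-test:

```python
from statsmodels.stats.power import TTestIndPower

# Solve for the sample size per group given effect size, alpha, and power
analysis = TTestIndPower()
n = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"required n per group: {n:.1f}")  # roughly 64 per group
```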
76
Q

Type I error

A

false alarm: you said there is an effect, but there is no effect

77
Q

Type II error

A

miss: you missed an effect that was actually there

78
Q

Statistical power definition

A

the probability that we will correctly reject the null when we should

79
Q

What is NHST?

A

null-hypothesis significance testing
- testing against a null hypothesis (no significant difference) to see how unusual your results are

80
Q

Robust parametric tests

A

when an assumption of a parametric test is violated but the test still operates (mostly) as intended
- the tests we’ve covered this semester are robust against violations of the assumption of normality

81
Q

Spearman vs Pearson correlation

A

Pearson:
- parametric: scale DV
Spearman:
- nonparametric: nominal/ordinal DV

82
Q

If assumptions of a parametric test are met and you use a nonparametric test, you are more likely to...

A

make a Type II error