Summa Week 6 Flashcards

(118 cards)

1
Q

What is the GLM?

A

The GLM, or General Linear Model, is the conceptual framework unifying a large set of statistical methods

2
Q

What can GLM answer?

A

almost any question if the DV is continuous

3
Q

What must the DV be for the GLM to be able to answer a question?

A

it must be continuous, and therefore NOT categorical

4
Q

What are familiar statistics within the GLM?

A

t-tests
correlation
ANOVA
regression

5
Q

What is bivariate correlation?

A

an association between scores on 2 random variables

e.g. hrs spent on Twitter and # followers

6
Q

What is the relation in correlation?

A

a straight line, or “linear correlation” or “linear regression”

7
Q

What is assumed about a correlation?

A

it can be a straight line if a linear correlation or regression, or a curved one referred to as a non-linear regression

8
Q

How are correlations detected by the eye?

A

usually by a scatterplot

9
Q

What are scatterplots?

A

a graph that shows the degree and pattern of the relationship between two variables

10
Q

What is on the horizontal axis of a scatterplot?

A

a variable that does the predicting (likely arbitrary, and an IV)
e.g. hours of studying, income

11
Q

What is on the vertical axis of a scatterplot?

A

usually the variable that is predicted (likely a DV)

e.g. grades, happiness

12
Q

What is the shape of a positive bivariate distribution or a scatterplot?

A

a line from the bottom left to the top right

13
Q

What is the shape of a negative bivariate distribution or a scatterplot?

A

line from top left to bottom right

14
Q

What is the shape of no correlation in a scatterplot?

A

there are datapoints scattered all over the plot

15
Q

What is the shape of a curvilinear distribution or a scatterplot?

A

a curved pattern, such as an inverted-U (bell-like) shape, on the scatterplot

16
Q

What is the correlation coefficient?

A

a bivariate statistic that measures the degree of linear association between two quantitative variables

17
Q

What is covariance?

A

a number that reflects the degree to which two variables vary together

18
Q

What represents the correlation coefficient?

A

italicized r

19
Q

What does italicized r indicate?

A

the precise degree of linear correlation between two variables, or the correlation coefficient

20
Q

What is the range of italicized r?

A

the range for a correlation coefficient is between -1 and 1 [-1,1]

21
Q

What is the formula for covariance?

A

COVxy = sum of [(X - mean of X)(Y - mean of Y)] / (N - 1)

22
Q

What is the formula for a correlation coefficient?

A

r = COVxy / (Sx × Sy)
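
As a rough illustration, the covariance and correlation formulas (the sum of cross-products of deviations divided by N - 1, then divided by the product of the two standard deviations) can be sketched in Python. The data and the function names `covariance` and `pearson_r` are made up for this example.

```python
import math

def covariance(xs, ys):
    """Sum of cross-products of deviations from the means, divided by N - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_r(xs, ys):
    """r = COVxy / (Sx * Sy), where Sx and Sy use N - 1 in the denominator."""
    s_x = math.sqrt(covariance(xs, xs))  # the covariance of X with itself is its variance
    s_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (s_x * s_y)

# Hypothetical scores, not taken from the course materials
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = covariance(x, y)
r = pearson_r(x, y)
print(cov_xy, round(r, 4))
```

Note that the covariance depends on the units of X and Y, while r is unit-free and always falls between -1 and 1.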

23
Q

Who developed the product-moment correlation coefficient?

A

Pearson

24
Q

What is an assumption of random sampling for Pearson’s r?

A

each sample is a random sample from its population

25
How robust is Pearson's r to violations of the random sampling assumption?
computing r is considered inappropriate if the assumption is violated, but some argue it is robust to violations
26
What is an assumption for Pearson's r in regards to independence of cases?
each case is NOT influenced by other cases
27
How robust is Pearson's r to violations of the independence of cases?
NOT robust to violations
28
What is an assumption for Pearson's r in regards to normality?
the DV is normally distributed in each population
29
How robust is Pearson's r to violations of normality?
robust to violations if the sample size is large
30
What is an assumption for Pearson's r in regards to linearity?
the relationship between the two variables in the population is a linear one
31
How robust is Pearson's r to violations of linearity?
not robust
32
What is the symbol for the adjusted correlation coefficient?
Radj
33
What is Radj?
when the number of observations is small, the sample correlation will be a BIASED estimate of the population correlation coefficient (not accurate). To correct for this, we can compute Radj, which is an unbiased estimate of the population correlation coefficient
34
What is the formula for Radj?
Radj = square root of (1 - [(1 - r^2)(N - 1)/(N - 2)])
35
What is the reduced formula for Radj?
Radj = square root of (1 - [(N - 1)/(N - 2)](1 - r^2))
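
A small sketch of the Radj formula (the square root of 1 minus (1 - r^2)(N - 1)/(N - 2)) in Python may help. The function name `adjusted_r` and the example values are hypothetical. Returning 0 when the quantity under the square root is negative is an assumption of this sketch, not something stated on the cards.

```python
import math

def adjusted_r(r, n):
    """Adjusted correlation: sqrt(1 - (1 - r^2)(N - 1)/(N - 2))."""
    inside = 1 - (1 - r ** 2) * (n - 1) / (n - 2)
    # Assumption: for a small r and small N the quantity under the root can be
    # negative; this sketch reports 0.0 in that case instead of failing.
    return math.sqrt(max(inside, 0.0))

# Hypothetical example: r = .50 from N = 10 observations
r_adj_small_n = adjusted_r(0.5, 10)
# The same r from a much larger sample is barely adjusted
r_adj_large_n = adjusted_r(0.5, 1000)
print(round(r_adj_small_n, 4), round(r_adj_large_n, 4))
```

The adjustment shrinks r noticeably when N is small and has almost no effect when N is large.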
36
What is the symbol for the correlation coefficient in the population?
ρ (rho)
37
What is ρ (rho)?
the correlation coefficient in the population; it is the parameter that sample statistics such as r and Radj estimate
38
What value of the population correlation between X and Y, denoted by ρ (rho), does the most common null hypothesis specify?
zero
39
Why is testing ρ (rho) = 0 a meaningful test of the population correlation coefficient?
the null hypothesis being tested is really the hypothesis that X and Y are linearly independent.
40
Rejection of the hypothesis that X and Y are linearly independent leads to...
the conclusion that they are not independent, and there is some linear relationship between them
41
What is the sampling distribution of r when ρ = 0?
it is centered on 0: sample values of r above 0 are positive and those below 0 are negative
42
When ρ = 0 and the sample size is relatively large, the sampling distribution of r will be _____ with a standard error of ____
normal, sr
43
What is the formula for the standard error of r (sr)?
sr = square root of [(1 - r^2)/(N-2)]
44
What can we calculate a t-stat for a correlation coefficient as?
```
df = N - 2
t = r × (square root of (N - 2)) / (square root of (1 - r^2))
```
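
The standard error of r and the t statistic for r are two views of the same calculation, since t is r divided by sr. A minimal Python sketch, with hypothetical values r = .50 and N = 27 and made-up function names:

```python
import math

def standard_error_r(r, n):
    """sr = sqrt((1 - r^2) / (N - 2))."""
    return math.sqrt((1 - r ** 2) / (n - 2))

def t_for_r(r, n):
    """t = r * sqrt(N - 2) / sqrt(1 - r^2), evaluated on N - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical example values
r, n = 0.5, 27
s_r = standard_error_r(r, n)
t = t_for_r(r, n)
df = n - 2
print(f"sr = {s_r:.4f}, t({df}) = {t:.4f}")
```

The obtained t would then be compared with the critical t for N - 2 degrees of freedom.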
45
What factors influence the Pearson r?
(L) linearity
(O) outliers
(R) restriction of range
(C) context
46
Why does linearity affect the Pearson r?
r will underestimate the strength of the relationship to the extent that the bivariate distribution departs from linearity
47
Why do outliers influence the Pearson r?
discrepant data points, or outliers, affect the magnitude of r and the direction of the effect depending on the outlier's location in the scatterplot
48
Why does restriction of range affect the Pearson r?
other things being equal, restricted variation in either X or Y will result in a lower Pearson r than would be obtained were variation greater
49
Why does context influence Pearson r?
because of the many factors that influence r, there is no such thing as THE correlation between two variables. Rather, the obtained r must be interpreted in full view of the factors that affect it and the particular conditions under which it was obtained
50
How many possible directions of causality are possible when two variables are correlated?
3
51
What is the relationship of correlation and causality?
1st variable causes the 2nd
2nd variable causes the 1st
some 3rd variable causes both the 1st and 2nd
52
What issues are there in interpreting the correlation coefficient?
statistical significance
r^2
restriction in range
unreliability of measurement
53
What is r^2?
the proportion of common variance, i.e. the proportion of variance shared by two variables; used to compare correlations
54
What is the z formula used to do hypothesis testing and to compute confidence intervals?
z = (X̄ - μ) / σX̄ (sample mean minus population mean, divided by the standard error of the mean)
55
What if we don't know σ and had to estimate it from our sample?
if we repeatedly chose samples of size n and computed the variance as s^2 = sum of (X - X̄)^2 / (n - 1), we would create a distribution of sample variances, and the mean of that distribution of s^2 would equal σ^2
56
What is the formula for the t-score for a one-sample t-test?
t = (X̄ - μ) / sX̄ (where sX̄ = s / square root of n is the estimated standard error of the mean)
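
A short Python sketch of the one-sample t computation, assuming the usual estimated standard error s / sqrt(n). The scores and the test value of 100 are invented for illustration.

```python
import math

def one_sample_t(scores, mu0):
    """Return (t, df) for testing the sample mean against mu0."""
    n = len(scores)
    mean = sum(scores) / n
    # Sample variance with n - 1 in the denominator
    s_squared = sum((x - mean) ** 2 for x in scores) / (n - 1)
    standard_error = math.sqrt(s_squared) / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# Hypothetical scores tested against a hypothesized population mean of 100
scores = [102, 108, 97, 110, 105, 99, 104, 107]
t, df = one_sample_t(scores, mu0=100)
print(f"t({df}) = {t:.4f}")
```

The result is reported with its degrees of freedom, here n - 1 = 7.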
57
Why is s^2, on average, equal to σ^2?
the sample variance computed with n - 1 in the denominator (s^2) is an unbiased estimate of σ^2
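
The unbiasedness claim can be checked exactly on a toy case. The sketch below enumerates every possible sample of size 2, drawn with replacement, from a tiny made-up population. It then averages the sample variances computed with n - 1 and with n in the denominator. The population values are purely illustrative.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]  # hypothetical population
n = 2                      # size of each sample

# Population variance (divides by the population size)
mu = Fraction(sum(population), len(population))
sigma_squared = sum((x - mu) ** 2 for x in population) / len(population)

def sample_variance(sample, denominator):
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample) / denominator

# Every equally likely sample of size n, drawn with replacement
samples = list(product(population, repeat=n))
mean_s2_unbiased = sum(sample_variance(s, n - 1) for s in samples) / len(samples)
mean_s2_biased = sum(sample_variance(s, n) for s in samples) / len(samples)

print(sigma_squared, mean_s2_unbiased, mean_s2_biased)
```

Dividing by n - 1 gives sample variances that average out to the population variance, while dividing by n gives variances that are too small on average.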
58
How do we ensure an unbiased estimator?
we use the degrees of freedom: n - 1 for a one-sample test, whereas for ANOVA it is n - 2
59
Who figured the t-distribution?
William Sealy Gosset
60
Who wrote Student (1908), "The probable error of a mean", Biometrika?
William Sealy Gosset
61
True or false: in general t-scores are normally distributed, but the distribution of t-scores is asymmetric and has a mean of 1
False. In general, t-scores are not normally distributed, but the distribution of t-scores is symmetric and has a mean of zero
62
How do t-distributions vary?
according to the degrees of freedom
63
Does sample size influence the t-distribution?
yes: sample size sets the degrees of freedom, which determine the shape of the t-distribution
64
With ___ observations, the t-distribution approximates the z-distribution. What's the answer?
n = 30 or more
65
What are degrees of freedom?
a mathematical concept that involves the amount of freedom you have to substitute various values in an equation
66
Should we indicate the degrees of freedom when listing the t-value?
Yes
67
Should we indicate the degrees of freedom when listing the t-value, and why?
Yes, because the size of the n affects the integrity of the t-distribution
68
What kind of test do you use when you know σ?
use a sample z-test, as well as confidence intervals
69
What kind of test do you use when σ is not known?
the one-sample t-test, independent samples t-test (2), or the dependent samples t-test (2)
70
What should you know about the CI with the one sample t-test?
the CIs about the mean
71
What should you know about the CIs with the independent samples t-test?
the CIs about the difference between means
72
What do you need to know about the dependent samples t-test and CIs?
CIs about the mean DIFFERENCE
73
Hirsch (1997) examined cortical activity associated with production of speech in bilingual subjects. She took 7 subjects who became bilingual relatively late in life (meaning after the age of about 10), and those who were bilingual early on. (There are some fields of psychology in which 10 year olds are considered "over the hill.") The first question concerns whether cortical activity associated with language production takes place in slightly different regions of Broca’s area when speaking two different languages. The following data represent the data collected from 6 subjects who were bilingual early. The dependent variable is the distance between the centroids of those two areas. What test did she use?
the one sample t-test
74
How do you find the one sample t-test in SPSS?
Analyze - Compare Means - one-sample t-test
75
If the test value is 0, what does that mean in an SPSS one sample t-test output?
that the sample mean is being tested against a hypothesized population mean of 0
76
The most interesting part of Hirsch’s study was the fact that she compared early and late bilingual speakers. The data below represent the distance between centroids of activation in Broca’s area when people are thinking in two different languages. What test would be used?
the paired-samples t-test
77
How do I find the t-test for dependent means?
Analyze - Compare means - Paired-sample t-test
78
If pairs are not correlated what does that mean?
the pairs are not similar enough to warrant a paired-samples t-test
79
Everitt (1994) compared two different treatments for anorexia. One group represented a family therapy intervention, while the other was a control group. The dependent variable was the amount of weight gain over the course of the experiment. He had earlier shown that girls in the Family Therapy condition gained weight, but that could simply be because the girls were growing older. Comparison with a Control Group would help to clarify the results. The data shown below have been taken from Everitt and are the weight gains in the two treatment conditions. What test would you use?
an independent t-test
80
How do you find an independent-samples t-test?
Analyze - Compare means - independent-samples t-test
81
What test do you need to do before stating the results of an independent samples t-test?
Levene's test
82
Do you want Levene's test to be significant or not?
No, because a non-significant Levene's test means that the variances can reasonably be assumed to be equal, and therefore we can use the equal-variances results and their degrees of freedom
83
What do we assume about a one-sample t-test in regards to random sampling?
the sample is a random sample from the pop; the test is considered inappropriate to conduct if this is violated, but some argue it is robust to violations
84
What do we assume about a one-sample t-test in regards to independence of observations?
cases within the sample don't influence each other, and it is not robust to violations
85
What do we assume about a one-sample t-test in regards to normality?
the DV is normally distributed in the pop, and robust to moderate violations if the sample size is large
86
It is generally considered that psychological characteristics (like personality) and physical characteristics (like height) are not normally distributed. True or false?
False. It's the opposite
87
R^2 is:
the proportion of common variance, i.e. the proportion of variance shared by two variables
88
R^2 is also
used to compare correlations
89
Is there a restriction in range for r^2?
Yes
90
Is r^2 considered a reliable measurement
No
91
How do you do correlations in SPSS?
Analyze - Correlate - Bivariate - ...
92
What does Pearson's product moment correlation measure?
degree of association between two variables
93
Under what conditions does Pearson's product moment correlation measure the degree of association between two variables?
where both variables:
- are LINEARLY correlated
- are measured on a CONTINUOUS scale
- have a degree of NORMALITY and HOMOGENEITY OF VARIANCE allowing assumptions to be made
94
What is assumed for Pearson's correlation?
bivariate analysis has variables that are:
- linearly correlated
- measured on a continuous scale
- normal, with homogeneity of variance
95
What are Pearson correlation coefficients?
coefficients used when both factors are continuous (i.e. interval or ratio data)
96
What are Point-Biserial correlation coefficients?
coefficients used when one factor is dichotomous (nominal) and one factor is continuous (interval or ratio)
97
What is a Phi coefficient?
the coefficient used when both factors are dichotomous (nominal)
98
What are Spearman correlation coefficients?
coefficients used when both factors are rank or ordinal data (e.g. non-normally distributed data)
99
What are Kendall's tau correlation coefficients?
coefficients used when you have a SMALL data set with a LARGE number of tied ranks; a better estimate of the correlation in the population
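
One way to see how the rank-based Spearman coefficient relates to Pearson's r: Spearman's coefficient can be computed as Pearson's r applied to the ranks of the scores. The sketch below gives tied values their average rank. The data and function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: Y always rises with X, but not in a straight line
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 100]
r_raw = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r_raw, 3), round(rho, 3))
```

Because the ranks of X and Y agree perfectly here, the rank-based coefficient is 1 even though Pearson's r on the raw scores is lower.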
100
How do you determine correlation in SPSS?
Analyze - Correlate - Bivariate Correlations - enter variables - use Pearson Correlation Coefficient - ensure it is a two-tailed test of significance, and that it is flagged for significant correlations
101
What is the degrees of freedom for a Pearson's correlation coefficient?
N - 2 e.g. Assignment 3 has a sample of 50 professors, therefore the correlation results would be: r(48) = X.XX, p = .XXX
102
What line is the correlation coefficient in SPSS output?
on the line Pearson correlation according to the specific variable (will be the same for each intersection between the two same variables, and be 1 for the same variable)
103
What is important to note in SPSS output?
the asterisk to the right of the correlation indicates that the result is statistically significant at the .05 alpha level, two-tailed (*) or the .01 alpha level, two-tailed (**)
104
How do you find out the power of a Pearson's correlation in SPSS?
use the Sample Power function and enter the correlation coefficient in the Population correlation area, enter the ACTUAL number of cases in the N of Cases area, and also enter another group under population as TEST AGAINST THE CONSTANT with a Population Correlation of 0.00. the power will show a standard error, be indicated as 95% lower, 95% upper, and the power in the bar will be indicated under this
105
What is an example of a correlation analysis results in APA format?
In this study, the relationship between a husband's flexibility in his gender role and a wife's degree of marital satisfaction was assessed in a RANDOM SAMPLE of eight married couples from one city. There was a statistically significant, positive relationship, r(df) = .76, p < .001, between the two variables: couples with a high score on one variable tended to have a higher score on the other. The 95% confidence interval for ρ, the population value of the relationship between male gender role flexibility and female marital satisfaction, ranges from .12 to .95 when expressed in Pearson correlation coefficient units. This WIDE confidence interval indicates UNCERTAINTY about the strength of the relationship between gender role flexibility and marital satisfaction in the population.
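
The interval quoted in this example (.12 to .95 for r = .76 with eight couples) is consistent with a confidence interval built with Fisher's z transformation. The card does not say how the interval was obtained, so treating it as a Fisher z interval is an assumption of this sketch.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Confidence interval for rho via Fisher's z transformation (assumed method)."""
    z_r = math.atanh(r)                  # transform r to Fisher's z
    standard_error = 1 / math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower_z = z_r - z_crit * standard_error
    upper_z = z_r + z_crit * standard_error
    return math.tanh(lower_z), math.tanh(upper_z)  # transform back to r units

# Values from the example: r = .76, N = 8 couples
lower, upper = fisher_ci(0.76, 8)
print(round(lower, 2), round(upper, 2))
```

With only eight couples the standard error in z units is large, which is why the interval is so wide.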
106
How do you represent power in a correlation analysis?
E.g. While the power of the correlation analysis is 0.97, the small sample size in the present study makes it impossible to say how strong the relationship between these two variables is in the larger population. The study should be replicated with a larger sample size in order to better determine the strength of the relationship.
107
What is an example of showing what a correlational study doesn't address?
E.g. This is a correlational study, so it does not address whether (1) more gender flexible husbands lead to more satisfied wives, or (2) wives who are more satisfied give their husbands the leeway to be more gender flexible, or (3) that a third variable -- such as age, education level, or socioeconomic status -- could influence both marital satisfaction and gender role flexibility. Future research should attempt to determine the order of the relationship.
108
How to show a correlation matrix?
Table (N = x)
Variable    1     2     3     4     5
1. A        -
2. B       .08    -
3. C       .08   .10    -
4. D                          -
5. E                                -
p ≤ .05, two-tailed
109
What is the formula for power with a correlational study?
italicized d (effect size) = (μ1 - μ0) / σ
e.g. H1: people with psych problems have higher IQs; N = 25, μ0 = 100, μ1 = 105, σ = 15
d = (105 - 100) / 15 = 5/15 = 1/3
δ (delta, Greek d) = d × square root of n
e.g. 1/3 × square root of 25 = 1.67
δ is converted to power using the power table (alpha = 0.05):
δ       power
1.60    0.36
1.70    0.40
therefore, for δ = 1.67, power is 0.38
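
The table lookup in this worked example can be approximated directly. The sketch below computes d and δ for the IQ example and then estimates power from the standard normal distribution, assuming a two-tailed test at alpha = .05. That normal approximation is an assumption about how the power table was built, and the function name is made up.

```python
import math
from statistics import NormalDist

def power_from_delta(delta, alpha=0.05):
    """Approximate two-tailed power for a given delta using the normal curve."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability of landing beyond either critical value when H1 is true
    return (1 - z.cdf(z_crit - delta)) + z.cdf(-z_crit - delta)

# Values from the worked example
mu0, mu1, sigma, n = 100, 105, 15, 25
d = (mu1 - mu0) / sigma      # effect size
delta = d * math.sqrt(n)     # combines effect size and sample size
power = power_from_delta(delta)
print(round(d, 2), round(delta, 2), round(power, 2))
```

The approximation lands between the two tabled values for δ = 1.60 and δ = 1.70, in line with the 0.38 read from the table.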
110
What do you use to find sample power?
Appendix Power: Power as a Function of Greek d/delta and Significance Level/alpha
111
What is the formula for delta?
δ = d × square root of n
112
What is the formula for d?
d = (μ1 - μ0) / σ
113
What is G*Power?
a free, standalone program (separate from SPSS) that calculates power from the effect size, alpha, and sample size
114
What is used in the G*Power calculation?
tails
effect size/d
alpha error probability
sample size
[Test family: t tests; Statistical test: Means: Difference from constant (one sample case); Type of power analysis: Post hoc: Compute achieved power, given alpha, sample size, and effect size]
115
What does a power of .38 mean? H1: people with psych problems have higher IQs; N = 25, μ0 = 100, μ1 = 105, σ = 15
If H0 is false and μ1 is really 105, only 38% of the time can a clinician expect to find a statistically significant difference between her sample mean and that specified by H0. A probability of .38 is rather discouraging because it means that if the true mean really is 105, 62% of the time a clinician will make a Type II error
116
How does sample size affect the probability that the interval will contain the population mean?
sample size has NO effect on the probability the interval will contain the population mean
117
True or false: Larger sample sizes decrease the width of confidence intervals but leave the probability that the interval contains the population mean unchanged.
True
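
A quick numerical sketch of this point, assuming a known σ so that the interval is the sample mean plus or minus z × σ / sqrt(n). The value σ = 15 and the two sample sizes are hypothetical.

```python
import math
from statistics import NormalDist

def ci_half_width(sigma, n, confidence=0.95):
    """Half-width of a z-based confidence interval for the mean."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_crit * sigma / math.sqrt(n)

sigma = 15  # hypothetical known population standard deviation
half_width_small = ci_half_width(sigma, 25)
half_width_large = ci_half_width(sigma, 100)
print(round(half_width_small, 2), round(half_width_large, 2))
```

Quadrupling n halves the width of the interval, while the confidence level (95% here) is fixed by the critical value and does not change with n.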
118
What are the assumptions about data to create confidence intervals?
1. The two populations have the same variance. This assumption is called the assumption of homogeneity of variance.
2. The populations are normally distributed.
3. Each value is sampled independently from each other value.