Quantitative Revision Flashcards

(62 cards)

1
Q

In what circumstances would you perform a simple linear regression test?

A

To determine if there are linear relationships/associations between ratio/interval variables i.e. X and Y

Enable prediction of the values of Y (DV) from the values of X (IV)

2
Q

What assumptions must be met in order for you to use the simple linear regression test with your data?

A

Ratio/interval data

Linear relationship between X and Y

Data are randomly sampled

No outliers amongst data

Residuals must be approximately normally distributed

3
Q

What would be appropriate null and alternative hypotheses for the simple linear regression test?

Non-directional (two-tailed)
Directional (one-tailed)

A

Non-directional (two-tailed)
H0: There is no linear relationship between X and Y.
H1: There is a linear relationship between X and Y.

Directional (one-tailed)
H0: There is no positive linear relationship between X and Y.
H1: There is a positive linear relationship between X and Y.

4
Q

Describe what the results mean for a simple linear regression test./
Interpret the results
Write-up of conclusion and results

A

Standardized coefficients
r: strength of the relationship between X and Y (with +1 or -1 being the strongest)
Beta: predicted effect on Y if X increases by 1 SD, e.g. when X increases by 1 SD, Y is predicted to increase by .85 SDs. Useful where there are multiple IVs (in multiple regression)

r^2: represents the proportion of variability in Y that can be explained by X

Unstandardized coefficients

b: for every 1-unit increase in X, Y is predicted to change by b units
a: only interpret this if it is meaningful/useful to know the value of Y when X = 0

Significance (sig., i.e. p-value): tells us the significance of the
association between X and Y
effect of X on Y
The statistical significance associated with the IV (height in the worked example) is the one that matters
IGNORE the statistical significance associated with the constant

Be sure to answer in terms of the question and its scenario

5
Q

In what circumstances would you perform a Pearson’s (r) correlation test?

A

To determine the (strength and direction of an) association between 2 variables, i.e. X and Y, where neither is categorical but both are continuous outcomes:
ratio/interval (parametric), e.g. weight (kg)
ordinal scale (non-parametric equivalent), e.g. world ranking No. 1, No. 5 etc.

Parametric data

6
Q

What assumptions must be met in order for you to use the Pearson’s (r) correlation test with your data?

A

X and Y must be ratio/interval

Linear association between X and Y (scatterplot)

The association must show homogeneity of variance (scatterplot), where the data points are evenly distributed along the regression line

Data for X and Y should follow a normal distribution (histogram, box plot, normal probability Q-Q plot, skewness and kurtosis z-scores, mean = median)

No outliers (scatterplot, box plot)

Ideally, should only be used with a sample of n >= 100
[For smaller sample sizes, there is a risk that one or two extreme data points 'drive' the association]

7
Q

What would be appropriate null and alternative hypotheses for the Pearson’s (r) correlation test?

Non-directional (two-tailed)
Directional (one-tailed)

A

Non-directional (two-tailed)
H0: There is no association between X and Y.
H1: There is an association between X and Y.

Directional (one-tailed)
H0: There is no positive association between X and Y.
H1: There is a positive association between X and Y.

8
Q

Describe what the results mean for a Pearson’s (r) correlation test./
Interpret the results
Write-up of conclusion and results

A

The results show a significant/non-significant (significance) weak/strong (strength) negative/positive (direction) correlation between X and Y

r: represents the strength of the relationship/association between X and Y

sig. (i.e. p-value): tells us the significance of the association between X and Y

r^2: represents the variability in Y that can be explained by X
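
As a rough illustration of where r and r^2 come from, the sketch below computes Pearson's r from first principles using only the Python standard library. The data values and the function name pearson_r are hypothetical, chosen for illustration; a statistics package such as SPSS would also report the p-value, which is not computed here.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson's r: co-variation of X and Y scaled by their separate variation."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 3))       # strength and direction of the association
print(round(r ** 2, 3))  # proportion of variability in Y explained by X
```

The sign of r gives the direction, its absolute size gives the strength, and squaring it gives r^2.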

9
Q

In what circumstances would you perform a Spearman’s (rho) test?

Spearman’s rho calculates the ranked scores for each variable and considers the association between the ranks

A

To determine the (strength and direction of an) association between the ranks of X and Y, where X and Y are both non-categorical (i.e. not nominal)

Non-parametric data i.e. parametric assumptions have been violated/breached
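
The idea of correlating ranks rather than raw scores can be sketched as follows. This is a minimal standard-library illustration, not the procedure of any particular package: the helper names (average_ranks, pearson_r, spearman_rho) and the data are hypothetical, and tied values are given the mean of the ranks they occupy.

```python
from math import sqrt
from statistics import mean

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

def pearson_r(x, y):
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho = Pearson's r applied to the ranks of X and Y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: monotonic but clearly not linear
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 200]

rho = spearman_rho(x, y)
print(rho)               # perfect monotonic association between the ranks
print(pearson_r(x, y))   # lower, because the raw relationship is not linear
```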

10
Q

What assumptions must be met in order for you to use the Spearman’s (rho) test with your data?

A

X and Y must be at least ordinal (ordinal, interval or ratio)

Association between the ranks of X and Y does not need to be linear but it must be monotonic (i.e. does not change direction) (scatterplot)

The association must show homogeneity of variance (scatterplot), where the data points are evenly distributed along the regression line

Only appropriate where n (sample size) is at least 20

11
Q

What would be appropriate null and alternative hypotheses for the Spearman’s (rho) test?

A

Non-directional (two-tailed)
H0: There is no association between the ranks of X and Y.
H1: There is an association between the ranks of X and Y.

Directional (one-tailed)
H0: There is no positive association between the ranks of X and Y.
H1: There is a positive association between the ranks of X and Y.

12
Q

Describe what the results mean for a Spearman’s (rho) test.

A

The results show a significant strong positive correlation between the ranks of X and Y

13
Q

In what circumstances would you perform a Kendall’s (tau) test?

A

To determine the (strength and direction of an) association between the ranks of X and Y, where X and Y are both non-categorical (i.e. not nominal)

Non-parametric data (data is not normally distributed) i.e. parametric assumptions have been violated/breached

Useful with small data set n < 20

Can deal with a large number of tied ranks in the data
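
One way to see how Kendall's tau handles ties is to count pairs of observations directly. The sketch below assumes the tau-b variant (the one that adjusts for ties); the function name and data are hypothetical, and no p-value is computed.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant, discordant and tied pairs."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                 # tied on both variables: ignored
        elif dx == 0:
            ties_x += 1              # tied on X only
        elif dy == 0:
            ties_y += 1              # tied on Y only
        elif dx * dy > 0:
            concordant += 1          # both variables order the pair the same way
        else:
            discordant += 1          # the variables order the pair in opposite ways
    denom = sqrt((concordant + discordant + ties_x) *
                 (concordant + discordant + ties_y))
    return (concordant - discordant) / denom

# Hypothetical small data set (n < 20)
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_b(x, y)
print(round(tau, 3))
```

Here 5 pairs are concordant and 1 is discordant, so tau = (5 - 1) / 6.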

14
Q

What assumptions must be met in order for you to use the Kendall’s (tau) test with your data?

A

Both variables must be at least ordinal (ordinal, interval or ratio)

Association between the ranks of X and Y does not need to be linear but it must be monotonic (i.e. does not change direction) (scatterplot)

The association must show homogeneity of variance (scatterplot), where the data points are evenly distributed along the regression line

Only useful where n < 20

15
Q

What would be appropriate null and alternative hypotheses for the Kendall’s (tau) test?

A

Non-directional (two-tailed)
H0: There is no association between the ranks of X and Y.
H1: There is an association between the ranks of X and Y.

Directional (one-tailed)
H0: There is no positive association between the ranks of X and Y.
H1: There is a positive association between the ranks of X and Y.

16
Q

Describe what the results mean for a Kendall’s (tau) test.

A

The results show a non-significant weak negative correlation between the ranks of X and Y

17
Q

In what circumstances would you perform a multidimensional Chi-Square test?

A

Relationship/association between variables (Test of association)

Variables are both categorical i.e. nominal

Independent research design (no subject/participant appears in more than one group)

[Compare the observed and expected counts i.e. Test for differences where samples are independent]

18
Q

What assumptions must be met in order for you to use the multidimensional Chi-Square test with your data?

A

Randomly sampled

Variables must be categorical i.e. nominal

Independent measures

Counts (actual numbers), not percentages

No calculated expected value < 1

No more than 20% of expected values < 5

Solution if the expected-value conditions are not met = collect more data, collapse categories, or use an exact test (SPSS)
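
The two expected-value conditions can be checked by hand or with a few lines of code. The sketch below is a standard-library illustration with hypothetical function names and hypothetical counts; it computes each cell's expected count as (row total x column total) / n and then applies the two rules.

```python
def expected_counts(observed):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def check_expected(observed):
    """Return (no_cell_below_1, at_most_20_percent_below_5)."""
    cells = [e for row in expected_counts(observed) for e in row]
    no_cell_below_1 = min(cells) >= 1
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return no_cell_below_1, share_below_5 <= 0.20

# Hypothetical 3 x 2 table of counts (rows = groups, columns = categories)
small_table = [[12, 3], [9, 6], [4, 1]]
print(check_expected(small_table))
```

For this small table no expected value is below 1, but 4 of the 6 expected values are below 5, so the second condition fails and one of the listed solutions would be needed.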

19
Q

What would be appropriate null and alternative hypotheses for the multidimensional Chi-Square test?

Research question: Does the proportion of athletes who are normal weight or overweight differ by sport?

A

(H0): In the population, the three sports do not differ in the proportions who are normal weight and overweight.

(H1): In the population, the three sports do differ in the proportions who are normal weight and overweight.

20
Q

Describe what the results mean for a multidimensional Chi-Square test./
Interpret the results
Write-up of conclusion and results

A

Method
A Chi-square test was performed to test the H0 that the 3 sports do not differ in the proportions who are normal and overweight

Results
There was a difference between the proportion of those athletes who are normal and those who are overweight in the 3 sports (Field, Netball and Rowing), Chi-Square statistic = … (df = …, n = …), p = …

Basically:
Method: Test was performed to test the H0
Results: 
Conclusion/result
Chi-Square statistic
df
n
p-value
21
Q

In what circumstances would you perform a McNemar’s (Chi-Square) test?

A

Relationship/association between variables (Test of association)

Variables are both nominal

Repeated measures design with two dichotomous variables

[Test for differences where samples are paired]

22
Q

What assumptions must be met in order for you to use the McNemar’s test with your data?

A

Randomly sampled

Dependent/repeated measures

DV and IV must be
dichotomous
of only 2 categories each

Variables must be categorical i.e. nominal

Counts (actual numbers), not percentages

No calculated expected value < 1

No more than 20% of expected values < 5

Solution if the expected-value conditions are not met = collect more data, collapse categories, or use an exact test (SPSS)

23
Q

What would be appropriate null and alternative hypotheses for the McNemar’s test?

Research question: To investigate the number of correct identifications of the writer’s sex by their handwriting style

49 Psychology students were asked to write using their normal handwriting and then asked to write imitating the handwriting of the opposite sex
Students recruited a participant to judge the handwriting of both samples and identify the sex (repeated measures)

IV: handwriting style
DV: participant’s judgement of handwriter’s sex

A

H0: There will be no difference in the number of correct identifications of the writer’s sex from the 2 handwriting samples.

H1: There will be a difference in the number of correct identifications of the writer’s sex from the two handwriting samples.

24
Q

Describe what the results mean for a McNemar’s test.

A

Method
A McNemar’s Chi-Square test was performed to test the H0 that there will be no difference in the number of correct identifications of the writer’s sex from the two handwriting samples

Results
There is a significant difference in the number of correct judgements between the two conditions of handwriting style (n = …, exact p = …)
Of the 49 participants, ‘…’ correctly identified the handwriter’s sex for the normal handwriting. Of the ‘…’ who were incorrect for the normal handwriting, ‘…’ correctly identified the handwriter’s sex for the opposite-sex handwriting
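
For orientation, McNemar's test in its standard form uses only the discordant pairs, i.e. the participants whose judgement changed between the two conditions. The sketch below is an illustration under that assumption: b and c are hypothetical counts (not the handwriting study's data), the function name is made up, and the exact p-value is taken from a two-sided binomial test with probability 0.5.

```python
from math import comb

def mcnemar(b, c):
    """b, c: counts of the two kinds of discordant pairs.

    b = correct in condition 1 only, c = correct in condition 2 only.
    Returns (chi-square statistic without continuity correction, exact two-sided p).
    """
    n = b + c
    chi2 = (b - c) ** 2 / n
    # Exact test: under H0 each discordant pair is equally likely to fall either way
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    exact_p = min(1.0, 2 * tail)
    return chi2, exact_p

# Hypothetical discordant counts, for illustration only
chi2, exact_p = mcnemar(b=10, c=2)
print(round(chi2, 3), round(exact_p, 4))
```

Pairs that were judged the same way in both conditions do not enter the calculation at all.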

25
Q

In what circumstances would you perform an independent samples design t-test?

A

Parametric data

Independent (i.e. different) data/groups/samples

To compare means: compare one sample mean to another sample mean, i.e. to compare differences between groups (mean)
e.g. intervention and control group --> each study participant is in one group only

[Independent data: data that comes from different (independent) groups of people]
26
Q

What assumptions must be met in order for you to use the independent t-test with your data?

A

Dependent variable is ratio/interval

Measurements in condition 1 are independent of measurements in condition 2

For n <= 30 --> distribution of DV data for each group (X and Y) should not be badly skewed, i.e. should follow a normal distribution (can use the CLT to help explain)

Homogeneity of variance: the variance of the DV data for the two groups should not be very different
A problematic difference in variances is indicated by a significant Levene’s Test
If significant, interpret the p-value associated with ‘equal variances not assumed’
If non-significant, interpret the p-value associated with ‘equal variances assumed’
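
The two rows of output ('equal variances assumed' and 'equal variances not assumed') correspond to two ways of computing the t statistic. The sketch below shows both using the standard textbook formulas (pooled variance and Welch's version); the function name and data are hypothetical, and the p-value is left to a statistics package because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(g1, g2, equal_variances=True):
    """Return (t, df) for two independent groups."""
    n1, n2 = len(g1), len(g2)
    v1, v2 = variance(g1), variance(g2)   # sample variances
    diff = mean(g1) - mean(g2)
    if equal_variances:
        # Pooled variance: used when Levene's test is non-significant
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Welch's version: used when Levene's test is significant
        se = sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return diff / se, df

# Hypothetical scores for an intervention and a control group
intervention = [5, 6, 7, 8, 9]
control = [1, 2, 3, 4, 5]
t_pooled, df_pooled = independent_t(intervention, control)
t_welch, df_welch = independent_t(intervention, control, equal_variances=False)
print(t_pooled, df_pooled, t_welch, df_welch)
```

With equal group sizes and equal variances, as in this made-up example, the two versions agree; they diverge as the variances become more different.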
27
Q

What would be appropriate null and alternative hypotheses for the independent t-test?

Two-tailed
One-tailed

A

Two-tailed
H0: There is no difference between the population means of X and Y.
H1: There is a difference between the population means of X and Y.

One-tailed
H0: The population mean of X is not greater than the population mean of Y.
H1: The population mean of X is greater than the population mean of Y.
28
Q

Describe what the results mean for an independent t-test.

A

p <= 0.05 (or 0.01, depending on the chosen alpha) --> There is a significant difference between the population means of X and Y (two-tailed)
or
The population mean of X is significantly greater than the population mean of Y (one-tailed)
29
Q

In what circumstances would you perform a paired design t-test?

A

Parametric data

Dependent/paired (i.e. same) data/groups/samples

To compare means: compare one sample mean to another sample mean, i.e. to compare differences within groups (mean)
e.g. pre-test post-test study
Data collected from the same individual at different points in time/under different conditions
Compare differences in outcome between time 1 & 2 or condition 1 & 2 (mean)

[Dependent/paired data: data that comes from one group of individuals]
30
Q

What assumptions must be met in order for you to use the paired t-test with your data?

A

Dependent variable is ratio/interval

Observations not independent: each measurement in Condition/Time 1 has a match in Condition/Time 2

For n <= 30 --> distribution of differences between X and Y (i.e. X - Y) should not be badly skewed, i.e. should follow a normal distribution (can use the CLT to help explain)

Homogeneity of variance
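
Because the paired t-test works on the differences within each pair, it reduces to a one-sample test on those differences. A minimal sketch using the standard formula t = mean(d) / (SD(d) / sqrt(n)); the function name and the before/after scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for paired measurements on the same individuals."""
    d = [a - b for a, b in zip(after, before)]   # one difference per person
    n = len(d)
    t = mean(d) / (stdev(d) / sqrt(n))
    return t, n - 1

# Hypothetical pre-test and post-test scores for 5 people
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 16]
t, df = paired_t(before, after)
print(round(t, 3), df)
```

Only the distribution of the differences matters here, which is why the normality check is applied to X - Y rather than to X and Y separately.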
31
Q

What would be appropriate null and alternative hypotheses for the paired t-test?

Two-tailed
One-tailed

A

Two-tailed
H0: No difference in the means before and after.
H1: A difference in the means before and after.
or
H0: Mean difference = 0.
H1: Mean difference is not = 0.

One-tailed
H0: Mean after <= mean before.
H1: Mean after > mean before.
or
H0: Mean difference is not positive.
H1: Mean difference is positive.
32
Q

Describe what the results mean for a paired t-test.

A

p <= 0.05 (or 0.01, depending on the chosen alpha) --> Significant difference between the means before and after (two-tailed)
or
Mean after is significantly greater than mean before (one-tailed)
33
Q

In what circumstances would you perform a Mann Whitney U test?

A

Non-parametric data:
Ordinal scale DV
Ratio/interval DV that does not meet parametric assumptions (sample sizes are small and normality is questionable; data contain outliers that, because of their magnitude, distort the mean values and affect the outcome of the comparison)

Independent (i.e. different) data/groups/samples

To compare mean ranks/medians: compare one sample median to another sample median, i.e. to compare differences between groups (median)
e.g. intervention and control group --> each study participant is in one group only

[To test the H0 that 2 samples come from the same population (i.e. have the same median) or, alternatively, whether observations in one sample tend to be greater than observations in the other]
34
Q

What assumptions must be met in order for you to use the MWU test with your data?

A

Independent data/samples

Data distributions of X and Y are the same shape

Not too many ties in ranks of data

[Data values are assigned ranks relative to both samples combined]
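
Ranking 'relative to both samples combined' can be sketched as follows. This uses the usual rank-sum form of the U statistic, U1 = R1 - n1(n1 + 1)/2; the helper names and data are hypothetical, ties receive average ranks, and no p-value is computed.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

def mann_whitney_u(x, y):
    """Return (U for sample x, U for sample y)."""
    ranks = average_ranks(list(x) + list(y))   # rank the pooled data
    n1, n2 = len(x), len(y)
    r1 = sum(ranks[:n1])                       # rank sum of sample x
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return u1, u2

# Hypothetical scores for two independent groups
group_x = [3, 5, 8]
group_y = [1, 2, 4, 6]
u_x, u_y = mann_whitney_u(group_x, group_y)
print(u_x, u_y)
```

U for a sample equals the number of (x, y) pairs in which that sample's value is the larger one, so here 9 of the 12 possible pairs favour group_x.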
35
Q

What would be appropriate null and alternative hypotheses for the MWU test?

Two-tailed
One-tailed

A

Two-tailed
H0: There is no difference between the population medians of X and Y.
H1: There is a difference between the population medians of X and Y.

One-tailed
H0: The population median of X is not greater than the population median of Y.
H1: The population median of X is greater than the population median of Y.
36
Q

Describe what the results mean for a MWU test.

A

p <= 0.05 (or 0.01, depending on the chosen alpha) --> There is a significant difference between the population medians of X and Y (two-tailed)
or
The population median of X is significantly greater than the population median of Y (one-tailed)
37
Q

In what circumstances would you perform a Wilcoxon signed rank test?

[A Wilcoxon signed rank test:
Measures the difference within each pair of values
Compares paired data
Is used when you cannot justify a normality assumption for the differences
Very simple --> ranks the sizes of the differences, sums the ranks of those that are positive (+) and those that are negative (-), and makes a decision based on these sums]

A

Non-parametric data

Dependent/paired (i.e. same) data/groups/samples

To compare medians: compare one sample median to another sample median, i.e. to compare differences within groups (median)
e.g. pre-test post-test study
Data collected from the same individual at different points in time/under different conditions
Compare differences in the ranks of the outcome between time 1 & 2 or condition 1 & 2 (median)
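
The signed rank procedure, as it is usually defined, can be sketched in a few lines: take the difference within each pair, drop zero differences, rank the absolute differences, then sum the ranks separately for positive and negative differences. The helper names and data below are hypothetical, and the step from these sums to a p-value is left to a statistics package.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

def signed_rank_sums(before, after):
    """Return (sum of ranks of positive differences, sum for negative ones)."""
    d = [a - b for a, b in zip(after, before) if a != b]   # zero differences dropped
    ranks = average_ranks([abs(v) for v in d])
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    w_minus = sum(r for r, v in zip(ranks, d) if v < 0)
    return w_plus, w_minus

# Hypothetical paired scores; the last pair shows no change
before = [10, 12, 9, 11, 13, 8]
after = [12, 11, 12, 15, 8, 8]
w_plus, w_minus = signed_rank_sums(before, after)
print(w_plus, w_minus)
```

If H0 is true the two sums should be similar; a large imbalance suggests a systematic shift between the two conditions.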
38
Q

What assumptions must be met in order for you to use the Wilcoxon test with your data?

A

Paired/dependent data/samples

Non-categorical data
39
Q

What would be appropriate null and alternative hypotheses for the Wilcoxon test?

A

Two-tailed
H0: No difference in the medians before and after.
H1: A difference in the medians before and after.
or
H0: Median difference = 0.
H1: Median difference is not = 0.

One-tailed
H0: Median after <= median before.
H1: Median after > median before.
or
H0: Median difference is not positive.
H1: Median difference is positive.
40
Q

Describe what the results mean for a Wilcoxon test.

A

p <= 0.05 (or 0.01, depending on the chosen alpha) --> Significant difference between the medians before and after (two-tailed)
or
Median after is significantly greater than median before (one-tailed)
41
Q

What is a type I error?

A

False positive

Incorrectly rejecting the H0 when it is actually true

Saying that there is a difference when in reality there is no difference
e.g. telling a man that he is pregnant
42
Q

What is a type II error?

A

False negative

Incorrectly failing to reject (i.e. accepting) the H0 when it is actually false

Saying that there is no difference when in reality there is a difference
e.g. telling a pregnant woman that she is not pregnant (when it is so obvious that she is!)
43
Q

What is the common structure of all statistical tests?/What are the 7 steps of hypothesis testing?

A

Common structure
Set H0 and H1
Establish alpha, i.e. level of significance
Determine p-value
Accept or reject H0

OR

7 steps
Define study question and choose an inferential test
Set hypotheses
Select/establish level of significance, i.e. alpha = 0.05
EDA and assess test assumptions to see if they are met/satisfied
Go ahead and run the test
Obtain p-value
Decide whether to reject or accept H0 + conclusion, interpretation and write-up of results
44
Q

What is the benefit of using a paired t-test over an independent t-test?

A

Independent t-test gives rise to more random error because the control group might, by chance, be very different from the treatment group

Variation is limited in a paired t-test as each person is their own control
45
Q

What are residuals?

A

Residual = actual value of Y - predicted value of Y

Difference between the actual value of Y (points) and the predicted value of Y (line)

An observable estimate of the unobservable statistical error
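
A small sketch of the calculation, using the common convention residual = observed Y - predicted Y (the opposite sign convention only flips the signs). The line's intercept and slope and the data points are hypothetical values chosen for illustration.

```python
def residuals(x, y, a, b):
    """Observed Y minus the Y predicted by the line Y = a + bX."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical line (a = 0, b = 2) and hypothetical data points
x = [1, 2, 3]
y = [2.5, 4.0, 7.5]
res = residuals(x, y, a=0.0, b=2.0)
print(res)   # positive = point above the line, negative = point below it
```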
46
Q

What is the simple linear regression equation?

A

Y = a + bX, i.e. DV = constant + coefficient x (IV)

a: constant or intercept
b: coefficient or slope of the line associated with this independent variable

As X increases by 1 unit, Y increases by b units
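
To make a and b concrete, the sketch below estimates them by ordinary least squares, which is the usual way the regression line is fitted. The function name and data are hypothetical; statistics packages report the same two values as the unstandardized coefficients.

```python
from statistics import mean

def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx            # slope: change in Y per 1-unit change in X
    a = y_bar - b * x_bar    # intercept: predicted Y when X = 0
    return a, b

# Hypothetical data lying exactly on Y = 1 + 2X
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
a, b = fit_line(x, y)
print(a, b)
print(a + b * 6)   # predicted Y for a new X value of 6
```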
47
Q

What does r^2 = 0.8 mean?

A

80% of variability in Y is explained by X

*Note: In an exam, interpret the Adjusted R Square (if it is given) as it is more accurate
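
The 'proportion of variability explained' reading follows from the usual definition r^2 = 1 - SS_residual / SS_total. A minimal sketch with hypothetical observed and predicted values (the predictions are assumed to come from a fitted regression line):

```python
from statistics import mean

def r_squared(y, y_pred):
    """1 - (unexplained variation / total variation in Y)."""
    y_bar = mean(y)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    return 1 - ss_residual / ss_total

# Hypothetical observed values and the values predicted by a fitted line
y = [2, 4, 5, 4, 5]
y_pred = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(y, y_pred)
print(round(r2, 2))   # share of the variability in Y explained by X
```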
48
Q

What is the assumption that all inferential tests make about the sample?

A

The sample is randomly sampled from the population
49
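Q

What is heteroscedasticity?

A

No homogeneity of variance: the spread of the data points around the regression line is not constant

Data points fan out and are not evenly distributed along the regression line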
50
Q

How do we obtain the p-value for a one-tailed (directional) test from the p-value for a two-tailed (non-directional) test?

A

p-value for one-tailed test = half the p-value for two-tailed test (provided the observed effect is in the predicted direction)
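
The halving rule can be written as a tiny helper. This sketch assumes a symmetric test statistic (such as t or r); the function name is hypothetical, and the second branch covers the case where the observed effect runs against the predicted direction.

```python
def one_tailed_p(two_tailed_p, in_predicted_direction):
    """Convert a two-tailed p-value to a one-tailed one (symmetric statistic assumed)."""
    half = two_tailed_p / 2
    # If the effect is in the wrong direction, the one-tailed p is large, not small
    return half if in_predicted_direction else 1 - half

print(one_tailed_p(0.08, True))    # effect in the predicted direction
print(one_tailed_p(0.08, False))   # effect opposite to the prediction
```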
51
Q

What is the difference between one-tailed and two-tailed tests with regard to rejecting the H0?

A

Two-tailed tests are non-directional. We would reject H0 if we found a positive or negative association or difference etc.

One-tailed tests are directional. We only reject H0 if the association or difference etc. is in the direction that we specified/expected
52
Q

What does the multidimensional Chi-Square test compare?

A

Compares observed frequencies in our sample with the frequencies we would expect if there were no relationship at all between the two variables in the population that the sample was drawn from
53
Q

What is the formula for Chi-square?/How do we obtain a Chi-square statistic?

A

Chi-Square = SUM((O - E)^2 / E)

O: observed count
E: expected count

For each cell, apply the formula (O - E)^2 / E
Then sum up all the cells to get the Chi-Square statistic
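
The cell-by-cell calculation can be sketched as below. The expected count for each cell is taken as (row total x column total) / n, which is the standard choice for a test of association; the function name and the 2 x 2 table of counts are hypothetical.

```python
def chi_square(observed):
    """Sum of (O - E)^2 / E over every cell of a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count for this cell
            chi2 += (o - e) ** 2 / e
    return chi2

# Hypothetical 2 x 2 table of counts (marginal totals are NOT included)
observed = [[30, 10],
            [20, 40]]
chi2 = chi_square(observed)
print(round(chi2, 3))
```

For this table the expected counts are 20, 20, 30 and 30, giving 5 + 5 + 3.33 + 3.33.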
54
Q

What is (the concept of) degrees of freedom? How do we calculate it?

A

Concept: the more categories there are in the IV and DV, the more chance there is of the analysis being affected by sampling error

Calculation: (No. of categories in the row variable minus 1) x (No. of categories in the column variable minus 1), i.e. (rows - 1)(columns - 1)

EXCLUDE marginal cells (row and column totals)!
55
Q

From the study done by Chris Gratton and Ian Jones on Research methods for Sports Studies (2008), what are the 4 purposes of data analysis?

A

Describe
Compare
Examine similarities
Examine differences
56
Q

What are the aims of Descriptive statistics?

A

Check for errors and outliers
Describe and summarise the data
Examine the spread of the data
Ensure appropriate analysis: is the data parametric or non-parametric?
57
Q

Ways of summarising interval/ratio data

A

Measures of Central Tendency
mean
median
mode

Measures of Dispersion
range
SD
variance

Normal curve, skewness, kurtosis
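
These summary measures are all available in Python's standard statistics module. The sketch below uses a hypothetical data set; variance and stdev here are the sample versions (dividing by n - 1).

```python
from statistics import mean, median, mode, stdev, variance

# Hypothetical interval/ratio data
data = [2, 4, 4, 5, 7, 9]

summary = {
    "mean": mean(data),
    "median": median(data),
    "mode": mode(data),
    "range": max(data) - min(data),
    "variance": variance(data),   # sample variance
    "sd": stdev(data),            # sample standard deviation
}
for name, value in summary.items():
    print(name, round(value, 3))
```

A mean that sits well away from the median is a quick hint that the data may be skewed.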
58
Q

What do parametric tests assume about the characteristics of the sample in terms of its distribution?

A

Data is drawn from a normally distributed population (i.e. data is not skewed)

Groups have the same variance or spread on the variables being measured
59
Q

What assumptions do non-parametric tests make about the characteristics of the sample in terms of its distribution?

A

Do not make any assumption that the data follow a particular (e.g. normal) distribution
60
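Q

What is p-value?

A

Probability of obtaining a difference (or association) at least as large as the one found if H0 were true, i.e. if only chance were operating

Note: it is not the probability that H0 is true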
61
Q

When do we use non-parametric tests?

A

When assumptions of parametric tests are not met (i.e. breached):
level of measurement (e.g. interval or ratio data)
normal distribution
homogeneity of variances across groups

Not always possible to correct for problems with the distribution of a data set (i.e. data transformation) --> have to use non-parametric tests:
Make fewer assumptions about the type of data on which they can be used
Many of these tests use “ranked” data
62
Q

What is alpha/level of significance?

A

The chance of making a Type I error that we are willing to tolerate

With an alpha level of .05 (5%), we decide to reject H0 and accept H1 when the p-value is no more than .05 --> if H0 is actually true, there is up to a 5% chance that we are wrong in concluding that there is a difference when there actually isn't (Type I error, false positive)