SPSS Flashcards

(100 cards)

1
Q

What is a continuous variable?

A

Arising from measurements (e.g. height)

2
Q

What is a discrete variable?

A

Arising from counting (e.g. number of books on a bookshelf)

3
Q

What is a nominal/categorical variable?

A

A variable whose categories have no natural order (e.g. eye colour)

4
Q

What is an ordinal variable?

A

A variable whose categories have a natural order (e.g. mild, moderate, severe)

5
Q

What is simple random sampling?

A

Probability sampling: each member of the population has an equal chance of being selected.

6
Q

What is systematic sampling?

A

Probability sampling: every nth subject from a population list is chosen.

7
Q

What is stratified random sampling?

A

Probability sampling: the population is split into groups of similar individuals (strata), and a random sample is drawn from each group.
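The simple random, systematic and stratified schemes from the last three cards can be sketched in a few lines. This is only an illustration: the population of ID numbers, the odd/even strata and the function names are assumptions made for the example, not part of the deck.

```python
import random

def simple_random_sample(population, k, rng):
    """Every member has the same chance of being selected."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Pick a random start, then take every `step`-th member of the list."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(population, stratum_of, k_per_stratum, rng):
    """Split members into strata, then sample at random within each stratum."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

rng = random.Random(42)              # fixed seed so the sketch is repeatable
population = list(range(1, 101))     # hypothetical population list: IDs 1..100

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(
    population, lambda pid: "odd" if pid % 2 else "even", 5, rng)
```

Drawing the same number from every stratum regardless of its size, as here, is one simple form of disproportionate sampling when the strata are unequal.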

8
Q

What is disproportionate sampling?

A

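Probability sampling: used when the strata in the population are of substantially unequal size; strata are sampled at different rates rather than in proportion to their size (e.g. small strata are oversampled).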

9
Q

What is cluster sampling?

A

Probability sampling: successive random sampling of a series of units in a population (e.g. randomly selecting schools, then classes within those schools).

10
Q

What is convenience sampling?

A

Non-probability sampling: samples are based on availability

11
Q

What is quota sampling?

A

Researcher guides sampling process until participant quota is met (e.g. volunteers called for until equal quota of males/females is met)

12
Q

What is purposive sampling?

A

Subjects are hand picked based on certain criteria

13
Q

What is snowball sampling?

A

Used when desired characteristics are rare. Initial subjects refer others with similar characteristics

14
Q

What happens to your accuracy if you quadruple your sample size?

A

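It doubles: the standard error is proportional to 1/√n, so quadrupling the sample size halves the standard error.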
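A quick numerical check of this card, as a sketch only. The standard deviation of 12 and the sample sizes 25 and 100 are made-up values; the relationship SE = SD/√n is the standard formula for the standard error of the mean.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)

sd = 12.0                          # hypothetical standard deviation
se_n = standard_error(sd, 25)      # original sample size
se_4n = standard_error(sd, 100)    # sample size quadrupled
ratio = se_n / se_4n               # how many times smaller the error became
```

Here `ratio` comes out at 2: four times the data buys only twice the precision.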

15
Q

Name 7 types of Experimental designs

A

RCT (randomised controlled trial), blind study, crossover design (each subject acts as their own control, but the order of treatments is randomised), factorial design (several factors compared at once), outcome variables, quasi-experimental design (often used when the independent variable in question is an innate characteristic of the participants involved), single-subject study

16
Q

Name 9 types of observational designs

A

Retrospective, prospective, surveys and polls, observation, longitudinal cohort studies, case-control study, cross-sectional study, case reports, questionnaires.

17
Q

What is dichotomous survey questioning?

A

Two possible answers, e.g. yes/no or agree/disagree

18
Q

What is a Likert scale in surveys?

A

An ordered set of response categories, usually 3-5 (e.g. from "strongly disagree" to "strongly agree")

19
Q

What is a visual analogue scale in a survey?

A

Results measured along a continuum

20
Q

Why can histograms be subjective?

A

Their appearance depends on the number of bins chosen

21
Q

What is the interquartile range?

A
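The range between the 1st and 3rd quartiles (IQR = Q3 - Q1), which contains the middle 50% of the data.
It sits inside the five-number summary:
Minimum
1st quartile (Q1) = 0.25 quantile
Median (Q2) = 0.5 quantile
3rd quartile (Q3) = 0.75 quantile
Maximum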
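As a rough illustration, the quartiles and interquartile range of a small made-up sample can be computed with Python's standard library. Quartile conventions differ between packages, so SPSS may report slightly different cut points for the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical sample

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                         # spread of the middle 50% of the data
five_number_summary = (min(data), q1, q2, q3, max(data))
```

A boxplot draws exactly these five numbers: the box spans Q1 to Q3 with a line at the median.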
22
Q

Which chart/graph best displays the interquartile range?

A

Boxplots

23
Q

What type of data best suits a bar chart?

A

Categorical variables

24
Q

What is the standard deviation?

A

A measure of how far values typically deviate from the mean (the square root of the variance)

25
What is the formula for degrees of freedom?
n - 1 (for a single sample of n observations)
26
What is the standard error?
How far the sample mean is expected to deviate from the population mean: the standard deviation of the sampling distribution of the mean, estimated by SD/√n
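The standard deviation, degrees of freedom and standard error cards fit together in one small calculation. The five measurements below are invented for illustration.

```python
import math
import statistics

sample = [4.0, 7.0, 6.0, 5.0, 8.0]   # hypothetical measurements
n = len(sample)
mean = statistics.mean(sample)

# The sample variance divides by the degrees of freedom, n - 1.
variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
sd = math.sqrt(variance)              # matches statistics.stdev(sample)
se = sd / math.sqrt(n)                # standard error of the mean
```

The standard deviation describes the spread of individual values, while the standard error describes the uncertainty in the sample mean and shrinks as n grows.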
27
What is a parameter?
A numerical characteristic of a population
28
What is a statistic?
A numerical characteristic of a sample (e.g. mean, SD)
29
What are the confidence intervals associated with a normal distribution?
About 68% of values lie within 1 SD of the mean; about 95% within 2 SD of the mean; about 99.7% within 3 SD of the mean
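These percentages can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is just an illustrative choice.

```python
from statistics import NormalDist

z = NormalDist()   # standard normal: mean 0, SD 1

def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return z.cdf(k) - z.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}   # roughly 0.683, 0.954, 0.997
z_95 = z.inv_cdf(0.975)   # about 1.96, the exact multiplier for a 95% interval
```

The "2 SD" figure is a rounding of 1.96, which is why 95% confidence intervals are usually built as mean ± 1.96 standard errors.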
30
What is the central limit theorem?
The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size gets bigger, no matter what the shape of the population distribution
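A small simulation can make the theorem concrete. This is a sketch under stated assumptions: the population is taken to be exponential (strongly right-skewed, with mean 1 and SD 1), and the sample size of 50 and the 2000 repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(1)   # fixed seed for repeatability

def sample_mean(n):
    """Mean of n draws from a right-skewed exponential population."""
    return statistics.mean(rng.expovariate(1.0) for _ in range(n))

n = 50
means = [sample_mean(n) for _ in range(2000)]

centre = statistics.mean(means)    # close to the population mean, 1
spread = statistics.stdev(means)   # close to 1 / sqrt(50), about 0.14
```

A histogram of `means` would look roughly bell-shaped even though the individual observations are heavily skewed.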
31
What is a Type I Error?
Null hypothesis is rejected when it is actually true (False positive)
32
What is the relationship between Type I Error and the P-Value?
The probability of making a Type I error (when the null hypothesis is true) is precisely the significance level, i.e. the threshold we compare the p-value against (e.g. 0.05)
33
What is a Type II Error?
Where we don't reject the null hypothesis when we should have (false negative)
34
What is Power?
The probability of detecting an effect when there is indeed an effect
35
How can power be improved?
Increasing effect size, decreasing variability, increasing sample size, raising the significance threshold (but this increases the Type I error rate)
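The direction of each of these effects can be seen in a simplified power calculation. The sketch below assumes a two-sided one-sample z-test with a known SD, which is a textbook approximation rather than anything SPSS-specific; the function name and the numbers are illustrative only.

```python
from statistics import NormalDist

def power_z_test(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    effect_size is the true shift in the mean, in SD units, so reducing
    variability increases the effect size for the same raw difference.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)      # critical value for the chosen alpha
    shift = effect_size * n ** 0.5       # how far the test statistic is shifted
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)

base = power_z_test(0.5, 20)
bigger_sample = power_z_test(0.5, 40)
bigger_effect = power_z_test(0.8, 20)
looser_alpha = power_z_test(0.5, 20, alpha=0.10)
```

Each of the three variations gives higher power than `base`. With an effect size of zero the function returns alpha itself, which is the Type I error rate.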
36
What is a Parametric test?
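A test that assumes the data follow a particular distribution (usually normal) and tests some parameter of your population, such as the mean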
37
What is a Non-Parametric Test?
Looks at some comparison between groups, such as comparing the "ranks" of values instead of the values themselves.
38
What are the three Parametric test assumptions?
1. Normality: data have a normal distribution
2. Homogeneity of variances: data from multiple groups have the same variance
3. Independence: data are independent
39
What does a p-Value of <0.05 for Levene's test tell you?
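That the variances differ significantly between groups, so the homogeneity of variances assumption is violated. Use a test that does not assume equal variances (e.g. Welch's ANOVA) or a non-parametric alternative.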
40
What does a p-Value of >0.05 for Levene's test tell you?
That there is no significant evidence that the variances differ, so equal variances can be assumed and the parametric test can go ahead.
41
What is the purpose of a t-test?
To compare the means between two independent groups on the same, continuous, dependent variable.
42
What is the NULL hypothesis for a t test?
That the difference between the two means is zero
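To see how the test statistic relates to this null hypothesis, here is a hand-rolled version of the independent-samples t statistic with a pooled variance. The two groups of scores are invented, and the function name is an assumption for the example; in practice SPSS reports t, df and the p-value directly.

```python
import math
import statistics

def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    pooled_var = (((na - 1) * statistics.variance(a)
                   + (nb - 1) * statistics.variance(b)) / (na + nb - 2))
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, na + nb - 2

group_a = [1, 2, 3, 4, 5]   # hypothetical scores
group_b = [3, 4, 5, 6, 7]
t, df = independent_t(group_a, group_b)   # difference of -2 over a standard error of 1
```

The further t lies from zero, the less compatible the data are with a true difference of zero between the means.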
43
What type of data do you need to run a t-test?
One independent categorical variable and one continuous, dependent variable
44
What is the non-parametric equivalent of the independent t-test?
Mann-Whitney U Test
45
What does ANOVA do?
Tests whether there is a difference between the means of several groups (typically three or more)
46
What type of data does ANOVA require?
One categorical, independent variable and one dependent continuous variable
47
What is the Null hypothesis for ANOVA?
That there is no difference in the means of the groups
48
What is the F value in ANOVA?
The variability between the group means divided by the variability within the groups, i.e. is the variability between group means larger than the variability of the observations within the groups?
49
What does a large F value signify?
That the variance between the groups is greater than the variance within the groups. A high F value means that your data do not support your null hypothesis well
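The F ratio can be computed by hand for a one-way ANOVA. This is a minimal sketch with three invented groups; the function name is illustrative, and SPSS would report the same quantity in the F column of its ANOVA table.

```python
import statistics

def one_way_f(groups):
    """F = between-group mean square / within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # hypothetical data
f_value = one_way_f(groups)                  # about 13: means differ far more than values within groups
```

Here the between-group sum of squares is 26 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, giving F = 13 / 1.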
50
What test do you use to determine more specific difference between groups?
Tukey post-hoc analysis
51
What is the non-parametric equivalent of ANOVA?
Kruskal-Wallis test
52
What is a residual?
The difference between an observed response and the value predicted for the response by our model
53
How do you calculate a residual value?
The residual for an observation is the observed value minus the value the model predicts for it (in ANOVA the predicted value is the group mean)
54
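What are the residual degrees of freedom?
n - 2 (in simple linear regression, where two parameters, the intercept and the slope, are estimated)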
55
What is another name for a residual?
Prediction error
56
What does a high correlation coefficient signify?
A likely strong linear association (the closer to +1 or -1, the stronger)
57
What is the R value?
Pearson correlation coefficient
58
What is the R-squared value?
A statistical measure of how close the data are to the fitted regression line, e.g. if R² = .97, 97% of the variance in the dependent variable is explained by the independent variable.
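For a regression with a single predictor, R-squared is simply the square of the Pearson correlation coefficient. The sketch below computes both from scratch for a made-up data set; the function name is illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]   # hypothetical predictor
y = [2, 4, 5, 4, 5]   # hypothetical response
r = pearson_r(x, y)
r_squared = r ** 2    # share of the variance in y explained by the line
```

For these numbers r is about 0.77 and R-squared is 0.6, so the fitted line accounts for 60% of the variance in y.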
59
What is the Null hypothesis of regression?
That the underlying slope equals zero
60
What does the P-Value signify in regression?
The probability of getting an association at least as strong as the one observed purely by chance, when there is really no association
61
What are the assumptions for Linear regression?
Independent observations; linear association; normal variability; equal variances
62
What does an ANOVA regression p-value <0.05 tell us?
There is strong evidence against the null hypothesis of 0 slope
63
How do you calculate the sample size for an ANOVA from SPSS?
TOTAL df + 1
64
Can you use ANOVA for linear association?
Yes. The ANOVA ideas extend from comparing means to testing for linear association
65
Which test would you immediately think of if you saw the terms "relationship between" in the question?
Correlation
66
Which test would you immediately think of if you were asked to compare means of two groups or one group and 2 variables?
t-test
67
Which test would you immediately think of if you were asked to compare the means of more than two groups or multiple variables?
ANOVA
68
When would you use a Welch's ANOVA?
When you have normally distributed data that violates the assumption of homogeneity of variance
69
What is the nonparametric equivalent of Pearson's Correlation?
Spearman's rank correlation (Chi Squared is used instead when both variables are categorical)
70
What is the nonparametric equivalent of the dependent t-test?
Wilcoxon Signed Rank Test
71
Which test would you use for categorical outcome?
Chi Squared
72
What test would you use for multiple variable comparison in two or more groups?
MANOVA
73
What are some limitations of Pearson Coefficient?
- Sensitive to the presence of outliers
- Assumes linearity (misleading if the plot is curved)
- A limited range of scores will limit generalisation
- Does not imply cause
74
What are confounding variables?
"Lurking" variables that may be influencing the two variables of interest
75
What type of test would you perform when you have a scale (response) and Nominal (predictor) variable?
ANOVA, Independent Samples t-test, Mann-Whitney U test, Kruskal-Wallis test
76
What kind of test would you perform when you have a Nominal (response) and Nominal (predictor) variable?
Chi Squared
77
What kind of test would you perform when you have a scale (response) and scale (predictor) variable?
Regression (ANOVA F-test or Coefficient t-test), Pearson correlation t-test, Spearman correlation
78
What kind of test would you perform if you had a scale (response) variable and no predictor variable?
One Sample t-test or Paired t-test
79
What is the Null hypothesis for Chi Squared?
That the distribution is the same across the groups, i.e. the two categorical variables are not associated
80
What is the purpose of Chi Square?
To compare the observed counts with the counts we would expect if the Null hypothesis were true.
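The comparison of observed and expected counts can be written out directly. This is a sketch for a contingency table given as a list of rows; the 2x2 counts are invented and the function name is illustrative.

```python
def chi_squared(observed):
    """Chi-squared statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

observed = [[10, 20], [20, 10]]   # hypothetical 2x2 table of counts
chi2 = chi_squared(observed)      # every expected count is 15 here
df = (len(observed) - 1) * (len(observed[0]) - 1)
```

Each cell contributes (10 - 15)² / 15 or (20 - 15)² / 15, so the statistic is 100 / 15, about 6.67, on 1 degree of freedom.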
81
What is inter-rater reliability?
The degree to which ratings given by different observers agree
82
What is intra-rater reliability?
The degree to which ratings given by the same observer on different occasions agree
83
How do you measure intra/inter-rater reliability, taking chance into account?
Cohen's kappa (κ)
84
What does the kappa value tell you?
A measure of agreement corrected for chance: the proportion of agreement achieved beyond what would be expected by chance alone
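Kappa is usually defined as (observed agreement - chance agreement) / (1 - chance agreement). The sketch below applies that formula to two invented sets of yes/no ratings; the function name is an assumption for the example.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same subjects."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: product of each rater's marginal proportions.
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)

rater_a = ["y", "y", "y", "y", "y", "y", "n", "n", "n", "n"]   # hypothetical
rater_b = ["y", "y", "y", "y", "n", "n", "y", "n", "n", "n"]   # hypothetical
kappa = cohens_kappa(rater_a, rater_b)
```

The raters agree on 7 of 10 subjects (0.7), chance alone would give 0.5, so kappa is 0.2 / 0.5 = 0.4.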
85
What is an acceptable k score?
0.4 and above
86
How do you measure if a scale is internally consistent?
Cronbach's alpha (α)
87
What does a value of zero represent for Cronbach's alpha?
Internal consistency reliability is very low and consistency cannot be assumed
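For reference, alpha is commonly computed as k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below uses that formula on invented scores for a three-item scale; the function name and data are assumptions for illustration.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per scale item, respondents in the same order."""
    k = len(items)
    item_variances = sum(statistics.variance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]   # each respondent's total
    return k / (k - 1) * (1 - item_variances / statistics.variance(totals))

items = [
    [2, 4, 3, 5, 1],   # hypothetical item 1, five respondents
    [3, 5, 3, 4, 2],   # hypothetical item 2
    [2, 5, 4, 5, 1],   # hypothetical item 3
]
alpha = cronbach_alpha(items)   # about 0.95 for these scores
```

When the items move together, the variance of the totals is large relative to the item variances and alpha approaches 1; when they are unrelated, alpha falls towards zero.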
88
What is the acceptable score for Cronbach's alpha?
0.8
89
What is sampling error?
Error in a statistical analysis arising from the unrepresentativeness of the sample taken.
90
What are three limitations of convenience sampling?
- Possible bias
- Poor generalisability
- Potential for sampling error
91
Why is it crucial to discuss attrition?
Attrition of the original sample represents a potential threat of bias if those who drop out of the study are systematically different from those who remain in the study.
92
What is the b value in regression?
The b value is the gradient of the regression line. The b value (on the second line of SPSS) tells you: "if the other variable is increased by one point, the result will go up by b"
93
What does the Central Limit Theorem tell us about statistical inference?
Since t tests and ANOVA are based on assuming the sample means have Normal distributions, this means that we can use these methods even if the data seem slightly skewed, particularly if the sample sizes are large.
94
What do you need to remember when describing a relationship from a scatterplot?
Strength of association, direction/shape (pos/neg) and linearity
95
Would you go ahead with further statistical testing if a scatterplot showed moderate relationship, but had a number of points that deviated from the line?
Yes, but it may be necessary to try a non-parametric test. The strength of association may be stronger than anticipated due to the involvement of values that deviate from the line.
96
Can you imply cause from an observational study?
No, it is very difficult, because confounding variables may explain the association
97
What is Sampling variability?
Sampling variability refers to the fact that statistics, such as the sample mean, would give different results if the random sampling process was repeated. We thus need to account for sampling variability when making any conclusions from our data.
98
What does the Central Limit Theorem say about statistical inference?
The Central Limit Theorem says that the distribution of the sample mean is approximately Normal for sufficiently large samples. Since t tests and ANOVA are based on assuming the sample means have Normal distributions, this means that we can use these methods even if the data seem slightly skewed, particularly if the sample sizes are large.
99
Why might we prefer parametric over non parametric test?
If assumptions are satisfied then parametric tests are more powerful than their non-parametric counterparts (although the difference can be minor). Parametric tests also provide direct estimates for effects, including confidence intervals. Non-parametric tests come with their own assumptions too.
100
What is the least squares line?
A way of fitting the data with the line that minimises the sum of the squared residuals.
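A least squares fit for a single predictor can be worked out directly from the data, which also shows where residuals come from. This is a sketch with four invented points; the function name is illustrative, and SPSS reports the same slope as the b value.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the least squares line for y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

x = [1, 2, 3, 4]   # hypothetical predictor values
y = [2, 4, 5, 8]   # hypothetical responses
slope, intercept = least_squares_line(x, y)

# Residual = observed value minus the value predicted by the line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(r ** 2 for r in residuals)   # the quantity the line minimises
residual_df = len(x) - 2               # two parameters were estimated
```

For these points the slope is 1.9 and the intercept is 0; the residuals sum to zero, and any other line through the data would give a larger `sse`.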