Interpreting Data Flashcards

(50 cards)

1
Q

What are the two types of data

A
  • Qualitative
  • Quantitative

2
Q

Describe what qualitative data splits into

A

Qualitative data splits into nominal (unordered) and ordinal (ordered, e.g. short, medium, tall)
- Nominal data then splits into binary (yes-or-no questions) and categorical (e.g. different colours)

3
Q

What is the name of the data that is unordered in qualitative data

A

Nominal

4
Q

What is the name of the data that is ordered in qualitative data

A

Ordinal

5
Q

What is quantitative data split into

A
  • Discrete (e.g. 10 graduates: always a whole number)
  • Continuous (e.g. length in cm: doesn't have to be a whole number)

6
Q

What are two other ways in which you can summarise data

A

Measure of location

Measure of spread

7
Q

What makes up the measure of location

A
  • Median = Middle value when the values are ordered from smallest to largest
  • Mode = the most common value
  • Mean = average = sum of all of the values divided by the number of values
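
The three measures of location can be checked with Python's standard `statistics` module. This is a minimal sketch; the data set is hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
values = [3, 5, 5, 6, 8, 9, 13]

median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most common value
mean = statistics.mean(values)      # sum of the values / number of values

print(median, mode, mean)
```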
8
Q

What makes up the measure of spread

A
  • Standard deviation
  • Interquartile range

9
Q

When is it better to use the median over the mean

A
  • Use the median to avoid the influence of outliers (unusually large or small values, which may be errors in the data)
  • Also use it when the data is skewed
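
A small sketch of why the median resists outliers. The weights are hypothetical, and the last value is assumed to be a data-entry error (700 typed instead of 70).

```python
import statistics

# Hypothetical weights in kg; the last value is an assumed typing error.
weights = [62, 65, 68, 70, 72, 75, 700]

mean_weight = statistics.mean(weights)      # pulled far upwards by the outlier
median_weight = statistics.median(weights)  # still the middle value

print(round(mean_weight, 1), median_weight)
```

The mean lands well above every genuine value, while the median stays in the middle of the plausible weights.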
10
Q

When is it better to use the interquartile range over standard deviation

A
  • Use the interquartile range to avoid the influence of outliers
  • Also use it when the data is skewed
11
Q

How do you work out the interquartile range

A

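The IQR is the range between the 25th percentile (lower quartile) and the 75th percentile (upper quartile)

e.g. 1,  2,  3,  4,  5,  6,  7
Lower quartile = 2 and upper quartile = 6, so the interquartile range (IQR) = 2 to 6, a width of 6 − 2 = 4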
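
The card's example can be reproduced with `statistics.quantiles`. Note that several conventions exist for locating quartiles; this sketch relies on Python's default ("exclusive") method, which is an assumption about the convention the card intends.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7]  # the example from the card

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1  # width of the interquartile range

print(q1, q3, iqr)
```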
12
Q

How do you work out the standard deviation

A
  • Work out the mean
  • Subtract the mean from each result and square the difference
  • Add up all the squared differences
  • Divide by N (the number of participants); for a sample, N − 1 is often used instead
  • Take the square root
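
The calculation can be sketched by hand in Python. The results are hypothetical, chosen so the arithmetic is easy to follow, and the sketch divides by N as on the card (the population form).

```python
import math
import statistics

# Hypothetical results, chosen so the arithmetic is easy to follow.
results = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(results)
mean = sum(results) / n                             # work out the mean
squared_diffs = [(x - mean) ** 2 for x in results]  # subtract the mean, then square
variance = sum(squared_diffs) / n                   # add them up and divide by N
sd = math.sqrt(variance)                            # take the square root

print(sd, statistics.pstdev(results))
```

`statistics.pstdev` uses the same divide-by-N definition, so the two printed values agree; `statistics.stdev` would divide by N − 1 instead.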
13
Q

Why are AFP levels important

A

If you aren’t pregnant, an AFP test can help to diagnose and monitor certain liver conditions, such as liver cancer, cirrhosis, and hepatitis.

14
Q

What is an antenatal thyroid screening test

A
  • A test that screens the mother's thyroid during pregnancy so that thyroid problems can be picked up and treated, helping to prevent defects in the baby
15
Q

What is another name for the Gaussian distribution

A

normal distribution

16
Q

What two things is the normal distribution determined by

A
  • Normal distribution is determined only by the mean and standard deviation
17
Q

What happens if you change the mean of the normal distribution curve

A
  • The curve moves left or right but keeps the same shape and height: if the mean decreases it moves to the left, and if the mean increases it moves to the right
18
Q

What happens if you change the standard deviation of the normal distribution curve

A
  • The height of the curve changes but the area under the curve remains the same
  • As the standard deviation increases, the curve becomes flatter and wider
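
The effect of changing the mean or the standard deviation can be checked numerically with `statistics.NormalDist`. This is only a sketch; the particular means and standard deviations are arbitrary illustrations.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)     # same mean, larger standard deviation
shifted = NormalDist(mu=5, sigma=1)  # same standard deviation, larger mean

peak_narrow = narrow.pdf(0)    # height of the curve at its mean
peak_wide = wide.pdf(0)        # lower peak: the curve is flatter
peak_shifted = shifted.pdf(5)  # same height, just moved to the right

print(peak_narrow, peak_wide, peak_shifted)
```

Doubling the standard deviation halves the peak height, while shifting the mean leaves the height unchanged.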
19
Q

What are the characteristics of Gaussian distribution

A

• A constant proportion of values will lie within any specified number of Standard Deviations above or below the mean

20
Q

How many standard deviations either side of the mean correspond to the

  • 99% range
  • 95% range
  • 90% range
A

99% range (0.5th to 99.5th centile) = mean ± 2.58 SDs
95% range (2.5th to 97.5th centile) = mean ± 1.96 SDs
90% range (5th to 95th centile) = mean ± 1.64 SDs
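
These multipliers come from the standard normal distribution and can be recovered with `NormalDist.inv_cdf`. The helper name `sd_multiplier` is hypothetical, introduced only for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def sd_multiplier(coverage):
    """Number of SDs either side of the mean that contains `coverage` of values."""
    upper_centile = 1 - (1 - coverage) / 2  # e.g. 0.975 for a 95% range
    return standard_normal.inv_cdf(upper_centile)

for coverage in (0.99, 0.95, 0.90):
    print(coverage, round(sd_multiplier(coverage), 2))
```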

21
Q

How do you calculate the 95% range

A

Mean ± 1.96 × standard deviation

22
Q

What is statistics used for

A
  • Statistics is used so that our sample can tell us something about the population
23
Q

What does the population contain

A

Population contains the true mean

24
Q

What happens if the sample size is large enough

A

If the sample size isn’t too small then the distribution of the sample mean will be approximately Gaussian

25
What is the standard deviation of the sample means called
The standard error of the mean
26
What is standard error
The standard error is a measure of the statistical accuracy of an estimate
27
What is the standard error of the mean
- The standard error of the mean is the standard deviation of the distribution of all possible sample means
28
How do you work out the standard error of the mean
Standard error of the mean = standard deviation / √(sample size)
29
How do you work out a confidence interval
95% confidence interval = sample mean ± 1.96 × standard error
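
A minimal sketch of the standard error and 95% confidence interval calculations. The function name and the sample figures (100 people, mean weight 78 kg, SD 15 kg) are hypothetical.

```python
import math

def confidence_interval_95(sample_mean, sd, n):
    """Return (standard error, lower limit, upper limit) for a 95% CI of the mean."""
    se = sd / math.sqrt(n)  # standard error of the mean
    margin = 1.96 * se
    return se, sample_mean - margin, sample_mean + margin

# Hypothetical sample: 100 people, mean weight 78 kg, SD 15 kg.
se, lower, upper = confidence_interval_95(78, 15, 100)
print(se, round(lower, 2), round(upper, 2))
```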
30
Define the confidence interval
a range of values so defined that there is a specified probability that the value of a parameter lies within it.
31
How would you write about the confidence interval in an example
IN THE POPULATION we are 95% sure that the mean weight could be as low as 75 kg or as high as 81 kg
32
When do we use standard deviation
- use standard deviation for ranges (for individual values)
33
When do we use standard errors
- use standard error for confidence intervals (for means)
34
What happens as the sample size increases
- As the sample size increases, the 95% confidence interval gets narrower because the standard error gets smaller
- Accuracy increases, so we can be more confident in our estimate
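
The narrowing can be seen by holding the standard deviation fixed and varying the sample size. The SD of 15 and the sample sizes are hypothetical values for illustration.

```python
import math

sd = 15  # hypothetical standard deviation, held fixed

widths = {}
for n in (25, 100, 400):
    se = sd / math.sqrt(n)     # standard error shrinks as n grows
    widths[n] = 2 * 1.96 * se  # full width of the 95% confidence interval

print(widths)
```

Because the standard error depends on the square root of the sample size, quadrupling the sample halves the width of the interval.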
35
Describe the different types of correlation and their values
- R = 0: no correlation
- R = 1: perfect positive correlation
- R = -1: perfect negative correlation
36
What is the correlation coefficient
- R = 0: no correlation
- R = 1: perfect positive correlation
- R = -1: perfect negative correlation
37
Define the correlation coefficient
a number between +1 and −1 calculated so as to represent the linear interdependence of two variables or sets of data.
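
A sketch of how the coefficient can be calculated, assuming the Pearson (product-moment) form, which the card does not name explicitly. The function name and data are hypothetical.

```python
import math

def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_positive = correlation_coefficient(x, [2, 4, 6, 8, 10])  # rises in a straight line
r_negative = correlation_coefficient(x, [10, 8, 6, 4, 2])  # falls in a straight line
r_none = correlation_coefficient(x, [4, 1, 0, 1, 4])       # U-shaped, no linear trend

print(r_positive, r_negative, r_none)
```

The last case shows why the definition says "linear": the U-shaped data are clearly related, yet the coefficient is 0.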
38
How do you work out linear regression
```
Y = a + bX
Y = outcome (dependent variable)
X = predictor (independent variable)
a = intercept: the point at which the line crosses the Y axis
b = slope: the change in Y for each one-unit increase in X
```
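
A minimal sketch of fitting the line, assuming ordinary least squares (the card does not say which fitting method is used). The function name and data are hypothetical; the data lie exactly on Y = 1 + 2X so the result is easy to check.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and slope b in Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x  # the fitted line passes through (mean_x, mean_y)
    return a, b

# Hypothetical data lying exactly on the line Y = 1 + 2X.
a, b = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
predicted = a + b * 10  # predicted outcome when X = 10

print(a, b, predicted)
```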
39
What is the dependent variable
A variable (often denoted by y) whose value depends on that of another
40
What is the independent variable
A variable (often denoted by x) whose variation does not depend on that of another
41
Why do you want to know if the result is statistically significant
- An observed sample difference between groups might be due to chance
- We want to know whether a result is statistically significant, i.e. unlikely to be due to chance
42
How do you determine if the result is statistically significant
• To determine whether an observed difference was due to chance we look at confidence intervals and p-values
43
How do you work out the confidence intervals between two groups
95% CI = mean difference ± 1.96 × SE of mean difference
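
A sketch of the calculation for two groups. The card does not say how the SE of the mean difference is obtained; this example assumes two independent groups, for which the squared standard errors add. The function name and group figures are hypothetical.

```python
import math

def difference_ci_95(mean1, sd1, n1, mean2, sd2, n2):
    """95% CI for the difference between two independent group means."""
    diff = mean1 - mean2
    # Assumption: independent groups, so the squared standard errors add.
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    margin = 1.96 * se_diff
    return diff - margin, diff + margin

# Hypothetical groups: means 82 and 76, both with SD 12 and 36 participants.
lower, upper = difference_ci_95(82, 12, 36, 76, 12, 36)
excludes_zero = not (lower <= 0 <= upper)

print(round(lower, 2), round(upper, 2), excludes_zero)
```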
44
What is a P value
A p-value is the probability of observing a result as extreme as, or more extreme than, the sample result if the underlying assumption about the population (the null hypothesis) is true
45
When is a confidence interval result significant
- When it doesn’t cross 0, so there is a difference in the population
- If the confidence interval crosses 0 then there might not be a difference
46
When is a P value statistically significant
When the calculated p-value is less than 0.05
47
When can P values be calculated
When there is a comparison:
- 2 means: are they different, i.e. is their difference different from 0?
- Association: are the observed results different from those expected?
- Regression: is the slope different from 0?
48
Where does the P value come from
In the example, the p-value comes from a chi-squared test. P = 0.002, which is less than 0.05, so we can be confident there is an association
49
What is the chi squared test used for
Testing for an association between categorical variables
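
A sketch of a chi-squared test on a 2×2 table, comparing observed counts with those expected if there were no association. The counts and function name are hypothetical, no continuity correction is applied, and the p-value shortcut used here holds only for 1 degree of freedom.

```python
import math

def chi_squared_2x2(table):
    """Chi-squared statistic and p-value (1 degree of freedom) for a 2x2 table."""
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-squared > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows = exposed / not exposed, columns = disease / no disease.
statistic, p_value = chi_squared_2x2([[30, 20], [15, 35]])
print(round(statistic, 2), p_value < 0.05)
```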
50
What is a T test used for
Comparing the mean of a continuous variable between two groups