Statistics Flashcards by Monica Tryczyńska-Palmer

Types of data

Qualitative = categorical

Quantitative = numerical

Types of qualitative data

Nominal - named, no natural order, e.g. ethnicity, blood groups
Ordinal - ordered/ranked, e.g. pain score 1-10

Types of quantitative data

Discrete - whole numbers, e.g. number of antenatal visits
Continuous - any value within a range, e.g. birthweight

Descriptive statistics

Measures of central tendency: mean, median, mode

Measures of dispersion: range, interquartile range, standard deviation

Normal distribution and skewness

Mean

Average: the sum of all values divided by the number of values

Good for symmetric distributions; pulled towards outliers in skewed ones

Median

Middle value when the data are sorted
If two values are in the middle, take the average of those

Mode

Most frequently occurring value in the dataset
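
The mean, median and mode cards can be checked with Python's built-in `statistics` module. This is a minimal sketch; the visit counts are made-up numbers for illustration.

```python
import statistics

# Hypothetical number of antenatal visits for six patients (made-up data).
visits = [2, 3, 3, 4, 5, 7]

mean_visits = statistics.mean(visits)      # sum of values / number of values
median_visits = statistics.median(visits)  # even count: average of the two middle values (3 and 4)
mode_visits = statistics.mode(visits)      # most frequently occurring value

print(mean_visits, median_visits, mode_visits)  # 4 3.5 3
```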

Range

The difference between the highest and lowest values
Formula = maximum - minimum

Interquartile range

Def: the range of the middle 50% of the data, between Q1 and Q3

Formula: IQR = Q3 - Q1
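
A short sketch of the range and IQR formulas, using the standard-library `statistics.quantiles` function. The data are made up, and the quartile convention (`method="inclusive"`) is an assumption; other conventions can give slightly different quartiles.

```python
import statistics

# Hypothetical length of stay in days for nine patients (made-up data).
stay_days = [1, 2, 3, 4, 5, 6, 7, 8, 9]

data_range = max(stay_days) - min(stay_days)  # maximum - minimum

# quantiles(n=4) returns the three cut points Q1, Q2 (median) and Q3.
# method="inclusive" assumes the data include the most extreme values
# of the population.
q1, q2, q3 = statistics.quantiles(stay_days, n=4, method="inclusive")
iqr = q3 - q1  # spread of the middle 50% of the data

print(data_range, q1, q3, iqr)  # 8 3.0 7.0 4.0
```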

SD - what is standard deviation

Measure of how much the data deviate from the mean

Advantage: it uses all the data points
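
As a rough illustration, the standard deviation can be computed with the `statistics` module. The data are made up; which version applies (population or sample) depends on whether the data are the whole population or a sample from it.

```python
import statistics

# Made-up data with a mean of 5.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population SD: square root of the average squared deviation from the mean.
population_sd = statistics.pstdev(values)

# Sample SD divides by n - 1 instead of n, so it is slightly larger.
sample_sd = statistics.stdev(values)

print(population_sd)        # 2.0
print(round(sample_sd, 3))  # 2.138
```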

What is a normal distribution

A symmetric, bell-shaped distribution in which the mean, median and mode are all the same
Follows the 68, 95, 99.7 rule

68% of data within 1 SD of the mean
95% within 2 SDs
99.7% within 3 SDs

Ex. IQ scores: mean 100, SD 15
68% of people have IQ 85-115
95% have IQ 70-130
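
The rule can be checked against the IQ example with `statistics.NormalDist`. A minimal sketch, assuming IQ is exactly normal with mean 100 and SD 15:

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

def share_within(k):
    """Proportion of the distribution within k SDs of the mean."""
    return iq.cdf(iq.mean + k * iq.stdev) - iq.cdf(iq.mean - k * iq.stdev)

for k in (1, 2, 3):
    low, high = iq.mean - k * iq.stdev, iq.mean + k * iq.stdev
    print(f"{low:.0f}-{high:.0f}: {share_within(k):.1%}")
# 85-115: 68.3%
# 70-130: 95.4%
# 55-145: 99.7%
```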

What does it mean if you have a small or a large SD

Small SD = data concentrated around the mean, tall narrow bell
Large SD = wider spread of data away from the mean, wide flat bell

Right skewed distribution (positive)

Ex. income: many earn around 40 grand but a few earn millions, which pulls the mean higher than the median

Mode < median < mean
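
A small sketch of how one extreme value orders the mode, median and mean in a right-skewed sample. The incomes are made up for illustration.

```python
import statistics

# Hypothetical annual incomes in thousands (made-up data):
# most earn about 40, one earns millions.
incomes = [40, 40, 40, 45, 50, 2000]

mode_income = statistics.mode(incomes)      # 40
median_income = statistics.median(incomes)  # 42.5
mean_income = statistics.mean(incomes)      # about 369.2, pulled up by the single outlier

print(mode_income < median_income < mean_income)  # True
```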

Left skewed example

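Negative skew
Ex. age of retirement: most people retire at a similar age, but a few retire much earlier and pull the mean below the median

Mean < median < mode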

Parametric vs non-parametric

In clinical research, if you are uncertain about the distribution of the data,
which type of test do you use?

Use a non-parametric test

What is the non-parametric version of the
unpaired t-test or
independent t-test or
Student's t-test

Mann-Whitney U test

Compares 2 independent samples (null hypothesis: both come from the same population)

Ex. compare delivery time between Kiwi and forceps deliveries
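
A minimal sketch of the U statistic itself, computed by counting pairs. The delivery times are made up, and the helper name `mann_whitney_u` is hypothetical; turning U into a p-value needs tables or a statistics package, which is not shown here.

```python
def mann_whitney_u(sample_a, sample_b):
    """U for sample_a: number of (a, b) pairs where a > b; ties count as 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1
            elif a == b:
                u += 0.5
    return u

# Hypothetical delivery times in minutes (made-up data).
kiwi = [12, 15, 9, 20]
forceps = [8, 11, 7, 10, 14]

u_kiwi = mann_whitney_u(kiwi, forceps)
u_forceps = mann_whitney_u(forceps, kiwi)

# The two U values always add up to n1 * n2; the smaller one is usually reported.
print(u_kiwi, u_forceps, min(u_kiwi, u_forceps))  # 16.0 4.0 4.0
```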

What is the non-parametric version of the

one-sample t-test or
paired t-test

Wilcoxon matched-pairs signed-rank test

Compares 2 sets of observations on a single sample (before and after)
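
A sketch of how the signed-rank sums are built from before/after pairs. The pain scores are made up and the helper name `signed_rank_sums` is hypothetical; the p-value step is left to tables or a statistics package.

```python
def signed_rank_sums(before, after):
    """Return (W+, W-): rank sums of the positive and negative differences."""
    # Pairs with no change are dropped, as in the usual form of the test.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(diffs, key=abs)
    w_plus = w_minus = 0.0
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # tied differences share the average of ranks i+1..j
        for d in ordered[i:j]:
            if d > 0:
                w_plus += avg_rank
            else:
                w_minus += avg_rank
        i = j
    return w_plus, w_minus

# Hypothetical pain scores for six patients before and after treatment (made-up data).
before = [7, 6, 8, 5, 9, 6]
after = [4, 5, 8, 6, 5, 3]

w_plus, w_minus = signed_rank_sums(before, after)
print(w_plus, w_minus)  # 1.5 13.5
```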

What is the non-parametric version of

ANOVA - one-way analysis of variance, using the total sum of squares

Kruskal-Wallis

Compares 3 or more independent groups (e.g. compare decision-to-delivery time in the end stage of labour for ventouse, forceps and Kiwi)

What is the small-sample (exact) alternative to the

Chi-square test

Fisher’s exact test for small numbers (<10)
Chi-square is for sample sizes >= 10

Both are used to determine whether there is a significant association between two categorical variables.
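
For small counts, Fisher's exact test can be worked out by hand from the hypergeometric distribution. The sketch below assumes a 2x2 table and the common two-sided rule (sum every table no more likely than the observed one); the function name and the counts are hypothetical.

```python
from fractions import Fraction
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with the margins fixed.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), comb(total, col1))

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1) if prob(x) <= observed)

# Made-up counts: 8 patients in total, too few for a chi-square test.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(p, round(float(p), 3))  # 17/35 0.486
```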

⸻

✅ What does it do?

The Chi-square test checks whether the observed frequencies in a dataset differ significantly from what we’d expect by chance.

For example: relationship between maternal obesity (obese / non-obese) and PET (present / absent)
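
A sketch of the chi-square calculation for a 2x2 table like the obesity and PET example: compare each observed count with the count expected if the two variables were unrelated. The counts are made up, no continuity correction is applied, and the p-value shortcut only holds for 1 degree of freedom.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Chi-square statistic and p-value for a 2x2 table of observed counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Rows: obese / non-obese. Columns: PET present / absent. Made-up counts.
table = [[20, 30], [10, 90]]
chi2, p_value = chi_square_2x2(table)
print(round(chi2, 2), p_value < 0.05)  # 18.75 True
```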

What is the non-parametric version of the

Pearson correlation coefficient

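Spearman’s rank correlation

Assesses the strength of the monotonic association between 2 variables using their ranks (Pearson assesses the strength of the straight-line association between 2 continuous variables)

Ex. whether HbA1c is related to birth weight in diabetic mothers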
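
Spearman's coefficient is Pearson's correlation applied to ranks. A minimal sketch with made-up HbA1c and birth weight values; the helper names are hypothetical.

```python
def ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in order[i:j]:
            result[k] = (i + 1 + j) / 2
        i = j
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

# Made-up data: birth weight always rises with HbA1c, but not in a straight line.
hba1c = [6.0, 6.5, 7.2, 8.0, 9.1]
birth_weight = [3200, 3300, 3500, 3900, 4800]

rho = spearman(hba1c, birth_weight)
r = pearson(hba1c, birth_weight)
print(round(rho, 3), round(r, 3))  # 1.0 0.968
```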

What is multiple logistic regression

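Models the relationship between an outcome variable and several independent (predictor) variables such as age, smoking and parity. Logistic regression applies when the outcome is binary (e.g. low birth weight: yes / no); a continuous outcome such as birth weight in grams calls for multiple linear regression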

What do inferential statistics include

Hypothesis testing (null and alternative hypotheses)
P-value and significance - typically p<0.05
Confidence intervals (CI) and their interpretation

What is hypothesis testing

A statistical method used to make decisions based on data. It helps determine whether an observed effect is real or just due to random chance.

Ex. a drug company develops a new drug and wants to test whether it works better than the current treatment
Null hypothesis = the new drug is not more effective than the standard drug
Alternative hypothesis = the new drug is more effective than the standard drug

After the study we analyze the data to decide whether to reject the null hypothesis (supporting the new drug) or fail to reject it (insufficient evidence that the new drug is better)

In hypothesis testing there are 2 types of errors. What is a type I error

False positive (FP)
Occurs when we reject a true null hypothesis
We think we found an effect but there is actually none
Probability = alpha (the significance level)
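
The link between alpha and the type I error rate can be seen in a small simulation: when the null hypothesis is true, about 5% of tests at alpha = 0.05 still come out "significant". This is a sketch under simplifying assumptions (a z-test with a known SD of 1, simulated data, a fixed random seed), not a recipe for real trials.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable
ALPHA = 0.05
N_PER_GROUP = 30
N_TRIALS = 2000

def two_sided_p(sample_a, sample_b):
    """z-test p-value, assuming both groups have a known SD of 1."""
    standard_error = (2 / N_PER_GROUP) ** 0.5
    difference = sum(sample_a) / len(sample_a) - sum(sample_b) / len(sample_b)
    z = difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

false_positives = 0
for _ in range(N_TRIALS):
    # Both groups come from the same distribution, so the null hypothesis is true.
    group_a = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    group_b = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    if two_sided_p(group_a, group_b) <= ALPHA:
        false_positives += 1  # rejected a true null: type I error

type_1_rate = false_positives / N_TRIALS
print(type_1_rate)  # close to 0.05
```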

What is a type II error

False negative (FN)
Occurs when we fail to reject a false null hypothesis
We fail to detect an effect that actually exists
Probability = beta

P-value

The p-value (probability value) is a measure used in statistical hypothesis testing to determine the significance of results. It helps answer the question “how likely is it that we would observe data at least as extreme as ours if the null hypothesis were true?”

Threshold (alpha, the significance level)
A common threshold is 0.05 (5%)
If p <= 0.05, results are considered statistically significant
If p > 0.05, there is not enough evidence to reject the null hypothesis

Misconceptions:
- A low p-value does not prove the alternative hypothesis is true. It just suggests the data would be unlikely if the null hypothesis were true
- A high p-value does not prove the null hypothesis is true. It just suggests a lack of strong evidence against it
- The p-value is not the probability that the null hypothesis is true. It measures how compatible the data are with the null hypothesis

Example
You are testing whether a new drug is effective compared to a placebo. Your null hypothesis is that the drug has no effect. After running the test you get p = 0.003, which is < 0.05, so you reject the null: there is strong evidence that the drug has an effect
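
As a rough illustration of where a p-value comes from, the sketch below converts a z statistic into a two-sided p-value using the standard normal distribution. The z values are made up, and a z-test is an assumption; the card does not say which test the drug example used.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Chance of a z statistic at least this far from 0 if the null hypothesis is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

for z in (1.0, 1.96, 2.97):
    p = two_sided_p_value(z)
    verdict = "reject the null" if p <= 0.05 else "fail to reject the null"
    print(f"z = {z}: p = {p:.3f} -> {verdict}")
# z = 1.0: p = 0.317 -> fail to reject the null
# z = 1.96: p = 0.050 -> reject the null
# z = 2.97: p = 0.003 -> reject the null
```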

Confidence interval

A range of values that likely contains the true population parameter (e.g. a mean or a proportion) with a certain level of confidence
95% CI: if we repeated the sampling process many times, about 95% of the intervals calculated would contain the true parameter
A 99% confidence interval would be wider but give more certainty

Interpreting a 95% CI, ex. 100 +/- 5
If we estimate the average height of a population and get (95, 105) as the confidence interval, we are 95% confident that the true population mean falls between 95 and 105
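
A minimal sketch of how an interval like (95, 105) can arise, using the normal approximation mean +/- z * SD / sqrt(n). The SD of 25.5 and sample size of 100 are hypothetical values chosen so that the 95% margin comes out at about 5; they are not from the card.

```python
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Normal-approximation CI for a mean: mean +/- z * SD / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * sample_sd / n ** 0.5
    return sample_mean - margin, sample_mean + margin

# Hypothetical summary statistics (assumed for illustration).
low95, high95 = mean_confidence_interval(100, 25.5, 100)
low99, high99 = mean_confidence_interval(100, 25.5, 100, confidence=0.99)

print(round(low95, 1), round(high95, 1))  # 95.0 105.0
print(round(low99, 1), round(high99, 1))  # 93.4 106.6
```

The 99% interval is wider than the 95% one, which matches the trade-off on the card: more certainty costs precision.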

Study designs and their statistical considerations

RCT
Cohort studies
Case-control studies
Cross-sectional studies
Systematic reviews and meta-analyses

RCT

Gold standard for causation
Can be single- or double-blind

Statistics Flashcards

(31 cards)