Lecture 5 Flashcards
(20 cards)
What is biostatistics?
The application of statistical methods to biological, medical, and public health research for designing studies, analyzing data, and drawing valid conclusions.
What is the difference between a population and a sample?
A population is the entire group under study, while a sample is a smaller group selected from the population to represent it.
What are the main data types in biostatistics?
Categorical:
- Nominal (no order; e.g., blood type)
- Ordinal (ordered; e.g., disease severity)
Numerical:
- Discrete (countable; e.g., hospital visits)
- Continuous (measurable; e.g., height)
What is the difference between descriptive and inferential statistics?
Descriptive: Summarizes data (mean, median, mode, SD, etc.)
Inferential: Makes predictions or inferences about a population using a sample
What are the measures of central tendency?
Mean: Arithmetic average (sum of the values divided by their count; sensitive to outliers)
Median: Middle value (best for skewed data)
Mode: Most frequent value (best for categorical data)
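A minimal Python sketch of the three measures, using only the standard library. The `stays` values are made-up illustration data, chosen to be right-skewed so that the mean and median differ:

```python
import statistics

# Hypothetical, right-skewed sample: lengths of hospital stay in days.
stays = [2, 3, 3, 4, 5, 6, 30]

mean_stay = statistics.mean(stays)      # pulled upward by the one long stay
median_stay = statistics.median(stays)  # middle value of the sorted data
mode_stay = statistics.mode(stays)      # most frequent value

print(mean_stay, median_stay, mode_stay)
```

Here the single 30-day stay pulls the mean (about 7.6) well above the median (4), which illustrates why the median is preferred for skewed data.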
What are the measures of dispersion?
Range: Max - Min
Variance: Average of squared deviations from the mean (a sample variance divides by n - 1 rather than n)
Standard Deviation (SD): Square root of variance
Interquartile Range (IQR): Q3 - Q1, middle 50% of the data
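The four measures above could be computed as follows. This is a sketch with made-up `values`; note that quartile conventions differ between tools, and `statistics.quantiles` uses its default "exclusive" method here, so other software may report slightly different Q1 and Q3:

```python
import statistics

# Hypothetical sample used purely for illustration.
values = [4, 8, 15, 16, 23, 42]

data_range = max(values) - min(values)          # Max - Min
pop_variance = statistics.pvariance(values)     # divides by n
sample_variance = statistics.variance(values)   # divides by n - 1
sample_sd = statistics.stdev(values)            # square root of sample variance

# Quartiles: n=4 returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1                                   # spread of the middle 50%

print(data_range, pop_variance, sample_variance, sample_sd, iqr)
```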
What are common data visualization tools?
Histogram: Shows distribution of numerical data
Box Plot: Shows median, quartiles, and outliers
Bar Chart: Compares categorical data
Scatter Plot: Shows relationship between two numeric variables
What is the purpose of inferential statistics?
To draw conclusions about a population from sample data using probability-based methods.
What is a hypothesis in statistics?
Null (H₀): No effect/difference
Alternative (H₁): There is an effect/difference
What is a p-value?
The probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true. It is not the probability that the null hypothesis is true.
When is a result considered statistically significant?
When the p-value is less than the significance level (usually α = 0.05).
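As an illustration of the decision rule, the sketch below converts a test statistic into a two-sided p-value and compares it with α. It assumes the statistic follows a standard normal distribution under H₀ (a z-test), which is a simplification; the function names and the value 2.5 are hypothetical:

```python
from statistics import NormalDist

ALPHA = 0.05  # conventional significance level


def two_sided_p_value(z):
    """Two-sided p-value for a z statistic under a standard normal null."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def is_significant(p_value, alpha=ALPHA):
    """Apply the rule: significant when the p-value is below alpha."""
    return p_value < alpha


p = two_sided_p_value(2.5)  # hypothetical test statistic
print(round(p, 4), is_significant(p))
```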
What is a confidence interval (CI)?
A range of values around a sample statistic that is likely to contain the true population parameter. A 95% CI means that if the study were repeated many times, about 95% of the intervals built this way would contain the true value.
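A rough sketch of a 95% CI for a mean, using the normal approximation (mean ± z × standard error). The `sample` values are invented; for a sample this small a t critical value would normally be used instead of z, so treat the interval as illustrative only:

```python
import math
import statistics

# Hypothetical sample, e.g. systolic blood pressure readings in mmHg.
sample = [120, 125, 130, 118, 122, 128, 135, 121, 127, 124]

n = len(sample)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Critical value leaving 2.5% in each tail of the standard normal (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.1f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```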
What is a t-test and when is it used?
Compares means of two groups:
Independent t-test: Between two separate groups
Paired t-test: Two related measurements (e.g., before/after in the same group)
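The two variants differ in how the test statistic is built. The sketch below computes only the t statistic, since the standard library has no t distribution for the p-value. The independent version uses Welch's form, which does not assume equal variances (a choice made here, not stated on the card), and the function names and data are hypothetical:

```python
import math
import statistics


def welch_t(group_a, group_b):
    """t statistic for two separate groups (Welch's unequal-variance form)."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    se = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se


def paired_t(before, after):
    """t statistic for paired data: a one-sample test on the differences."""
    diffs = [a - b for a, b in zip(after, before)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


# Made-up measurements for illustration.
print(welch_t([2, 4, 6], [1, 2, 3]))
print(paired_t(before=[10, 12, 14], after=[11, 14, 17]))
```

In practice a statistics package such as SciPy (`scipy.stats.ttest_ind`, `scipy.stats.ttest_rel`) would return the p-value alongside the statistic.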
What is a chi-square test used for?
Analyzing relationships in categorical data:
- Goodness-of-fit: Observed vs expected frequencies
- Independence: Association between two categorical variables
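Both variants sum (observed - expected)² / expected over the cells; they differ only in where the expected counts come from. A minimal sketch with hypothetical function names and made-up counts, computing the statistic and degrees of freedom but not the p-value:

```python
def chi_square_goodness_of_fit(observed, expected):
    """Chi-square statistic comparing observed with expected frequencies."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Made-up 2x2 table: rows = exposed / not exposed, columns = disease / no disease.
print(chi_square_independence([[20, 10], [10, 20]]))
```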
What is ANOVA used for?
Comparing means of three or more groups.
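One-way ANOVA compares the variation between group means with the variation within groups, summarized as an F statistic. The sketch below is a hand-rolled illustration with a hypothetical function name and invented data; it returns F only, not the p-value:

```python
import statistics


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA across three or more groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up outcome values for three treatment groups.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```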
What is correlation?
A measure of the strength and direction of the linear relationship between two numeric variables. Range: -1 (perfect negative) to +1 (perfect positive); 0 indicates no linear relationship.
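The Pearson correlation coefficient is the usual such measure. A minimal sketch, written out by hand so the formula is visible (the function name and data are illustrative):

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of paired deviations from the means.
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_deviation / (spread_x * spread_y)


# Made-up data: a perfect positive and a perfect negative linear relationship.
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [6, 4, 2]))
```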
What is regression analysis?
Models the relationship between a dependent variable and one or more independent variables (used for prediction).
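For the simplest case, one independent variable, the model is a straight line fitted by ordinary least squares. The sketch below assumes that simple linear case; `fit_line`, `predict`, and the data are hypothetical names and values:

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x_new, slope, intercept):
    """Predicted value of the dependent variable at x_new."""
    return intercept + slope * x_new


# Made-up data lying exactly on the line y = 1 + 2x.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept, predict(4, slope, intercept))
```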
What questions should you ask when appraising a study?
Is the study design appropriate?
Was the sampling method representative?
Are the statistical methods appropriate?
Are results both statistically and clinically significant?
What is the difference between statistical and practical significance?
Statistical significance: The observed effect is unlikely to be due to chance alone (p-value below α)
Practical significance: Real-world impact or meaningfulness of the effect
How do you effectively communicate statistical findings?
Use plain language, define key terms (e.g., p-value, CI), avoid jargon, and include both numerical results and their interpretation.