Biostatistics Flashcards
What is statistics?
Statistics involves the collection and analysis of all types of data
What is biostatistical analysis or biostatistics?
When statistics are used to understand the effects of a drug or medical procedure on people and animals, the statistical analysis is called biostatistical analysis or biostatistics
What is the path to publication for the classic type of research study?
Begin with a research question, design the study, enroll the subjects, collect the data, analyze the data and publish
What happens after a study manuscript is written?
A study manuscript can be submitted for publication in a professional, peer-reviewed journal. The editor of the journal selects potential publications and sends them to experts in the topic area for peer review
What is the intention of peer review?
Peer review is intended to assess the research design and methods, the value of the results and conclusions to the field of study, how well the manuscript is written and whether it is appropriate for the readership of the journal
What is the potential impact of a peer reviewed study manuscript?
The reviewers make a recommendation to the editor to either accept the article (usually with revisions) or reject it. Data that contradicts a previous recommendation, or presents new information, can change treatment guidelines
Describe the organization of a published clinical trial
A published clinical trial begins with an abstract that provides a brief summary of the article. The introduction to the study comes next, which includes background information, such as disease history and prevalence, and the research hypothesis. This is followed by the study methods, which describe the variables and outcomes, and the statistical methods used to analyze the data. The results section includes figures, tables and graphs. A reader needs to interpret basic statistics and common graphs in order to understand the study results. The researchers conclude the article with an interpretation of the results and the implications for current practice
What is continuous data?
Continuous data has a logical order with values that continuously increase (or decrease) by the same amount. The data comes from some type of measurement that can, in theory, take an unlimited number of values
What are the two types of continuous data and what is the difference between them?
The two types of continuous data are interval data and ratio data. The difference between them is that interval data has no meaningful zero (zero does not equal none) and ratio data has a meaningful zero (zero equals none)
What is discrete (categorical) data and what are the two types?
Discrete data fits into a limited number of categories and is sometimes called categorical data. The two types of discrete data are nominal and ordinal
What is nominal data?
With nominal data, subjects are sorted into arbitrary categories and order of categories does not matter
What is ordinal data?
Ordinal data is ranked and has a logical order. Ordinal scale categories do not increase by the same amount
What are descriptive statistics and what are the typical descriptive values?
Descriptive statistics provide simple summaries of the data. The typical descriptive values are called the measures of central tendency, and include the mean, the median and the mode
What is the mean?
The mean is the average value and is calculated by adding up the values and dividing the sum by the number of values. The mean is preferred for continuous data that is normally distributed
What is the median?
The median is the value in the middle when the values are arranged from lowest to highest. When there are two center values (as with an even number of values), take the average of the two center values. The median is preferred for ordinal data or continuous data that is skewed (not normally distributed)
What is the mode?
The mode is the value that occurs most frequently. The mode is preferred for nominal data
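The three measures of central tendency can be checked with a minimal Python sketch. The sample values below are hypothetical and chosen only for illustration; the functions come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical sample values, chosen only for illustration
values = [2, 3, 3, 5, 7, 10]

# Mean: add up the values and divide by the number of values (30 / 6)
avg = mean(values)

# Median: the middle value after sorting; with an even number of values,
# it is the average of the two center values (3 and 5)
mid = median(values)

# Mode: the value that occurs most frequently (3 appears twice)
most = mode(values)

print("mean:", avg, "median:", mid, "mode:", most)
```

In this made-up sample the three measures differ (mean 5, median 4, mode 3), which is typical when the values are not symmetrically distributed.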
What are the two common methods of describing the variability?
Range and standard deviation
What is the range?
The range is the difference between the highest and lowest values
What is the standard deviation?
Standard deviation (SD) indicates how spread out the data is, and to what degree the data is dispersed away from the mean. Data with a large number of values close to the mean has a smaller SD. Data that is highly dispersed has a larger SD
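As a rough illustration, the sketch below compares two hypothetical data sets that share the same mean but differ in spread. It assumes the sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes; the flashcards do not say which version is intended.

```python
from statistics import mean, stdev

# Two hypothetical data sets with the same mean (10) but different spread
tight = [9, 10, 10, 10, 11]
spread = [2, 6, 10, 14, 18]

def value_range(values):
    """Range: the difference between the highest and lowest values."""
    return max(values) - min(values)

range_tight = value_range(tight)
range_spread = value_range(spread)

# Sample standard deviation (an assumption; pstdev would give the
# population version instead)
sd_tight = stdev(tight)
sd_spread = stdev(spread)

print("means:", mean(tight), mean(spread))
print("ranges:", range_tight, range_spread)
print("SDs:", round(sd_tight, 2), round(sd_spread, 2))
```

The data set whose values sit close to the mean has both the smaller range and the smaller SD.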
What do large sample sets of continuous data tend to form?
A Gaussian or “normal” (bell-shaped) distribution
Describe the curve of the Gaussian distribution when the distribution of data is normal.
When the distribution of data is normal, the curve is symmetrical (even on both sides), with most of the values closer to the middle. Half of the values are on the left side of the curve, and half of the values are on the right side. A small number of values are in the tails.
Describe the characteristics of data that is normally distributed.
- The mean, median and mode are the same value and are at the center point of the curve
- Approximately 68% of the values fall within 1 SD of the mean and approximately 95% of the values fall within 2 SDs of the mean
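The 68% and 95% figures can be reproduced from the normal curve itself. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of values within k SDs of the mean (hypothetical helper)."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1_sd = share_within(1)
within_2_sd = share_within(2)

print(f"{within_1_sd:.1%} within 1 SD, {within_2_sd:.1%} within 2 SDs")
```

The computed shares are about 68.3% and 95.4%, which the flashcard rounds to 68% and 95%.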
Describe how the curve of normally distributed data changes based on the spread (or range) of the data
The curve gets taller and skinnier as the range of data narrows. The curve gets shorter and wider as the range of data widens (or is more spread out).
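One way to see the change in shape is to compare the height of the curve at its center for two different spreads. The sketch below uses the SD as the measure of spread (an assumption, since the card speaks of the range of the data) and two hypothetical distributions with the same mean.

```python
from statistics import NormalDist

# Two hypothetical normal distributions with the same mean but different spread
narrow = NormalDist(mu=100, sigma=1)
wide = NormalDist(mu=100, sigma=2)

# Height of each curve at its center point (the mean)
peak_narrow = narrow.pdf(100)
peak_wide = wide.pdf(100)

print("peak height, narrow spread:", round(peak_narrow, 3))
print("peak height, wide spread:", round(peak_wide, 3))
```

Doubling the SD halves the peak height, so the wider curve is also the shorter one.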
What happens when the data is skewed?
Data that is skewed does not have the characteristics of a normal distribution: the curve is not symmetrical, the rule that 68% of the values fall within 1 SD of the mean does not hold, and the mean, median and mode are not the same value
*This usually occurs when the number of values (sample size) is small and/or there are outliers in the data
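The effect of an outlier in a small sample can be sketched as follows. Both data sets are hypothetical; the second differs from the first only in its largest value.

```python
from statistics import mean, median

# Hypothetical small samples; the second contains one outlier (48)
symmetric = [4, 5, 6, 7, 8]
with_outlier = [4, 5, 6, 7, 48]

mean_symmetric = mean(symmetric)
median_symmetric = median(symmetric)

mean_outlier = mean(with_outlier)
median_outlier = median(with_outlier)

print("symmetric:    mean", mean_symmetric, "median", median_symmetric)
print("with outlier: mean", mean_outlier, "median", median_outlier)
```

The outlier pulls the mean from 6 up to 14 while the median stays at 6, which is why the median is preferred for skewed data.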