Biostatistics Flashcards
Adjusted Rate
A summarizing procedure for a statistical measure in which the effects of differences in the composition of the populations being compared have been minimized by statistical methods, e.g. adjustment by regression analysis or by standardization. Performed on rates or relative risks, commonly because of differing age distributions in the populations that are being compared. The mathematical procedure used to adjust rates for age differences is direct or indirect standardization.
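A minimal sketch of direct standardization, assuming three age bands and entirely made-up counts. The function names and the standard population are illustrative, not from the card. Each population's age-specific rates are weighted by the age shares of one shared standard population.

```python
# Direct age standardization: apply each population's age-specific rates
# to one shared standard population. All counts below are hypothetical.

def crude_rate(deaths, people):
    """Overall rate: total events divided by total population."""
    return sum(deaths) / sum(people)

def directly_standardized_rate(deaths, people, standard):
    """Weight each age-specific rate by the standard population's age share."""
    total_standard = sum(standard)
    return sum(
        (d / n) * (s / total_standard)
        for d, n, s in zip(deaths, people, standard)
    )

# Age bands: young, middle, old.
standard = [500, 300, 200]

pop_a_people = [800, 150, 50]    # mostly young
pop_a_deaths = [8, 6, 5]

pop_b_people = [200, 300, 500]   # mostly old
pop_b_deaths = [2, 12, 50]

for name, deaths, people in [
    ("A", pop_a_deaths, pop_a_people),
    ("B", pop_b_deaths, pop_b_people),
]:
    crude = crude_rate(deaths, people)
    adjusted = directly_standardized_rate(deaths, people, standard)
    print(f"Population {name}: crude={crude:.4f}, adjusted={adjusted:.4f}")
```

Both populations have identical age-specific rates here, so their crude rates differ only because of age composition, and the adjusted rates coincide.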
alpha
The probability of a type I error, the error of rejecting a true null hypothesis, i.e. declaring that a difference exists when it does not. Wrongly rejecting the null.
Type I error
Rejecting the null when it is true. Its probability is alpha.
alternative hypothesis
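A supposition, arrived at from observation or reflection, that leads to refutable predictions. Any conjecture cast in a form that will allow it to be tested and refuted. In significance testing, the alternative hypothesis is the one set against the null hypothesis, usually stating that a difference or effect exists.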
analysis of covariance (ANCOVA)
An extension of ANOVA that allows for possible effects of continuous concomitant variables (covariates) on the response variable, in addition to the effects of the factor or treatment variables. It is usually assumed that covariates are unaffected by treatments and that their relationship to the response is linear. If such a relationship exists, then inclusion of covariates in this way decreases the error mean square and hence increases the sensitivity of the F-tests used to assess treatment differences. The term now also appears to be used more generally for almost any analysis seeking to assess the relationship between a response variable and a number of explanatory variables.
analysis of variance (ANOVA)
The separation of variance attributable to one cause from the variance attributable to others. By partitioning the total variance of a set of observations into parts due to particular factors, for example, sex, treatment group, etc., and comparing variances (mean squares) by way of F-tests, differences between means can be assessed. The simplest analysis of this type involves a one-way design, in which N subjects are allocated, usually at random, to the k different levels of a single factor. The total variation in the observations is then divided into a part due to differences between level means (the between groups sum of squares) and a part due to the differences between subjects in the same group (the within groups sum of squares, or the residual sum of squares). *SEE ANOVA TABLE. If the means of the populations represented by the factor levels are the same, then within the limits of random variation, the between groups mean square and within groups mean square should be the same. Whether this is so is assessed by an F-test on the ratio of the two mean squares.
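The one-way partition described in this card can be sketched as follows. The function name and the three small groups are assumptions for illustration. The code computes the between groups and within groups mean squares and their ratio F, but does not look up a p-value.

```python
def one_way_anova(groups):
    """Return (between MS, within MS, F) for a list of groups of observations."""
    k = len(groups)                               # number of factor levels
    n_total = sum(len(g) for g in groups)         # N subjects in total
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between groups sum of squares: level means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Within groups (residual) sum of squares: subjects around their level mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical data: three levels of one factor, three subjects each.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
ms_between, ms_within, f_ratio = one_way_anova(groups)
print(ms_between, ms_within, f_ratio)
```

A ratio near 1 is what equal population means would produce. A large ratio, as in this made-up data, points to differences between level means.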
Bayes’ theorem
A procedure for revising and updating the probability of some event in the light of new evidence. Originates in an essay by the Reverend Thomas Bayes. *SEE EQUATION
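The equation this card refers to is not reproduced in the text. The standard form of the theorem is given below, with A as the event of interest and B as the new evidence (both symbols are chosen here for illustration), followed by a worked example using assumed numbers.

```latex
\[
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)},
\qquad
P(B) = P(B \mid A)\,P(A) + P(B \mid \bar{A})\,P(\bar{A})
\]
% Illustrative numbers (assumed): P(A) = 0.01, P(B | A) = 0.90, P(B | not A) = 0.05
\[
P(A \mid B) = \frac{0.90 \times 0.01}{0.90 \times 0.01 + 0.05 \times 0.99}
            = \frac{0.009}{0.0585} \approx 0.154
\]
```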
Beta
The probability of a type II error, the error of failing to reject a false null hypothesis, i.e. declaring that a difference does not exist when in fact it does. Failing to reject the null when you should reject it.
Type II error
Failing to reject the null when you should. Its probability is beta.
Bias
In general terms, deviations of results or inferences from the truth, or processes leading to such deviation. More specifically, the extent to which the statistical method used in a study does not estimate the quantity thought to be estimated, or does not test the hypothesis to be tested. In estimation, usually measured by the difference between the expected value of an estimator and the true value of the parameter. *SEE EQUATION
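The equation this card refers to is not reproduced in the text. In the usual notation (assumed here), with theta as the true parameter and theta-hat as its estimator, the bias in estimation is written as:

```latex
\[
\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta
\]
% An estimator with E(\hat{\theta}) = \theta has zero bias and is called unbiased.
```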
Binary variable (binary observation)
Observations which occur in one of two possible states, these often being labeled 0 and 1. Such data are frequently encountered in medical investigations; commonly occurring examples include ‘dead/alive’, ‘improved/not improved’ and ‘depressed/not depressed’. Data involving this type of variable often require specialized techniques for their analysis, such as logistic regression.
binomial distribution
The distribution of the number of ‘successes’, X, in a series of n independent Bernoulli trials where the probability of success at each trial is p and the probability of failure is q=1-p. Specifically the distribution is given by *SEE EQUATION
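The equation this card refers to is not reproduced in the text. The standard binomial probability is P(X = x) = C(n, x) p^x q^(n-x) for x = 0, 1, ..., n. A small sketch that evaluates it follows; the function name and the choice n = 10, p = 0.5 are illustrative assumptions.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent Bernoulli trials with success probability p."""
    q = 1 - p                       # probability of failure
    return comb(n, x) * p**x * q**(n - x)

# Hypothetical example: 10 trials, success probability 0.5.
n, p = 10, 0.5
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(pmf[5])      # probability of exactly 5 successes, 252/1024
print(sum(pmf))    # the probabilities over x = 0..n sum to 1
```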
Biostatistics
A branch of science which applies statistical methods to biological problems. Encompasses the design of biological experiments, especially in medicine and health sciences.
bivariate
Involving two variables. In a “bivariate binomial distribution”, each of the two variables has outcomes that belong to two categories, e.g. yes/no, acceptable/defective.
Blinded study
A procedure used in clinical trials to avoid the possible bias that might be introduced if the patient and/or doctor knew which treatment the patient is receiving. Clinical trials should use the maximum degree of blindness that is possible, although in some areas, e.g. surgery, it is impossible.
Double-blind
Neither the patient nor the doctor is aware of which treatment has been given.
Single-blind
Only one of the patient or doctor is unaware of which treatment has been given.
Bonferroni correction
A procedure for guarding against an increase in the probability of a type I error when performing multiple significance tests. To maintain the probability of a type I error at some selected value α, each of the m tests to be performed is judged against a significance level of α/m. For a small number of simultaneous tests (up to 5) this method provides a simple and acceptable answer to the problem of multiple testing. It is, however, highly conservative and not recommended if large numbers of tests are to be applied, when one of the many other multiple comparison procedures available is generally preferable.
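A minimal sketch of the correction, assuming four hypothetical p-values and an overall level of 0.05. The function name is illustrative. Each test is judged against alpha/m, not alpha.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Flag each test as significant only if its p-value is at most alpha / m."""
    m = len(p_values)               # number of simultaneous tests
    threshold = alpha / m           # per-test significance level
    return [p <= threshold for p in p_values]

# Hypothetical p-values from m = 4 tests; per-test level is 0.05 / 4 = 0.0125.
p_values = [0.001, 0.012, 0.03, 0.2]
print(bonferroni_reject(p_values))  # [True, True, False, False]
```

Note that 0.03 would count as significant against an unadjusted 0.05 but not against the corrected level, which is the conservatism the card mentions.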
case-control study
The observational epidemiologic study of persons with the disease or other outcome variable of interest and a suitable control (comparison, reference) group of persons without the disease.
Categorical data
Represent types of data which may be divided into groups. Examples of categorical variables are race, sex, age group, and educational level. While the latter two variables may also be considered in a numerical manner by using exact values for age and highest grade completed, it is often more informative to categorize such variables into a relatively small number of groups.
censored observation
An observation (Xi) on some variable of interest is said to be censored if it is known only that Xi ≤ Li (left-censored) or Xi ≥ Ui (right-censored), where Li and Ui are fixed values. Such observations arise most frequently in studies where the main response variable is time until a particular event occurs (for example, time to death), when at the completion of the study the event of interest has not happened to a number of subjects.
Central Limit Theorem
If a random variable Y has population mean μ and population variance σ², then the sample mean, y bar, based on n observations, has an approximate normal distribution with mean μ and variance σ²/n, for sufficiently large n. The theorem occupies an important place in statistical theory. In short, the Central Limit Theorem states that if the sample size is large enough, the distribution of sample means can be approximated by a normal distribution, even if the original population is not normally distributed.
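A small simulation sketch of the theorem, assuming a skewed exponential population with mean 1 and variance 1. The population, seed, sample sizes, and function name are all illustrative choices. For each n it compares the observed mean and variance of many sample means against the values the theorem predicts (1 and 1/n).

```python
import random
import statistics

def sample_means(n, repeats, rng):
    """Means of `repeats` samples of size n drawn from an exponential(1) population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(repeats)
    ]

rng = random.Random(0)   # fixed seed so the run is repeatable
for n in (2, 30, 100):
    means = sample_means(n, 2000, rng)
    observed_mean = statistics.fmean(means)
    observed_var = statistics.variance(means)
    print(
        f"n={n}: mean of sample means={observed_mean:.3f}, "
        f"variance={observed_var:.4f}, predicted variance={1.0 / n:.4f}"
    )
```

Plotting a histogram of `means` for each n would also show the shape moving from skewed toward bell-shaped as n grows.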
Chi-Square Distribution
The distribution of a sample statistic based on a normally distributed population with variance σ², from which independent samples of size n are randomly selected and the sample variance s² is computed for each sample. The sample statistic is χ² = (n-1)s²/σ², with n-1 degrees of freedom. The chi-square distribution is skewed, its values can be zero or positive but not negative, and it is different for each number of degrees of freedom. Generally, as the number of degrees of freedom increases, the chi-square distribution approaches a normal distribution.
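The sample statistic in this card, (n-1)s²/σ², can be computed as in the sketch below. The function name, the sample, and the assumed population variance of 4 are illustrative. The code computes the statistic only, not a p-value.

```python
import statistics

def chi_square_variance_statistic(sample, sigma_squared):
    """Compute (n - 1) * s^2 / sigma^2 for one sample and an assumed variance."""
    n = len(sample)
    s_squared = statistics.variance(sample)   # sample variance, n - 1 denominator
    return (n - 1) * s_squared / sigma_squared

# Hypothetical sample of n = 8 observations; assumed population variance 4.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
statistic = chi_square_variance_statistic(sample, sigma_squared=4.0)
degrees_of_freedom = len(sample) - 1
print(statistic, degrees_of_freedom)
```

For this sample the squared deviations from the mean of 5 sum to 32, so the statistic is 32/4 = 8 on 7 degrees of freedom.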
Chi-Square Statistic
A statistic having, at least approximately, a chi-square distribution.