lecture one Flashcards Preview

Flashcards in lecture one Deck (26)
Definition of biostatistics

The science of collecting, organizing, analyzing, interpreting, and presenting data in order to make more effective decisions in a clinical context.


4 reasons biostatistics is important


  1. Identify and develop treatments for disease and estimate their effects
  2. Design, monitor, analyze, interpret, and report results of clinical studies
  3. Identify risk factors for diseases
  4. Develop statistical methodologies to address questions arising from medical/public health data


When do you need biostatistics?

BEFORE you start your study!


Difference between a population and a sample

  • Population includes all objects of interest
    • associated with PARAMETERS (μ, σ)
  • Sample is only a portion of the population
    • associated with STATISTICS (x̄, s)


We compute statistics from the sample and use them to estimate the population parameters.
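The parameter/statistic distinction can be sketched in Python. The population below is simulated (hypothetical blood pressure values), so its parameters are known here, which is rarely the case in a real study.

```python
import random
import statistics

# Hypothetical population: systolic blood pressure (mmHg) for 10,000 people.
rng = random.Random(0)
population = [rng.gauss(120, 15) for _ in range(10_000)]

# Parameters describe the whole population (usually unknown in practice).
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)   # population SD divides by N

# Statistics are computed from a sample and serve as estimates.
sample = rng.sample(population, 100)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)            # sample SD divides by n - 1

print(f"parameters: mu = {mu:.1f}, sigma = {sigma:.1f}")
print(f"statistics: x_bar = {x_bar:.1f}, s = {s:.1f}")
```

The sample values land close to, but not exactly on, the population values; that gap is sampling variability.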


reasons why we don't work with populations

  1. Populations are usually large, and it is often impossible to get
    data for every object of study

  2. Sampling is costly, and the
    more items surveyed, the larger the cost


Descriptive vs Inferential statistics

statistics are computed in order to estimate the parameters of a population 

Descriptive Statistics

  • first (computational) part of statistical analysis 
  • procedure used to organize and summarize masses
    of data

Inferential Statistics

  • second (estimation) part of statistical analysis
  • used to find out information about a population, based on a sample


define biased sample 

Biased sample is one in which the method used to
create the sample results in samples that are
systematically different from the population




define random sampling 

Each element/item in the population has an equal chance of being selected.


  • preferred way but difficult to execute
  • requires a complete list of every element in the population, so it is usually associated with a computer-generated list
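A minimal sketch of simple random sampling from a complete list. The patient IDs and the sample size of 20 are hypothetical.

```python
import random

# Hypothetical sampling frame: a complete list of patient IDs.
sampling_frame = [f"P{i:03d}" for i in range(1, 201)]

rng = random.Random(42)   # fixed seed so the draw is reproducible

# Draw 20 IDs without replacement; every ID has the same chance of selection.
simple_random_sample = rng.sample(sampling_frame, k=20)
print(simple_random_sample)
```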


define systematic sampling 

Elements are counted off and every x-th element is taken.
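Systematic sampling can be sketched with list slicing. The random starting point is an added assumption (a common practice, not stated on the card), and the IDs are hypothetical.

```python
import random

sampling_frame = [f"P{i:03d}" for i in range(1, 201)]  # hypothetical IDs
step = 10                                              # take every 10th element

rng = random.Random(1)
start = rng.randrange(step)   # assumed: random start within the first interval

# Count off through the list, keeping every step-th element from the start.
systematic_sample = sampling_frame[start::step]
print(systematic_sample)
```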



convenience sampling:

Readily available data are used (e.g. the first people the surveyor runs into).


cluster sampling: 

  • divides the population into groups/clusters, usually geographically
  • clusters are randomly selected
  • every element in the chosen clusters is used
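The three steps above can be sketched as follows. The clinic names and patient labels are hypothetical.

```python
import random

# Hypothetical clusters: patients grouped by clinic (a geographic grouping).
clusters = {
    "clinic_north": ["N1", "N2", "N3"],
    "clinic_south": ["S1", "S2", "S3", "S4"],
    "clinic_east": ["E1", "E2"],
    "clinic_west": ["W1", "W2", "W3"],
}

rng = random.Random(7)
chosen = rng.sample(sorted(clusters), k=2)   # randomly select whole clusters

# Every element in the chosen clusters enters the sample.
cluster_sample = [patient for name in chosen for patient in clusters[name]]
print(chosen, cluster_sample)
```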


Stratified sampling

  • divides the population into groups called strata
  • grouping is by some characteristic (e.g. M/F), not geographically
  • a sample is taken from each stratum using
    • random, systematic, or convenience sampling
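A sketch of stratified sampling that uses random sampling within each stratum. The population, the 10% sampling fraction, and the choice to use the same fraction in every stratum are all illustrative assumptions.

```python
import random

# Hypothetical population; sex is the stratifying characteristic.
population = [("M", f"M{i}") for i in range(60)] + [("F", f"F{i}") for i in range(40)]

# Group the population into strata.
strata = {}
for sex, person_id in population:
    strata.setdefault(sex, []).append(person_id)

rng = random.Random(3)
fraction = 0.1   # assumed sampling fraction, applied to every stratum

# Draw a random sample from each stratum separately.
stratified_sample = {
    sex: rng.sample(members, k=round(len(members) * fraction))
    for sex, members in strata.items()
}
print(stratified_sample)
```

Unlike cluster sampling, every group contributes some elements to the sample.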


3 things that determine a good sample

  1. Random selection

  2. Representativeness by structure

  3. Representativeness by number of cases


types of error

Random error = sampling variability.

Bias (systematic error) = the difference between the observed value and the true value due to all causes other than sampling variability.


absence of error of all kinds = accuracy
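A small simulation can illustrate the difference between the two error types. The scale, its 2 kg miscalibration, and the noise level are hypothetical.

```python
import random
import statistics

rng = random.Random(0)
true_value = 70.0   # hypothetical true weight in kg
offset = 2.0        # hypothetical miscalibration (systematic error)

def average_reading(n, bias):
    """Average of n noisy readings; the noise term is the random error."""
    readings = [true_value + bias + rng.gauss(0, 1.5) for _ in range(n)]
    return statistics.fmean(readings)

unbiased_large = average_reading(10_000, bias=0.0)
biased_large = average_reading(10_000, bias=offset)

print(f"no bias:   {unbiased_large:.2f}")
print(f"with bias: {biased_large:.2f}")
```

Averaging many readings shrinks the random error, but the biased average stays about 2 kg away from the true value no matter how many readings are taken.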


sample size calculation principles

  • law of large numbers = as the sample size increases, the margin of error decreases and the percentage difference between the population and the sample goes to zero
  • the number of experimental units must be justified
  • purpose of size calculation = large enough for accurate information but small enough for practicality
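The law of large numbers can be seen in a short simulation. The population (normal, mean 50, SD 10) is an assumption made only for illustration.

```python
import random
import statistics

rng = random.Random(0)
true_mean = 50.0   # hypothetical population mean

def sample_mean(n):
    """Mean of n draws from a hypothetical normal population (mean 50, SD 10)."""
    return statistics.fmean(rng.gauss(true_mean, 10) for _ in range(n))

for n in (10, 100, 1_000, 10_000):
    estimate = sample_mean(n)
    error = abs(estimate - true_mean)
    print(f"n = {n:>6}: sample mean = {estimate:.2f}, error = {error:.2f}")
```

The error tends to shrink as n grows, although any single run can fluctuate from one sample size to the next.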


factors that sample size depends on (APEUS)

Acceptable level of confidence

Power of the study

Expected effect size

Underlying event rate in the population

Standard deviation in the population
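To show how several of these factors interact, here is a sketch using one standard normal-approximation formula for comparing two group means, n per group = 2((z₁₋α/₂ + z_power)·σ/δ)². This formula is not taken from the lecture; it is one common choice among several, and the function name and inputs are hypothetical. The underlying event rate matters for studies of proportions and is not covered here.

```python
import math
from statistics import NormalDist

def n_per_group(alpha, power, sd, effect):
    """Approximate sample size per group for comparing two means.

    alpha  -> acceptable level of confidence is 1 - alpha (two-sided)
    power  -> power of the study
    effect -> expected effect size (difference in means)
    sd     -> standard deviation in the population
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

# 95% confidence, 80% power, SD of 10, expected difference of 5.
print(n_per_group(alpha=0.05, power=0.80, sd=10, effect=5))
```

Halving the expected effect size roughly quadruples the required sample size, which is why the effect size has to be stated before the study starts.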


stages of biomedical research 

  1. Planning and organization

  2. Conducting the investigation

  3. Data processing and analyses of results


8 components of research programme 

  1. Aim: summary and
    formulation of the research hypothesis.

  2. Object: event, that is going to be studied.

  3. Units of observation: logical (the studied case) and technical (the environment of the logical unit)

  4. Indices of observation: important measurable factors. They are measurable, additive, and self-controlling.

    1. factorial

    2. resultative

  5. Place

  6. Time

    1. single: studied at single "critical" moments

    2. continuous: show long term tendency of events

  7. Statistical analyses

  8. Methodology


one vs many measurements

many measurements on one subject: get to know the one subject quite well but learn nothing about how the response varies across subjects.

one measurement on many subjects: you learn less about each individual, but you get a good sense of how the response varies across subjects.


 explain paired and unpaired data 

paired: 2+ measurements are made on the same observational unit (subjects, couples)


unpaired: only one type of measurement is made on each unit.
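A sketch of how the two data layouts are typically summarized. All readings are made-up numbers; the before/after design and the two-group design are assumed examples.

```python
import statistics

# Paired: hypothetical before/after readings on the same five subjects.
before = [140, 152, 138, 147, 160]
after = [132, 150, 135, 140, 151]

# Each subject acts as their own control, so work with per-subject differences.
differences = [a - b for b, a in zip(before, after)]
mean_paired_change = statistics.fmean(differences)

# Unpaired: hypothetical readings from two separate groups of subjects.
group_a = [140, 152, 138, 147, 160]
group_b = [131, 149, 136, 142]   # group sizes may differ

# No subject-level matching exists, so only group summaries can be compared.
difference_in_means = statistics.fmean(group_b) - statistics.fmean(group_a)

print(f"paired mean change:   {mean_paired_change:.1f}")
print(f"unpaired mean change: {difference_in_means:.1f}")
```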


describe the parts of the research plan in planning and organisation 

  • Definition of the team responsible for the study and preliminary training.

  • Administration and management of the study.


key components of information processing

  • Data check and correction

  • Data coding

  • Data aggregation

    • according to data use: Primary / Secondary

    • according to the number of indices: Simple / Complex


benefits of data summary

  • become familiar with the data and the characteristics of the sample that you are studying
  • identify problems with data collection or
    errors in the data (data management issues)
  • Range checks for illogical values
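A range check can be as simple as the sketch below. The records, field names, and plausible ranges are illustrative assumptions, not clinical reference limits.

```python
# Hypothetical records and plausible ranges; both are illustrative only.
records = [
    {"id": 1, "age": 34, "systolic_bp": 118},
    {"id": 2, "age": 210, "systolic_bp": 125},   # impossible age
    {"id": 3, "age": 57, "systolic_bp": -5},     # impossible pressure
]
valid_ranges = {"age": (0, 120), "systolic_bp": (50, 300)}

def out_of_range(record):
    """Return the names of fields whose values fall outside the plausible range."""
    return [
        field
        for field, (low, high) in valid_ranges.items()
        if not low <= record[field] <= high
    ]

# Map each problematic record ID to the fields that need checking.
flagged = {r["id"]: out_of_range(r) for r in records if out_of_range(r)}
print(flagged)
```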


Difference between a variable and data

variable: something whose value can vary.

data: values obtained from measuring a variable


categories of variables


Nominal

  • Values fall into arbitrary categories

  • Order of the categories is completely arbitrary/meaningless

  • No units: the data have no units of measurement

Ordinal

  • Values fall into ordered categories

  • Order of the categories is not arbitrary; it is possible to order the categories in a meaningful way

  • No units: the data have no units of measurement


what are the four levels of measurement

  • Nominal is the lowest level. Only names are meaningful here.
    • genotype: there are no numbers, so calculation is meaningless
  • Ordinal adds an order to the names.
    • pain score: order matters, but the difference between scores does not
  • Interval adds meaningful differences.
    • temperature: the difference between 2 points is meaningful
  • Ratio adds a true zero so that ratios are meaningful.
    • height: clear definition of 0, so you can look at the ratio of 2 measurements
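The interval/ratio distinction can be checked numerically. This sketch assumes the temperature example refers to Celsius, which has an arbitrary zero, and compares it with Kelvin, which has a true zero.

```python
def celsius_to_kelvin(c):
    return c + 273.15

a_c, b_c = 20.0, 10.0   # two hypothetical temperatures in Celsius

# Differences are meaningful on an interval scale: same in both units.
diff_c = a_c - b_c
diff_k = celsius_to_kelvin(a_c) - celsius_to_kelvin(b_c)

# Ratios are not: "twice as hot" in Celsius disappears in Kelvin.
ratio_c = a_c / b_c
ratio_k = celsius_to_kelvin(a_c) / celsius_to_kelvin(b_c)

print(f"difference: {diff_c:.1f} vs {diff_k:.1f}")
print(f"ratio:      {ratio_c:.3f} vs {ratio_k:.3f}")
```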


visual methods to summarize data

  • tables
    • frequency table
  • graphs / graphical summary
    • bar chart = categorical data
    • histogram = continuous data
    • boxplot = continuous data
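A sketch of the numbers behind two of these summaries: a frequency table (shown with a text bar chart) for categorical data, and the five-number summary that a boxplot draws for continuous data. The data are made up, and the "inclusive" quartile method is one of several conventions.

```python
from collections import Counter
import statistics

# Categorical data: frequency table with a simple text bar chart.
blood_types = ["A", "O", "B", "O", "A", "O", "AB", "O", "A", "B"]
frequency_table = Counter(blood_types)
for category, count in frequency_table.most_common():
    print(f"{category:<3}{count:>3}  {'#' * count}")

# Continuous data: five-number summary (min, Q1, median, Q3, max).
heights_cm = [150, 155, 160, 162, 165, 168, 170, 172, 175, 180, 190]
q1, median, q3 = statistics.quantiles(heights_cm, n=4, method="inclusive")
five_number_summary = (min(heights_cm), q1, median, q3, max(heights_cm))
print(five_number_summary)
```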