BIOSTATISTICS Flashcards
(50 cards)
What is statistics, and what is its history?
A branch of mathematics concerned with collecting and interpreting data.
Also a tool for prediction and forecasting using data and statistical models.
Thought to have originated in 1662 with John Graunt, then developed further during the 17th century.
What are the 2 kinds of statistics?
descriptive statistics
inferential statistics
descriptive statistics
summarizes data by describing what was observed in the sample, numerically or graphically.
numerical descriptors include the mean and standard deviation for continuous data types (like height or weight)
it explains what was observed
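A minimal sketch of the two numerical descriptors named above, using Python's standard library. The height values are made up for illustration and do not come from any real data set.

```python
import statistics

# Hypothetical sample of heights in cm (illustrative values only).
heights_cm = [162.0, 170.5, 168.0, 175.5, 181.0, 166.0]

# Arithmetic mean of the sample.
mean_height = statistics.mean(heights_cm)

# statistics.stdev uses the sample formula (divides by n - 1).
sd_height = statistics.stdev(heights_cm)

print(f"mean = {mean_height:.2f} cm, sd = {sd_height:.2f} cm")
```

statistics.pstdev would be the choice if the values were a whole population rather than a sample.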
inferential statistics
deals with the generalization of information. Inference is the reasoning by which we go from concrete information, acquired by observing and measuring samples, to general rules that are valid for the whole population.
it explains why
what is applied statistics?
descriptive statistics and inferential statistics put into practice
deductive inference
We hold a theory and, based on it, predict its consequences: we predict what the observation should be.
Inductive inference
we go from the specific to the general. We make many observations to discern a pattern, make a generalization, and infer an explanation.
population
a collection of individuals in which we may be interested and which have something in common.
it is mostly not possible to examine a whole population, so we usually take a sample
sample
a group of individuals taken from a larger population and used to find out something about that population.
it needs to be representative of the population, i.e. to show the population's characteristics in the same proportions
Random sampling
individuals are picked at random
simple random sampling - every individual has the same chance of being picked, e.g. with a random number generator
systematic random sampling - individuals are picked at a regular interval from a sampling frame (e.g. every k-th entry after a random start)
multi-stage sampling - constructed by taking a series of simple random samples in stages.
for example, take a sample of children aged 10-12 years. We divide the population into several hierarchically arranged stages (towns, schools, classes, pupils), then randomly take a few elements from the highest stage (towns), from these randomly choose a few elements from the next lower stage (schools), etc.
stratified random sampling - divide the population into strata (age groups, sex, etc.), then take a random sample from within each stratum in order to obtain a sample that is representative
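A rough sketch of three of these schemes (simple, systematic, stratified) in Python. The sampling frame, the field names, and the function names are all hypothetical, chosen only for illustration. Multi-stage sampling is not covered here.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 100 pupils, alternating between two strata.
frame = [{"id": i, "sex": "girl" if i % 2 == 0 else "boy"} for i in range(100)]

def simple_random_sample(frame, n):
    # Every unit has the same chance; units are drawn without replacement.
    return rng.sample(frame, n)

def systematic_sample(frame, n):
    # Random start, then every step-th unit of the frame.
    step = len(frame) // n
    start = rng.randrange(step)
    return frame[start::step][:n]

def stratified_sample(frame, n, key):
    # Group units by stratum, then sample each stratum in proportion
    # to its size (rounding means the total can differ slightly from n).
    strata = {}
    for unit in frame:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(frame))
        sample.extend(rng.sample(members, k))
    return sample

print([u["id"] for u in simple_random_sample(frame, 10)])
print([u["id"] for u in systematic_sample(frame, 10)])
print([u["id"] for u in stratified_sample(frame, 10, "sex")])
```

The stratified version guarantees that girls and boys appear in the sample in the same proportion as in the frame, which simple random sampling only achieves on average.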
types of data
qualitative - nominal + ordinal
quantitative - interval + ratio
graphical presentation of data
data
the information produced by observation and measurement.
qualitative data
individuals fall into separate classes
nominal data - it is possible to clearly decide whether any 2 objects are the same or different in the surveyed characteristic.
e.g. sex of a person, employment, eye colour, blood group
ordinal data - it is possible to clearly decide whether 2 objects are the same or different in the surveyed characteristic, and also to determine their rank.
e.g. intensity of pain, severity of diabetes mellitus, school classification (grades)
quantitative data
measured on a metric scale: the unit of measurement is determined unambiguously, but the start (zero point) of the scale is not always determined.
- numerical
split into discrete (values are integers, e.g. number of teeth)
and continuous (values can take any number in a range, e.g. time, weight, height)
- ratio data
e.g. the number of cars produced last year, lung capacity, number of blood elements
- interval data
the start (zero point) of the scale is not determined uniquely
e.g. measurement of temperature on the Celsius or Fahrenheit scale
graphical presentation of data
diagrams are a convenient way to convey data, but they can be misleading; they should only be used in addition to numbers, not as a replacement
chart types for graphical presentation of data
scatter plot (relationship between 2 quantitative variables)
line graph (how a variable changes over time)
bar chart (shows absolute + relative freq of values)
age-sex pyramid (distribution of various age groups in pop)
histogram (freq distribution)
pie chart (shows relative freq for each category)
box and whisker plot (shows distance between quartiles)
probability rules
Suppose that two events A, B are mutually exclusive, i.e. when one happens the other cannot happen (symbolically P(A and B) = 0).
Then the probability that one or the other happens is the sum of their probabilities.
Symbolically P(A or B) = P(A) + P(B) (the addition rule).
For example, the throw of a die may show a one or a two, but not both. The probability that it shows a one or a two = 1/6 + 1/6 = 2/6 = 1/3.
Mutually exclusive: cannot happen at the same time.
If A, B are not mutually exclusive, symbolically P(A and B) ≠ 0, then P(A or B) = P(A) + P(B) - P(A and B).
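Both forms of the addition rule can be checked by counting outcomes of a fair die. This is a small sketch; the helper name prob and the second pair of events ("even" and "at most two") are illustrative additions, not part of the original notes.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}  # equally likely outcomes of a fair die

def prob(event):
    # Probability of an event = favourable outcomes / all outcomes.
    return Fraction(len(event & die), len(die))

# Mutually exclusive events: "shows a one" and "shows a two".
one, two = {1}, {2}
p_one_or_two = prob(one | two)
assert prob(one & two) == 0
assert p_one_or_two == prob(one) + prob(two)

# Events that overlap: "even" and "at most two" share the outcome 2.
even, at_most_two = {2, 4, 6}, {1, 2}
p_even_or_small = prob(even | at_most_two)
assert p_even_or_small == prob(even) + prob(at_most_two) - prob(even & at_most_two)

print(p_one_or_two, p_even_or_small)
```

Using Fraction keeps the arithmetic exact, so the two sides of each rule can be compared with ==.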
conditional probability
QUESTION TO TRY
In a random sample of 140 men aged 40-50 years suffering from hypertension, the risk factor "hypercholesterolemia" (event A) was present in 37 patients and the risk factor "smoking" (event B) in 98 patients. 31 patients had both risk factors. Estimate the probabilities of the events A, B, C = (A and B), and D = (A or B). Use relative frequencies.
Estimate the conditional probability of "hypercholesterolemia" (event A) given that "smoking" (event B) occurred, P(A|B).
Verify whether the events "hypercholesterolemia" (event A) and "smoking" (event B) are independent.
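One possible way to check your answers is sketched below. It assumes the usual definitions P(A|B) = P(A and B) / P(B) and "independent means P(A and B) = P(A) P(B)", applied directly to the relative frequencies; the variable names are illustrative. An exact equality check is only a rough guide on sample data, not a formal test of independence.

```python
from fractions import Fraction

n = 140                 # men in the sample
n_a, n_b = 37, 98       # hypercholesterolemia (A), smoking (B)
n_both = 31             # both risk factors

# Relative frequencies as probability estimates.
p_a = Fraction(n_a, n)
p_b = Fraction(n_b, n)
p_c = Fraction(n_both, n)        # C = A and B
p_d = p_a + p_b - p_c            # D = A or B (addition rule)

# Conditional probability of A given B.
p_a_given_b = p_c / p_b

# Independence would require P(A and B) == P(A) * P(B).
independent = (p_c == p_a * p_b)

for name, p in [("P(A)", p_a), ("P(B)", p_b), ("P(C)", p_c),
                ("P(D)", p_d), ("P(A|B)", p_a_given_b)]:
    print(f"{name} = {p} = {float(p):.3f}")
print("independent by exact comparison:", independent)
```

Here P(A) P(B) = 37/200 = 0.185, while P(A and B) = 31/140, about 0.221, so the relative frequencies do not satisfy the independence condition.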
Example 1
Suppose we draw a card from a deck of playing cards.
What is the probability that we draw a spade?
Example 2
Suppose a coin is flipped 3 times. What is the probability of getting two tails and one head?
Solution 1 : The sample space of this experiment consists of 52 cards, and the probability of each sample point is 1/52. Since there are 13 spades in the deck, the probability of drawing a spade is P(Spade) = (13)(1/52) = 1/4
Solution 2 : For this experiment, the sample space consists of 8 sample points.
S = {TTT, TTH, THT, THH, HTT, HTH, HHT, HHH}
Each sample point is equally likely to occur, so the probability of getting any particular sample point is 1/8. The event “getting two tails and one head” consists of the following subset of the sample space.
A = {TTH, THT, HTT}
The probability of Event A is the sum of the probabilities of the sample points in A. Therefore, P(A) = 1/8 + 1/8 + 1/8 = 3/8
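Solution 2 can be reproduced by listing the sample space in code. This is a sketch that assumes a fair coin, so that all eight sample points are equally likely.

```python
from fractions import Fraction
from itertools import product

# All sequences of three flips, each flip being T or H.
sample_space = ["".join(flips) for flips in product("TH", repeat=3)]

# Event A: exactly two tails (and therefore one head).
event_a = [s for s in sample_space if s.count("T") == 2]

# Equally likely sample points: favourable / total.
p_a = Fraction(len(event_a), len(sample_space))

print(sample_space)
print(event_a, p_a)
```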
Example 3
A coin is tossed three times. What is the probability of getting three tails?
Example 4
An urn contains 6 red marbles and 4 blue marbles. Two marbles are drawn without replacement from the urn. What is the probability that both of the marbles are blue?
Solution 3
If you toss a coin three times, there are a total of eight possible outcomes. They are: HHH, HHT, HTH, THH, HTT, THT, TTH, and TTT. Of the eight possible outcomes, one has three tails (TTT). Therefore, the probability of getting three tails is 1/8.
Solution 4 : Let A = the event that the first marble is blue; and let B = the event that the second marble is blue. We know the following:
In the beginning, there are 10 marbles in the urn, 4 of which are blue. Therefore, P(A) = 4/10.
After the first selection, there are 9 marbles in the urn, 3 of which are blue. Therefore, P(B|A) = 3/9.
Therefore, based on the rule of multiplication (for dependent events):
P(A and B) = P(A) P(B|A)
P(A and B) = (4/10)*(3/9) = 12/90 = 2/15
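As a cross-check of Solution 4, the multiplication rule can be compared with a brute-force count of all ordered draws. This is a sketch in which the marbles are labelled by position in a list, a detail introduced only for the illustration.

```python
from fractions import Fraction
from itertools import permutations

urn = ["R"] * 6 + ["B"] * 4  # 6 red and 4 blue marbles

# Rule of multiplication for dependent events: P(A) * P(B|A).
p_rule = Fraction(4, 10) * Fraction(3, 9)

# Enumeration: all ordered pairs of two different marbles (no replacement).
draws = list(permutations(range(len(urn)), 2))
both_blue = [d for d in draws if urn[d[0]] == "B" and urn[d[1]] == "B"]
p_enum = Fraction(len(both_blue), len(draws))

print(p_rule, p_enum)
```

Both approaches count the same thing: 12 favourable ordered pairs out of 90.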
Example 5
A student goes to the library. The probability that she checks out a work of fiction is 0.40, a work of non-fiction is 0.30, and both fiction and non-fiction is 0.20. What is the probability that the student checks out a work of fiction, non-fiction, or both?
Example 6
Of all of Dr. Smith's patients, 20% run every day (event R), 50% drink two glasses of milk each day (event M), and 12% do both. What is the probability that a patient runs every day, given that the patient is known to drink two glasses of milk daily?
Solution 5 :
Let F = the event that the student checks out fiction; and let N = the event that the student checks out non-fiction. Then, based on the rule of addition:
P(F or N) = P(F) + P(N) - P(F and N)
P(F or N) = 0.40 + 0.30 - 0.20 = 0.50
Solution 6 :
P(R|M) = P(R and M)/P(M) = 0.12/0.50 = 0.24
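The two rules used in Solutions 5 and 6 can be written as small helper functions. The function names below are hypothetical, and the percentages are entered as exact fractions to avoid floating-point rounding.

```python
from fractions import Fraction

def p_union(p_a, p_b, p_both):
    """Rule of addition: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

def p_given(p_both, p_condition):
    """Conditional probability: P(A|B) = P(A and B) / P(B)."""
    return p_both / p_condition

# Example 5: fiction 0.40, non-fiction 0.30, both 0.20.
p_f_or_n = p_union(Fraction(40, 100), Fraction(30, 100), Fraction(20, 100))

# Example 6: runs 0.20, milk 0.50, both 0.12.
p_r_given_m = p_given(Fraction(12, 100), Fraction(50, 100))

print(float(p_f_or_n), float(p_r_given_m))
```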
Example 7
A card is drawn randomly from a deck of ordinary playing cards. You win $10 if the card is a spade or an ace. What is the probability that you will win the game?
Solution 7 :
Let S = the event that the card is a spade;
and let A = the event that the card is an ace.
We know the following:
There are 52 cards in the deck.
There are 13 spades, so P(S) = 13/52.
There are 4 aces, so P(A) = 4/52.
There is 1 ace that is also a spade, so P(S ∩ A) = 1/52.
Therefore, based on the rule of addition:
P(S ∪ A) = P(S) + P(A) - P(S ∩ A)
P(S ∪ A) = 13/52 + 4/52 - 1/52 = 16/52 = 4/13
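The same answer can be obtained by building the deck and counting the winning cards directly. This is a sketch assuming a standard 52-card deck; the rank and suit labels are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

# A card wins if it is a spade or an ace; the ace of spades is counted once.
winning = [card for card in deck if card[1] == "spades" or card[0] == "A"]
p_win = Fraction(len(winning), len(deck))

# The rule of addition gives the same value.
p_rule = Fraction(13, 52) + Fraction(4, 52) - Fraction(1, 52)

print(len(winning), p_win, p_rule)
```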