EBM4 Hypothesis testing Flashcards Preview



How did we start this lecture

She started the lecture by showing us a poster from the UK claiming that 3rd generation OCs carry a 100% increased risk of VTE compared with older generations. The resulting scare led to many unwanted pregnancies in the UK. She then showed us the math: the 100% figure overstates the danger because it is only a relative risk increase. In cases like this it is worthwhile to look at the absolute risk increase. She pulled the numbers from the studies, and the absolute risk increase turned out to be quite small, only 0.014%, so about 7000 women would have to take the new generation OCs to cause 1 extra case of VTE (this is the number needed to harm).
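The arithmetic behind this card can be sketched as follows. This is only an illustration: the 0.014% and the doubling of risk come from the card, while the baseline risk is derived from those two figures rather than quoted from the lecture.

```python
# Figures from the card: risk doubles (100% relative increase) and the
# absolute risk increase (ARI) is 0.014%.
absolute_risk_increase = 0.014 / 100   # 0.014% expressed as a proportion
relative_risk = 2.0                    # "twice the risk"

# Derived, not quoted: if new = 2 * baseline and new - baseline = ARI,
# then the baseline risk must equal the ARI.
baseline_risk = absolute_risk_increase / (relative_risk - 1)
new_risk = baseline_risk * relative_risk

relative_risk_increase = (new_risk - baseline_risk) / baseline_risk
number_needed_to_harm = 1 / absolute_risk_increase

print(f"Relative risk increase: {relative_risk_increase:.0%}")
print(f"Absolute risk increase: {absolute_risk_increase:.3%}")
print(f"Number needed to harm: about {number_needed_to_harm:.0f}")
```

The number needed to harm comes out at roughly 7,143, which the lecture rounds to about 7000.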


"Third generation Ocs are associated with twice the risk of VTEs (100% increase) compared to older products". Identify what aspects of EBM are being discussed here

"Twice the risk" is a risk ratio (relative risk) of 2, and the "100% increase" is the relative risk increase.


What are the two kinds of variables

Dependent and independent variables, i.e. the outcome (dependent) and the exposure (independent).


Know how to state null hypothesis in terms of the context of a specific study.

We can also state a null hypothesis for a continuous variable, in which case it is about the mean of a population (such as mean age or mean BMI).


Why do we make a 2-sided alternative hypothesis

To allow for a difference in either direction from the null hypothesis.


What are parametric methods

Parametric methods require us to make assumptions about the underlying distribution of the data (such as a normal, bell-shaped distribution and a sufficiently large sample size). Two statistical tools that use parametric methods are:
1. T tests
2. ANOVA


What is Chi squared test used for

The chi-squared test is used to compare proportions between two or more groups. It is for categorical data only (examples: blood type, state of residence, ethnicity).
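A minimal sketch of a chi-squared test on a 2x2 table, using only the Python standard library. The function name and the counts are made up for illustration, and no continuity correction is applied.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Chi-squared statistic and p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are outcome yes / outcome no.
    """
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the proportions were the same in both groups.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, chi-squared is a squared Z score, so the
    # p-value is the two-sided normal tail area beyond sqrt(stat).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: 30/100 events in group A versus 15/100 in group B.
stat, p = chi_square_2x2(30, 70, 15, 85)
print(f"chi-squared = {stat:.2f}, p = {p:.4f}")
```

The shortcut used for the p-value only works for a 2x2 table (1 degree of freedom); larger tables need the chi-squared distribution with more degrees of freedom.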


What is t test and ANOVA used for

The t test is used to compare the mean values of two (and only 2) groups. It is used for numerical (continuous) data only.
ANOVA stands for analysis of variance. It is used to compare mean values across more than 2 groups.
ANOVA can do what the t test can, but the t test cannot do what ANOVA can.
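As a rough illustration of the two-group case, the t statistic can be computed by hand. This sketch uses the Welch (unequal variance) form, which is an assumption on my part since the lecture did not say which version was meant, and the BMI values are made up. Turning the statistic into a p-value needs the t distribution, which is not shown here.

```python
import math
import statistics

def welch_t_statistic(group_a, group_b):
    """Difference in means divided by its standard error (Welch form)."""
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    var_a = statistics.variance(group_a)   # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error

# Hypothetical BMI values in two treatment groups.
treated = [24.1, 26.3, 25.0, 27.2, 23.8, 25.5]
control = [27.9, 28.4, 26.7, 29.1, 27.3, 28.8]
print(f"t = {welch_t_statistic(treated, control):.2f}")
```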


Factors that go into 0.05

0.05 is not a “magic” number; we must also take into account the measure of association.


What is Z score

A Z score is the number of standard deviations a value lies from the mean.

All of these tests produce a test statistic: a quantitative estimate of the exposure-outcome relationship between 2 groups.


What are the graphs of Z scores

She also showed us a graph of Z scores and their bell-shaped distribution. She then pointed out that if the test statistic is less than or equal to -1.96 or greater than or equal to 1.96 (i.e. it lies outside the range -1.96 to 1.96), then the result is statistically significant.

Another way to say this is that if the test statistic is beyond the critical values (-1.96 and 1.96), then p < 0.05.
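The link between the critical value 1.96 and the 0.05 cutoff can be checked with the standard normal distribution. A small sketch (the function name is just for illustration):

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Area in both tails of the standard normal curve beyond |z|."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Z score that leaves 2.5% in each tail, i.e. 5% in total.
critical_value = NormalDist().inv_cdf(0.975)

print(f"critical value = {critical_value:.2f}")
print(f"p at z = 1.96: {two_sided_p_value(1.96):.3f}")
print(f"p at z = 1.00: {two_sided_p_value(1.00):.3f}")
```

A Z score of 1.96 in either direction gives a p value of about 0.05, and anything closer to zero gives a larger p value.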


What does the p value tell us

A p-value doesn’t tell us if the null hypothesis is correct or not. A low p-value means that the data collected are not very consistent with the null hypothesis. A high p-value indicates that the data are reasonably consistent with the null hypothesis. Remember that a p-value is calculated on the assumption that the null hypothesis is true.


Define P value

The probability that an association at least as strong as the one observed (i.e. as strong or stronger) might have arisen by chance alone, if the null hypothesis were true.

In other words this means that the p value is a measure of relative consistency between the null hypothesis and the data collected.

NOTE: The p value is not the probability that the null hypothesis is correct.


What happens when the p value is small

If the p value is small (below the chosen alpha, usually 0.05), we can reject the null hypothesis and consider the findings of the experiment statistically significant.


What are chi squared test and t tests used for

The chi-squared test is for categorical data and the t test is for numerical data.


What is random error

Error due to chance


What is systematic error

Error due to bias


Fail to reject null hypothesis:

Reject null hypothesis:

Fail to reject null hypothesis: Conclude that the data do not show a statistically significant difference between the 2 treatment groups (this is not proof that no difference exists)
Reject null hypothesis: Conclude that the data are not consistent with the null hypothesis and favor an alternative hypothesis


Define type I and type II error. What should be done to reduce each of these errors

Reduce type I error by replicating studies; reduce type II error by carefully designing the study and increasing study power (which means a bigger sample size).

Type I error is when there is truly no association between the exposure and the outcome, but we still find one and reject the null hypothesis. The probability of this error is called alpha. The alpha value has to be decided before the test.
Alpha is almost always 0.05, which is the threshold the p value is compared against.
Type I error: You say groups are different, but in reality they are the same.

Type II error is when we fail to reject the null hypothesis even though there truly is an association between the exposure and the outcome. The probability of this error is called beta.
Type II error occurs due to inadequate study power. Study power is tied to study size, so we are more likely to see a type II error if the sample is not big enough.
It can also occur due to faulty methodology or chance.
To avoid these errors the study design should be made very carefully.


Why is beta so much bigger than alpha?

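Because beta changes from study to study (it depends on the sample size and the size of the difference), whereas alpha is fixed in advance and stays the same.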


Study power

It is the ability of a study to detect a true difference between groups. This is calculated as 1-beta so if beta is 20% then the power will be 80% which means there is an 80% chance of detecting specific differences between the treatment groups.


How do we calculate sample size for a study

1. Determine the study design, measure of frequency and measure of effect.
2. Set the alpha and beta levels.
3. Determine the magnitude of difference in outcomes between the 2 groups that the study will be designed to detect.
4. Calculate the sample size.

Usually this involves looking at older studies to find some of these numbers.
Important things to note:
The larger the difference between the control and exposed groups, the smaller the sample size we need.
If alpha is made smaller, then the sample size has to be larger.
For beta, think of it in terms of study power: the higher the power we want (the smaller the beta), the larger the sample size we need.
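The relationships above can be seen in a rough sample size calculation. This sketch assumes the outcome is a proportion compared between two equal-sized groups and uses the common normal-approximation formula; the lecture did not give a formula, so the function and the example risks are illustrative only.

```python
import math
from statistics import NormalDist

def sample_size_per_group(p_control, p_exposed, alpha=0.05, power=0.80):
    """Approximate participants needed per group to compare two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 when alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # power = 1 - beta
    variance_sum = p_control * (1 - p_control) + p_exposed * (1 - p_exposed)
    difference = p_control - p_exposed
    n = (z_alpha + z_beta) ** 2 * variance_sum / difference ** 2
    return math.ceil(n)

# Hypothetical outcome risks: 10% in controls versus 20% in the exposed.
print(sample_size_per_group(0.10, 0.20))               # baseline scenario
print(sample_size_per_group(0.10, 0.30))               # bigger difference
print(sample_size_per_group(0.10, 0.20, alpha=0.01))   # stricter alpha
print(sample_size_per_group(0.10, 0.20, power=0.70))   # lower power
```

Compared with the baseline scenario, the bigger difference and the lower power need fewer participants, and the stricter alpha needs more.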


4 steps we do in the process of hypothesis testing

1. Measure of frequency comparing the outcome in 2 groups
2. P value
3. Point estimate
4. Confidence interval


Details about the Chi square test

Chi-square test statistics are always positive (due to squaring), but tests are still 2-tailed. Example interpretation of p = 0.03: given that the null hypothesis is true (there is no difference between the two groups), the probability that an association at least as strong as that observed might have arisen by chance is 3%.

When you conduct a chi-square test, you generate a test statistic that is calculated given an underlying chi-square distribution. The p-value corresponding to the test statistic can be found on a look-up table in any statistics text book.


Point estimates and confidence interval

The point estimate is the single best numerical estimate of effect from a set of data. The point estimate always lies inside its own confidence interval; what the 95% refers to is how often such an interval captures the true population value.
The results we get in any study do not perfectly mirror the overall population, and the confidence interval gives us a better idea of what the result in the overall population might be. If the trial were repeated many times, about 95% of the confidence intervals calculated would contain the true value.


If we can get only 1 of the p value, point estimate and CI then which one should we choose?

It should be the CI, since it allows us to do the hypothesis testing. It gives us a range of plausible values around the point estimate, and it also tells us roughly what the p value is (if the 95% CI excludes the null value, then p < 0.05).


What happens to the CI when we increase the sample size? What is the equation?

The width of the confidence interval decreases with increasing sample size (n). This is because the standard error (the standard deviation divided by the square root of n) shrinks as n grows. Know the equation for the 95% confidence interval of a mean: mean ± 1.96 × (standard deviation / √n).
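A quick sketch of how the interval narrows as n grows, using the equation on this card. The mean and standard deviation are made-up numbers for illustration.

```python
import math

def confidence_interval_95(mean, sd, n):
    """95% CI for a mean: mean +/- 1.96 * (sd / sqrt(n))."""
    margin = 1.96 * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Hypothetical sample: mean BMI of 25 with a standard deviation of 4.
for n in (25, 100, 400):
    lower, upper = confidence_interval_95(25.0, 4.0, n)
    print(f"n = {n}: {lower:.2f} to {upper:.2f} (width {upper - lower:.2f})")
```

Because n sits under a square root, the sample has to be four times bigger to cut the width of the interval in half.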


Central limit theorem

The more people we get into the study, the closer the distribution of the sample mean gets to a normal distribution, even if the underlying data are not normally distributed.


Analyze what you randomize

Intention-to-treat: “Analyze what you randomize” and leave patients in their original randomization group.
Know the concept of analyze what you randomize: analyze participants in whatever group they were assigned to, regardless of follow-up. Somehow this is also associated with the NNT.

I think this basically means that we should include all participants in the analysis regardless of follow-up. This increases generalizability, and the results correlate better with the real world.


What are the 3 different kinds of analysis

Simple analysis is the analysis of the crude data to examine the relationship between the exposure and the outcome only. Stratified analysis is where we analyze exposure and outcome within subgroups. Adjusted analysis is where we take into account the effects of other factors such as age, race and sex and then determine the exposure-outcome relationship. In adjusted analysis we control for confounding.

Adjusted analysis is often referred to as multivariate analysis.


What is confounding

Confounding is one type of systematic error.
Confounding basically means that there is an external factor, associated with both the exposure and the outcome, that distorts the exposure-outcome relationship.
Remember the coffee and smoking study, which was done to find out if coffee leads to pancreatic cancer. Confounding can create false associations between the exposure and the outcome, or it can mask true associations (it works both ways).
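To see how a confounder can create a false association, here is a sketch comparing a crude risk ratio with risk ratios stratified by smoking. All counts are invented for illustration and are not from the actual coffee study.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

# Hypothetical counts per stratum:
# (coffee cases, coffee total, no-coffee cases, no-coffee total)
strata = {
    "smokers": (80, 800, 20, 200),
    "non-smokers": (2, 200, 8, 800),
}

# Simple (crude) analysis: pool everyone and ignore smoking.
crude_counts = [sum(values) for values in zip(*strata.values())]
crude_rr = risk_ratio(*crude_counts)

# Stratified analysis: look at smokers and non-smokers separately.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

print(f"crude risk ratio: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"risk ratio in {name}: {rr:.2f}")
```

In these invented numbers coffee looks harmful in the crude data, but within each smoking stratum the risk is identical for coffee drinkers and non-drinkers. The apparent effect comes entirely from coffee drinkers being more likely to smoke.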


What is the clinical significance of the p value

A statistically significant difference, no matter how small the p-value, does not mean that the difference is clinically important. A small p-value (e.g., p