Flashcards in Chapter 7 - Hypothesis Testing (One-Sample) Deck (18):

1

## Hypothesis Testing

- framework for making decisions using probabilistic methods

- provides a uniform decision-making criterion

- one-sample = hypotheses are specified about a single distribution

- two-sample problem = two different distributions are compared

2

## Type I error

- probability of rejecting the null hypothesis when H0 is true

- this probability, denoted by α, is referred to as the significance level of a test

- a test based on the sample mean has the highest power among all tests with a given type I error

3

## Type II error

- probability of accepting the null hypothesis when H1 is true

- denoted by β

- function of the population mean

4

## Acceptance region

5

## Rejection region

6

## One-sided test

7

## Critical value method

8

## p-value method

- the p-value is the significance level at which the given value of the test statistic is on the borderline between the acceptance and rejection regions

- can also be thought of as the probability of obtaining a test statistic as extreme as or more extreme than the actual test statistic obtained, given that the null hypothesis is true
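
Both descriptions of the p-value can be checked numerically. The sketch below assumes a one-sample z test with a known standard deviation and a lower-tailed alternative. The function name `one_sided_p_value` and the example numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_p_value(sample_mean, null_mean, sigma, n, alternative="less"):
    """p-value of a one-sample z test, assuming sigma is known."""
    # Standardized test statistic under H0.
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "less":
        # Probability of a statistic as small as or smaller than z under H0.
        return std_normal.cdf(z)
    if alternative == "greater":
        # Probability of a statistic as large as or larger than z under H0.
        return 1.0 - std_normal.cdf(z)
    raise ValueError("alternative must be 'less' or 'greater'")


# Hypothetical data: H0 mean 120, observed mean 115, sigma 24, n = 100.
z_observed = (115.0 - 120.0) / (24.0 / sqrt(100))
p = one_sided_p_value(115.0, 120.0, 24.0, 100, alternative="less")

# Borderline property: at significance level p, the lower critical value
# of the test coincides with the observed statistic.
borderline_critical_value = NormalDist().inv_cdf(p)
print(p, z_observed, borderline_critical_value)
```

Choosing any significance level above `p` puts `z_observed` in the rejection region, and any level below `p` puts it in the acceptance region.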

9

## Statistical significance of a p-value

- if 0.01 ≤ p < 0.05, then the results are significant

- if 0.001 ≤ p < 0.01, then the results are highly significant

- if p < 0.001, then the results are very highly significant

- if p > 0.05, then the results are considered not statistically significant

- however, if 0.05 < p < 0.10, then a trend towards statistical significance is sometimes noted
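
These conventions amount to a simple lookup. The helper below is a minimal sketch, assuming the conventional cutoffs 0.001, 0.01, 0.05, and 0.10. The name `describe_p_value` is hypothetical, and assigning a p-value of exactly 0.05 to the "trend" band is an assumption, since the card does not say which side of the boundary it belongs to.

```python
def describe_p_value(p):
    """Map a p-value to the conventional verbal label (illustrative only)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.001:
        return "very highly significant"
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "significant"
    if p < 0.10:
        # Assumption: p == 0.05 exactly is treated as falling in this band.
        return "not significant (trend toward significance)"
    return "not significant"


for example_p in (0.0004, 0.004, 0.03, 0.07, 0.4):
    print(example_p, "->", describe_p_value(example_p))
```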

10

## Statistical vs. scientific significance

- the results of a study can be statistically significant but still not be scientifically important

- e.g., a small difference can be statistically significant simply because the sample size is large

- conversely, some statistically nonsignificant results can be scientifically important

- such results can encourage researchers to perform larger studies

11

## Two-tailed test / Two-sided test

- the values of the parameter being studied under the alternative hypothesis are allowed to be either greater than or less than the values of the parameter under the null hypothesis

- the type I error is divided evenly between lower and upper rejection regions
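
To make the even split of the type I error concrete, here is a minimal sketch of a two-sided one-sample z test. It assumes the standard deviation is known, which is a simplification. The function name `two_sided_z_test` and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, null_mean, sigma, n, alpha=0.05):
    """Two-sided one-sample z test, assuming sigma is known."""
    std_normal = NormalDist()
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # alpha/2 goes in each tail, so the critical values are symmetric.
    upper = std_normal.inv_cdf(1 - alpha / 2)
    lower = -upper
    # Two-sided p-value: twice the tail area beyond |z|.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    reject = z < lower or z > upper
    return z, (lower, upper), p_value, reject


# Hypothetical data: H0 mean 120, observed mean 115, sigma 24, n = 100.
z, (lower, upper), p_value, reject = two_sided_z_test(115.0, 120.0, 24.0, 100)
print(z, lower, upper, p_value, reject)
```

With `alpha=0.05` each rejection region carries probability 0.025 under the null hypothesis.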

12

## p-value for two-sided tests

13

## Decision between one-sided and two-sided

- it is easier to reject the null hypothesis using a one-sided test than using a two-sided test

- a two-sided test can be more conservative because it is not necessary to guess the appropriate side

- in some cases, only alternatives on one side of the null mean are of interest or are possible

- a one-sided test has more power than a two-sided test, provided the true alternative lies on the hypothesized side

- do not change from a two-sided to a one-sided test after looking at data

14

## Relationship between confidence intervals and two-sided tests

- H0 is rejected with a two-sided test if and only if the two-sided CI for the mean does not contain the null mean

- H0 is accepted with a two-sided test if and only if the two-sided CI for the mean does contain the null mean
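
The equivalence can be checked numerically. The sketch below assumes a z-based interval with a known standard deviation, and a test at level alpha paired with a 100 × (1 − alpha)% CI. The function name `ci_and_test` and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ci_and_test(sample_mean, null_mean, sigma, n, alpha=0.05):
    """Return the two-sided CI, the test decision, and whether the CI covers the null mean."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)  # standard error of the sample mean
    ci = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    reject = abs((sample_mean - null_mean) / se) > z_crit
    null_in_ci = ci[0] <= null_mean <= ci[1]
    return ci, reject, null_in_ci


# Hypothetical data: observed mean 115, sigma 24, n = 100, two candidate null means.
for null_mean in (120.0, 118.0):
    ci, reject, null_in_ci = ci_and_test(115.0, null_mean, 24.0, 100)
    # Rejecting H0 and the CI containing the null mean never happen together.
    print(null_mean, ci, reject, null_in_ci)
```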

15

## Power of a test

- the calculation of power is used to plan a study, usually before any data have been obtained (exception = pilot study)

- a projected value of the standard deviation is used, since there are usually no data from which to estimate it

- the standard deviation is therefore assumed to be known, and power calculations are based on the one-sample z test
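
A power calculation of this kind can be sketched as follows, assuming a one-sided one-sample z test with a projected standard deviation that is treated as known. The function name `power_one_sided_z` and the planning values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(null_mean, alt_mean, sigma, n, alpha=0.05):
    """Power of a one-sided one-sample z test: P(reject H0 | mean = alt_mean)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # one-sided critical value
    # Distance between the means, measured in standard errors.
    shift = abs(null_mean - alt_mean) * sqrt(n) / sigma
    return std_normal.cdf(shift - z_crit)


# Hypothetical planning values: H0 mean 120, alternative 115, sigma 24, n = 100.
planned_power = power_one_sided_z(120.0, 115.0, 24.0, 100, alpha=0.05)
type_ii_error = 1 - planned_power  # beta at this particular alternative mean
print(planned_power, type_ii_error)
```

When the alternative mean equals the null mean, the function returns alpha, which is a quick sanity check on the formula.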

16

## Factors affecting the power of a test

- if the significance level is made smaller, the power decreases

- if the alternative mean is shifted farther away from the null mean, then the power increases

- if the standard deviation of the distribution of individual observations increases, then the power decreases

- if the sample size increases, then the power increases
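
Each of the four effects can be seen by changing one input at a time. The sketch below assumes the power formula for a one-sided one-sample z test with known standard deviation. The baseline values and the compact helper `power` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power(null_mean, alt_mean, sigma, n, alpha):
    """Power of a one-sided one-sample z test (sigma assumed known)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(abs(null_mean - alt_mean) * sqrt(n) / sigma - z_crit)


# Hypothetical baseline scenario; each variant changes exactly one factor.
baseline = dict(null_mean=120.0, alt_mean=115.0, sigma=24.0, n=100, alpha=0.05)
base_power = power(**baseline)
smaller_alpha = power(**{**baseline, "alpha": 0.01})   # power goes down
farther_alt = power(**{**baseline, "alt_mean": 110.0})  # power goes up
larger_sigma = power(**{**baseline, "sigma": 30.0})     # power goes down
larger_n = power(**{**baseline, "n": 200})              # power goes up

print(base_power, smaller_alpha, farther_alt, larger_sigma, larger_n)
```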

17

## Factors affecting the required sample size

- the sample size increases as the variance increases

- the sample size increases as the significance level decreases

- the sample size increases as the required power increases

- the sample size decreases as the absolute value of the distance between the null and alternative mean increases
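
These relationships follow from the usual sample-size formula for a one-sample z test, n = σ²(z₁₋α + z₁₋β)² / (μ0 − μ1)², with α replaced by α/2 for a two-sided test. The sketch below assumes that formula and a known standard deviation. The function name `required_sample_size` and the planning values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(null_mean, alt_mean, sigma, alpha=0.05,
                         power=0.80, two_sided=False):
    """Smallest n giving at least the requested power for a one-sample z test."""
    std_normal = NormalDist()
    tail_alpha = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail_alpha)
    z_beta = std_normal.inv_cdf(power)  # power = 1 - beta
    n = (sigma * (z_alpha + z_beta) / (null_mean - alt_mean)) ** 2
    return ceil(n)  # round up so the power target is met


# Hypothetical planning values: H0 mean 120, alternative 115, sigma 24.
n_base = required_sample_size(120.0, 115.0, 24.0)
n_more_variance = required_sample_size(120.0, 115.0, 30.0)
n_smaller_alpha = required_sample_size(120.0, 115.0, 24.0, alpha=0.01)
n_more_power = required_sample_size(120.0, 115.0, 24.0, power=0.90)
n_larger_distance = required_sample_size(120.0, 110.0, 24.0)
print(n_base, n_more_variance, n_smaller_alpha, n_more_power, n_larger_distance)
```

The two-sided version always asks for at least as many observations as the one-sided version at the same alpha, because the critical value is larger.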

18