Lecture notes 2 Flashcards

(27 cards)

1
Q

What is a Type I error?

A

Rejecting a true null hypothesis. If the null hypothesis is true, then α × 100% of the time we will make a Type I error, assuming no questionable research practices or p-hacking.
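A minimal simulation sketch (my addition, not from the lecture notes, assuming NumPy/SciPy) illustrating that claim: when the null hypothesis is true, roughly α × 100% of tests reject it.

    import numpy as np
    from scipy.stats import ttest_1samp

    rng = np.random.default_rng(0)
    alpha, n, reps = 0.05, 25, 20_000

    # Generate data with the null hypothesis (mu = 0) exactly true.
    data = rng.normal(loc=0.0, scale=1.0, size=(reps, n))
    p_values = ttest_1samp(data, popmean=0.0, axis=1).pvalue

    # Proportion of false rejections: close to alpha (here, about .05).
    print((p_values <= alpha).mean())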

2
Q

What is a Type II error?

A

Retaining a false null hypothesis. If the null hypothesis is false, then β × 100% of the time we will make a Type II error, assuming no questionable research practices or p-hacking.

3
Q

What is statistical power?

A

The probability of correctly rejecting a false null hypothesis (1 − β).

4
Q

What is a Type S error?

A

Rejecting a false null hypothesis but getting the direction or sign wrong. For instance, concluding that the treatment is harmful when in reality it helps. This type of error is technically a subset of statistical power (Gelman and Carlin, 2014).

5
Q

What is a Type M error?

A

Error in the estimation of the magnitude of the effect — e.g., getting the effect size wrong. You can think of this as the bias in the effect size estimate.
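A rough sketch of both Type S and Type M errors, loosely in the spirit of Gelman and Carlin's (2014) retrodesign idea. This is my addition, not the lecture notes' method; the NumPy/SciPy code, the function name, and the example numbers (a small true effect measured with a large standard error) are illustrative assumptions.

    import numpy as np
    from scipy.stats import norm

    def retrodesign(true_effect, se, alpha=0.05, reps=100_000, seed=0):
        """Simulate power, the Type S rate, and Type M exaggeration for a z-test."""
        rng = np.random.default_rng(seed)
        z_crit = norm.ppf(1 - alpha / 2)
        estimates = rng.normal(true_effect, se, size=reps)   # sampling distribution
        significant = np.abs(estimates) > z_crit * se        # which samples reject H0
        power = significant.mean()
        # Type S: among significant results, how often is the sign wrong?
        type_s = (np.sign(estimates[significant]) != np.sign(true_effect)).mean()
        # Type M: among significant results, how exaggerated is the magnitude?
        type_m = np.abs(estimates[significant]).mean() / abs(true_effect)
        return power, type_s, type_m

    # Small true effect, noisy estimate: low power, non-trivial sign errors,
    # and a large exaggeration of the effect size among significant results.
    print(retrodesign(true_effect=0.1, se=0.3))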

6
Q

What is a Type C error?

A

A generalization of the Type I error.

The Type C error rate is the proportion of the time that the confidence interval does not cover its corresponding true parameter; a Type C error is a coverage error. This error term and definition are my creation and not (yet) generally accepted. Example: you observe X̄ = 4.5 with 95% CI [3.8, 5.2]. If μ = 5.5, then you have made a Type C error. If the null hypothesis is true, a Type I error will always also result in a Type C error.
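A small coverage-check sketch (my addition, assuming NumPy/SciPy and hypothetical population values): the Type C error rate is the long-run proportion of confidence intervals that miss the true parameter, which should sit near 5% for a 95% CI.

    import numpy as np
    from scipy.stats import t

    rng = np.random.default_rng(0)
    mu, sigma, n, reps = 5.5, 2.0, 20, 50_000   # hypothetical population and design

    data = rng.normal(mu, sigma, size=(reps, n))
    means = data.mean(axis=1)
    ses = data.std(axis=1, ddof=1) / np.sqrt(n)
    half_width = t.ppf(0.975, df=n - 1) * ses   # 95% CI half-width

    missed = (mu < means - half_width) | (mu > means + half_width)
    print(missed.mean())   # close to .05: the Type C (coverage) error rate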

7
Q

What is a Type 4 error?

A

An incorrect interpretation of a correctly rejected null hypothesis (Levin and Marascuilo, 1972).

8
Q

What are the 4 components of inference?

A
  1. Sample size (n)
  2. Significance criterion (α)
  3. Effect size (e.g., δ or ρ)
  4. Statistical power (1 − β)
9
Q

What does statistical inference boil down to? What is the initial assumption it is based on? What is the focus of power analysis?

What is the power of a statistical test?

How is power calculated?

A

Statistical inference boils down to a specified null hypothesis that is tested with a set significance criterion (α). Here we consider a very simple independent groups t-test with a treatment (t) and a control condition (c).

H0: μt − μc = 0
H1: μt − μc ≠ 0

This entire process is based on the initial assumption that the null hypothesis is true. If the null hypothesis is false, there is no guarantee that a sample result will be statistically significant. This is the focus of power analysis.

The power of a statistical test is the long-term (over many tests) probability that a null hypothesis will be rejected at a given sample size, α, and effect size. That is, the probability that a result will be found given that an effect of a specified size actually exists.

Power is calculated under standard statistical theory: if any three of the four components of inference are known, the fourth can be exactly specified.

We normally keep the significance level set at α = .05. If we set our desired power at .80 (the level recommended by Cohen (1988) and what grant review panels will often require), and then estimate our effect size or set it at a small or medium effect size, we can then determine the sample size required to achieve that level of power.
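As an illustration of that "know three, solve for the fourth" logic, here is a short sketch using statsmodels (a library choice of mine, not mentioned in the notes): fix α = .05, power = .80, and a medium standardized effect, then solve for the per-group sample size of an independent-groups t-test.

    from statsmodels.stats.power import TTestIndPower

    n_per_group = TTestIndPower().solve_power(effect_size=0.5,   # Cohen's d (medium)
                                              alpha=0.05,
                                              power=0.80,
                                              alternative='two-sided')
    print(n_per_group)   # roughly 64 participants per group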

10
Q

What is the level of power recommended in psych?

A

.80

11
Q

How is power impacted by a retrospective study?

A

In a retrospective (post hoc) power analysis, given an initial nonsignificant result, we estimate the effect size from the study and then calculate the statistical power to detect an effect of that size at our current sample size. This is not recommended and does not tell you what statistical power you actually had: retrospective power is a monotonic function of your observed p-value. For instance, if you observe a p-value of exactly .05, a retrospective power analysis will tell you that power was exactly .50. For more details on this see O'Keefe (2007).
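A small sketch (my addition, assuming SciPy and a two-sided z-test) of why retrospective power is just a re-expression of the observed p-value: it plugs the observed effect back in as if it were the true effect.

    from scipy.stats import norm

    alpha = 0.05
    z_crit = norm.ppf(1 - alpha / 2)            # 1.96 for a two-sided test

    def observed_power(p_value):
        """Power evaluated at the effect size implied by the observed p-value."""
        z_obs = norm.ppf(1 - p_value / 2)       # test statistic implied by p
        # Chance of clearing the critical value again if the true effect equals
        # the observed one (ignoring the negligible contribution of the other tail).
        return norm.sf(z_crit - z_obs)

    print(observed_power(0.05))   # exactly .50, as stated above
    print(observed_power(0.20))   # a larger p gives lower retrospective "power"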

12
Q

What is power analysis used for?

A

Power analysis is used in the planning stages of a study to ensure adequate power for finding an effect of a given size. This is definitely a good thing. Later this term we will spend more time examining how to conduct power analyses for future studies.

13
Q

Explain the Central Limit theorem. When is it used? What are the 3 main points of the central limit theorem?

A

When a population is large (e.g., all UBC students), we can reach a number of conclusions about the sampling distribution of the sample mean, even though it is not possible to take all possible samples. How can this be? The answer is the Central Limit Theorem—from this
we know the mean, standard deviation, and the form of the sampling distribution of sample means, regardless of the form of the distribution of raw scores. The CLT is also applicable to other statistics such as the unstandardized regression coefficient and mean difference but here we will focus on the mean.

  • The mean of the sampling distribution of sample means is unbiased and E(X̄) = μ. This states that the expected value of the sample mean is equal to the population mean.
  • The standard deviation of the sampling distribution of sample means—known as the standard error of the mean—is σ_X̄ = σ/√n, where n is the sample size for the mean and σ is the standard deviation of the observations in the population.
  • The form of the sampling distribution is normal. More precisely, if the original distribution of individual observations is normal, the sampling distribution of the means is normal; otherwise, as sample size (n) increases, the sampling distribution → normal. This approach to normality happens very quickly and, for smaller sample sizes than you should ever use in reality, normality is a very, very, very good approximation to the form of the sampling distribution.
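A minimal simulation sketch (my addition, assuming NumPy and hypothetical population values) of the three claims above for the sample mean: E(X̄) = μ, standard error σ/√n, and approximate normality even when the raw scores are skewed.

    import numpy as np

    rng = np.random.default_rng(1)
    mu, sigma, n, reps = 10.0, 4.0, 30, 100_000   # hypothetical population and n

    # Draw many samples from a decidedly non-normal (skewed gamma) population
    # whose mean is mu and standard deviation is sigma.
    samples = rng.gamma(shape=mu**2 / sigma**2, scale=sigma**2 / mu, size=(reps, n))
    means = samples.mean(axis=1)                  # simulated sampling distribution

    print(means.mean())                           # close to mu (unbiasedness)
    print(means.std(), sigma / np.sqrt(n))        # close to sigma / sqrt(n)
    # A histogram of `means` would look very close to normal despite the skew
    # in the raw scores, illustrating the third point above.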
14
Q

What is the p-value?

A

The probability of the observed data, or more extreme data, under the null hypothesis sampling distribution. In other words, if the null hypothesis is true, what is the probability of the observed statistic (or one more extreme)?
Formally, under null hypothesis significance testing, smaller p-values do not provide stronger evidence against the null hypothesis. In other words, p = .049 is considered the same as p = .0001, as both lead to the same conclusion (reject the null hypothesis). We don't more forcefully reject the null! This dichotomous perspective (our only choices are to either retain or reject the null hypothesis) ignores the estimated effect size and the precision associated with that effect size.

15
Q

What is the sampling distribution?

A

The sampling distribution is the distribution of a statistic (e.g., mean, standard deviation, etc.) calculated from all possible distinct random samples of the same size n drawn from the population. We often work with theoretical sampling distributions, where we are able to determine from theory the shape and nature of the sampling distribution under the null hypothesis (i.e., given that the null hypothesis is true).

16
Q

What is the expected value?

A

The mean of the sampling distribution.

17
Q

What is the standard error?

A

The standard deviation of the sampling distribution.

18
Q

What are the three types of hypotheses considered in a scientific experiment?

A

The scientific hypothesis, the null hypothesis, and the alternative hypothesis.

19
Q

What is a scientific hypothesis?

A
The scientific hypothesis is the relationship or effect that the researcher expects to find. This may come from theory, clinical experience, prior research, or other sources. This hypothesis is the simple verbal explanation of the effect or finding that you expect. Examples:

– UBC undergrads have IQs above the mean Canadian population.

– Boys are more aggressive than girls.

– A new therapy technique will lead to lower depression scores than the standard therapy technique.

In order to use the indirect route for hypothesis testing, we must restate the scientific hypothesis into a null hypothesis and an alternative hypothesis.

20
Q

What is the null hypothesis?

A

This is the hypothesis that is to be formally tested and is often symbolized as H0.

This hypothesis must involve a prediction about the exact value of a population parameter (or the difference between population parameters, ratio of parameters, etc.). We need to do this in order to make a probability statement concerning the sample result and to construct and use a theoretical sampling distribution. The statement is assumed (hypothesized) to be true in the population and is always phrased as a statement which contains an equality symbol (≤, ≥, or =).

21
Q

What are examples of correct H0s?

What are examples of incorrect or invalid H0s?

A

Correct:
1. H0: μ = 120
2. H0: μ ≤ 100
3. H0: σ = 11
4. H0: μ1 − μ2 = 0

Incorrect:
1. H0: μ = 10, 11, or 12
2. H0: μ > 100
3. H0: s = 11

22
Q

What is an alternative hypothesis?

A

Symbolized as H1 or Ha, this is the hypothesis which will be accepted if H0 is rejected. The alternative hypothesis is always phrased as a statement about the population parameters which contains an inequality symbol (<, >, or ≠). Correct examples of H1 would be H1: μ < 120 and H1: σ ≠ 11.

Query: Why is the null and alternative hypothesis pair H0: σ = 0 and H1: σ > 0 valid? The answer is that because the population standard deviation σ cannot be negative, this pair of null and alternative hypotheses is mutually exclusive and exhaustive.
The Null and Alternative Hypotheses form a pair and together must be mutually exclusive and exhaustive. There is no overlap between the two hypotheses and together they cover every single possibility.

23
Q

What are examples of correct H0 and H1 pairs?

A
  1. H0: μ = 120 and H1: μ ≠ 120
  2. H0: μ1 − μ2 = 0 and H1: μ1 − μ2 ≠ 0
  3. H0: σ ≤ 10 and H1: σ > 10
  4. H0: σ = 0 and H1: σ > 0
24
Q

What are the regions of retention and rejection?

A

The region of retention is the region of observed sample means where you would retain the null hypothesis. The region of rejection is the region of observed sample means that would lead you to reject the null hypothesis. Both regions are defined and based on the null hypothesis sampling distribution.

25
Q

What is the concept of the test statistic?

A

The goal of null hypothesis significance testing is to determine the p-value, the probability of the observed data (or more extreme) given that the null hypothesis is true. Working in the raw metric of sample means is often not efficient, although it certainly is sufficient. Instead we will often transform the data (e.g., a sample mean) into a common framework: that is the test statistic.

Test Statistic: There is a general formula that is often used, and we will see this often when we examine z- and t-tests. The test statistic is simply the number of standard errors from the null hypothesis sampling distribution. Test statistics are all based on the theoretical sampling distribution of the null hypothesis—the sampling distribution of the test statistic of interest given that the null hypothesis is true.

  • This allows us to use common reference distributions (z, t, etc.) to determine probabilities of the observed data under the null hypothesis—the p-value.
  • The general formula for the test statistic is (θ̂ − θ0) / σ_θ̂, where θ is the population parameter of interest (e.g., μ) estimated by the statistic θ̂ (e.g., X̄), θ0 is the value of θ under the null hypothesis, and σ_θ̂ is the standard error of θ̂.
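A minimal numerical sketch (my addition, assuming SciPy, a one-sample z-test, and hypothetical numbers) of that general recipe: count standard errors between the observed statistic and the null value, then look up the p-value in the reference distribution.

    from math import sqrt
    from scipy.stats import norm

    mu0, sigma, n = 100.0, 15.0, 36   # hypothetical null value, population sd, and n
    x_bar = 106.0                     # hypothetical observed sample mean

    se = sigma / sqrt(n)              # standard error of the mean
    z = (x_bar - mu0) / se            # number of standard errors from the null value
    p = 2 * norm.sf(abs(z))           # two-sided p-value

    print(z, p)                       # z = 2.4, p ~ .016 -> reject at alpha = .05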
26
Q

What is the relationship between p-values and regions of retention and rejection?

A

p-values are directly related to the regions of retention and rejection. If p ≤ α, then the observed statistic is in the region of rejection. If p > α, then the observed statistic is in the region of retention. It does not matter whether you calculate the regions using the original statistic (e.g., means) or test statistics, as these will provide the exact same answer.
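A quick sketch (my addition, reusing the hypothetical one-sample z-test numbers from the previous card, assuming SciPy) showing that the region-of-rejection decision in the raw-mean metric matches the p ≤ α decision from the test statistic.

    from math import sqrt
    from scipy.stats import norm

    mu0, sigma, n, alpha = 100.0, 15.0, 36, 0.05   # hypothetical setup
    x_bar = 106.0

    se = sigma / sqrt(n)
    margin = norm.ppf(1 - alpha / 2) * se
    in_rejection_region = x_bar < mu0 - margin or x_bar > mu0 + margin

    z = (x_bar - mu0) / se
    p = 2 * norm.sf(abs(z))

    print(in_rejection_region, p <= alpha)   # both True: the decisions agree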
27