Lecture notes 1 Flashcards
(11 cards)
What is a population?
The entire set of individuals or observations of interest. For instance, if you are developing a new treatment for anxiety, the population may be all individuals who suffer from anxiety. Generally N refers to the size of the population (which may be infinite).
What is a sample?
A subset of the population. For instance, a random sample of individuals who suffer from anxiety. The size of the sample is denoted by n. For measures of association such as the correlation, n is the total sample size. For group comparisons (t-test or ANOVA), $n_g$ is the sample size of group g.
What is a statistic?
A numerical summary of a sample. For instance, the mean level of anxiety in your sample of participants. Generally represented by an italicized Roman letter (e.g., $s$ for the sample standard deviation and $M$ or $\bar{X}$ for the sample mean).
What is a parameter?
A numerical summary of the population. Generally represented by a Greek letter (e.g., σ for the population standard deviation and μ for the mean).
What is a sampling distribution? Example? What are specific terms related to the sampling distribution? (3)
The distribution of a statistic across all possible distinct samples from the population.
For instance, if you gathered every single possible distinct sample from the population, all of the same sample size n, and calculated the mean level of anxiety in each sample, the distribution of those sample means is the sampling distribution of the mean (of size n).
Think: the sampling distribution is the distribution across the samples. Specific terms related to the sampling distribution, defined below, are expected value, standard error, and bias.
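To make the definition concrete, here is a minimal sketch that enumerates every distinct sample from a tiny population and builds the sampling distribution of the mean directly. The population values and the sample size are made up for illustration; a real population would be far too large to enumerate this way.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical population of N = 5 anxiety scores (made-up values).
population = [2, 4, 6, 8, 10]
n = 2  # sample size

# Every distinct sample of size n (no repeats within a sample, order ignored).
samples = list(combinations(population, n))  # 10 distinct samples

# The statistic of interest (the mean) computed in each sample.
sample_means = [mean(s) for s in samples]

# The collection of sample means IS the sampling distribution of the mean.
expected_value = mean(sample_means)    # mean of the sampling distribution
standard_error = pstdev(sample_means)  # SD of the sampling distribution

print("number of distinct samples:", len(samples))
print("sampling distribution:", sorted(sample_means))
print("expected value:", expected_value, "| population mean:", mean(population))
print("standard error:", standard_error)
```

The expected value printed here matches the population mean, which is what it means for the sample mean to be unbiased.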
What is the expected value?
The mean of the sampling distribution. This is denoted as $E(\cdot)$. For instance, the expected value of the sample mean is $E(\bar{X}) = \mu$.
What is standard error?
The standard deviation (SD) of the sampling distribution. We give the SD of the sampling distribution its own name so that we (hopefully!) minimize confusion with the standard deviation of the raw scores.
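The distinction between the two standard deviations can be seen in a small simulation. This is only a sketch under assumed values: a normal population with a hypothetical mean of 50 and SD of 10, samples of n = 25, and 5000 simulated samples standing in for "all possible samples".

```python
import random
from statistics import fmean, stdev

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 10.0  # hypothetical population mean and SD
N_PER_SAMPLE = 25       # sample size n
N_SAMPLES = 5000        # number of simulated samples

# Draw many samples and keep the mean of each one.
sample_means = []
for _ in range(N_SAMPLES):
    scores = [random.gauss(MU, SIGMA) for _ in range(N_PER_SAMPLE)]
    sample_means.append(fmean(scores))

# SD of the raw scores in one sample: describes spread among individuals.
one_sample = [random.gauss(MU, SIGMA) for _ in range(N_PER_SAMPLE)]
sd_raw = stdev(one_sample)

# SD of the sample means: this is the standard error of the mean.
se_simulated = stdev(sample_means)
se_theory = SIGMA / N_PER_SAMPLE ** 0.5  # sigma / sqrt(n)

print(f"SD of raw scores (one sample): {sd_raw:.2f}")
print(f"standard error (simulated):    {se_simulated:.2f}")
print(f"standard error (sigma/sqrt n): {se_theory:.2f}")
```

The SD of the raw scores stays near the population SD, while the standard error is much smaller because means vary less from sample to sample than individual scores do.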
What is bias?
The difference between the expected value of a statistic and its corresponding population parameter. The sample mean is unbiased because $E(\bar{X}) = \mu$. A statistic such as the sample range is biased because $E(\text{Range}_{\text{sample}}) < \text{Range}_{\text{population}}$.
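The contrast between the mean and the range can be checked by brute force on a tiny population. The sketch below uses made-up scores, and the helper names `value_range` and `bias` are illustrative only. It computes the expected value of each statistic over all distinct samples and subtracts the population value.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of N = 5 scores (made-up values).
population = [1, 3, 4, 7, 10]
n = 3  # sample size

def value_range(scores):
    """Range of a set of scores: largest minus smallest."""
    return max(scores) - min(scores)

def bias(statistic, population, n):
    """Expected value of the statistic over all distinct samples of size n,
    minus the same quantity computed on the whole population."""
    values = [statistic(sample) for sample in combinations(population, n)]
    return mean(values) - statistic(population)

bias_mean = bias(mean, population, n)
bias_range = bias(value_range, population, n)

print(f"bias of the sample mean:  {bias_mean:.3f}")
print(f"bias of the sample range: {bias_range:.3f}")
```

The bias of the mean comes out at zero (up to floating-point rounding), while the bias of the range is negative: a sample can never span more than the population does, and most samples miss at least one extreme.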
What is the p-value?
The p-value is the probability of the observed data, or data more extreme, given that the null hypothesis is true. This last part is critical. It is a probability under the assumption that the null hypothesis is true. It does not tell us the probability of the null hypothesis. It generally is calculated using the sampling distribution of the test statistic when the null hypothesis is true.
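As one illustration of computing a p-value from a sampling distribution under the null, the sketch below uses a two-sided one-sample z-test. The choice of test and every number in it (null mean, known population SD, sample size, observed mean) are assumptions made for the example, not something specified in these notes.

```python
from statistics import NormalDist

# Hypothetical setup: all values are made up for illustration.
mu_0 = 50.0           # population mean if the null hypothesis is true
sigma = 10.0          # population SD, assumed known
n = 25                # sample size
observed_mean = 54.0  # mean actually observed in the sample

# Sampling distribution of the mean WHEN THE NULL IS TRUE:
# centered on mu_0 with standard error sigma / sqrt(n).
standard_error = sigma / n ** 0.5
null_distribution = NormalDist(mu=mu_0, sigma=standard_error)

# Two-sided p-value: probability of a sample mean at least as far
# from mu_0 as the observed one, in either direction.
distance = abs(observed_mean - mu_0)
p_lower = null_distribution.cdf(mu_0 - distance)
p_upper = 1.0 - null_distribution.cdf(mu_0 + distance)
p_value = p_lower + p_upper

print(f"z = {distance / standard_error:.2f}, p = {p_value:.4f}")
```

Note that the null hypothesis enters only through `null_distribution`: the result is a probability about data computed as if the null were true, not a probability that the null is true.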
What is the central limit theorem?
For normally distributed observations, the sampling distribution of the mean is normally distributed and unbiased, as $E(\bar{X}) = \mu$.
The variance of the sample means around the population mean ($\mu$) is $\sigma^2/n$. The standard error (the standard deviation of the sampling distribution) of the sample mean around the average of the sample means is estimated as $s_{\bar{X}} = s_X / \sqrt{n}$, where $s_X$ is the standard deviation of the raw scores on variable X and n is the sample size used to calculate the mean.
We also know that if the raw scores are not normally distributed, then the sampling distribution of the sample mean approaches normality, with the above mean and standard error, as the sample size increases. As a common rule of thumb, the CLT kicks in even for clearly non-normal distributions, and the sampling distribution is close to normal with sample sizes as small as n = 20 to 30, although very skewed or heavy-tailed distributions can need larger samples.
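The convergence described above can be watched in a simulation. The sketch below assumes an exponential population with rate 1 (mean 1, SD 1, strongly right-skewed) as the non-normal example; that choice, the sample sizes, and the number of simulated samples are all illustrative. Skewness is used as a rough gauge of non-normality, since a normal distribution has skewness 0.

```python
import random
from statistics import fmean, pstdev

random.seed(2)  # fixed seed so the run is repeatable

def skewness(values):
    """Average cubed z-score: 0 for a symmetric distribution."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

def simulate_means(n, n_samples=4000):
    """Means of n_samples samples of size n from an exponential(1) population."""
    return [
        fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(n_samples)
    ]

results = {}
for n in (2, 5, 30):
    means = simulate_means(n)
    results[n] = (fmean(means), pstdev(means), skewness(means))
    center, spread, skew = results[n]
    print(f"n = {n:2d}: mean of means = {center:.3f}, "
          f"SE = {spread:.3f} (theory {1 / n ** 0.5:.3f}), "
          f"skewness = {skew:.2f}")
```

As n grows, the mean of the sample means stays at the population mean, the standard error tracks $\sigma/\sqrt{n}$, and the skewness shrinks toward 0. It does not vanish at n = 30 for a population this skewed, which is why the n = 20 to 30 guideline is best read as a rule of thumb.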