Statistics Flashcards
(51 cards)
What does PMCC measure?
The Product Moment Correlation Coefficient measures the strength and direction of the linear correlation between two variables, giving a value r between -1 and 1.
r = 1 indicates a perfect positive linear correlation, r = -1 indicates a perfect negative linear correlation, and r = 0 indicates no linear correlation
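Worked example (a minimal sketch, not part of the original card): r can be computed from the sums of squares Sxy, Sxx and Syy. The function name `pmcc` and the data are made up for illustration.

```python
import math

def pmcc(xs, ys):
    """Product moment correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up data: y rises exactly in step with x, then falls exactly in step.
r_up = pmcc([1, 2, 3, 4], [2, 4, 6, 8])
r_down = pmcc([1, 2, 3, 4], [8, 6, 4, 2])
print(r_up, r_down)
```

The first data set lies exactly on a rising straight line and the second on a falling one, so r should come out at (or within rounding error of) 1 and -1.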
What is a regression line?
The line of best fit that minimises the sum of the squared residuals. It always passes through the mean point (mean of x, mean of y)
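Worked example (a sketch assuming the least squares regression line of y on x, written y = a + bx): the gradient is b = Sxy / Sxx and the intercept is a = mean of y - b x mean of x, which is why the line passes through the mean point. The function name and data are illustrative only.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx           # gradient
    a = mean_y - b * mean_x   # intercept, chosen so the line hits the mean point
    return a, b

# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = regression_line(xs, ys)
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
# The predicted y at the mean of x should match the mean of y
print(a + b * mean_x, mean_y)
```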
What is the equation for P(A|B)?
P(A|B) = P(AnB) / P(B)
What is the equation for P(AnB)?
P(AnB) = P(A) x P(B|A)
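Worked example (a sketch using one roll of a fair die; the events A and B are chosen purely for illustration): both formulas can be checked by counting outcomes.

```python
from fractions import Fraction

OUTCOMES = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                  # roll is even
B = {4, 5, 6}                  # roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def cond_prob(a, b):
    """P(a|b) = P(a n b) / P(b)."""
    return prob(a & b) / prob(b)

print(cond_prob(A, B))                           # P(A|B)
print(prob(A) * cond_prob(B, A) == prob(A & B))  # P(AnB) = P(A) x P(B|A)
```

Here AnB = {4, 6}, so P(AnB) = 1/3 and P(A|B) = (1/3) / (1/2) = 2/3.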
What does it mean for events to be independent?
Two events are independent when the occurrence of one doesn’t affect the probability of the other
How could you test whether two events are independent?
If they are independent, P(AnB) = P(A) x P(B)
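Worked example (a sketch using one roll of a fair die; the event names are made up): compare P(AnB) with P(A) x P(B).

```python
from fractions import Fraction

OUTCOMES = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def are_independent(a, b):
    """Independent exactly when P(a n b) = P(a) x P(b)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
at_most_two = {1, 2}
above_three = {4, 5, 6}

print(are_independent(even, at_most_two))  # 1/6 compared with 1/2 x 1/3
print(are_independent(even, above_three))  # 1/3 compared with 1/2 x 1/2
```

The first pair passes the test and the second fails it, since 1/3 is not equal to 1/4.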
What does it mean for events to be mutually exclusive?
The events cannot occur at the same time
How would you check if two events are mutually exclusive?
Check that P(AnB) = 0. It then follows that P(AuB) = P(A) + P(B)
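Worked example (a sketch using one roll of a fair die; the events are illustrative): when two events share no outcomes, the probability of the union equals the sum of the separate probabilities.

```python
from fractions import Fraction

OUTCOMES = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def are_mutually_exclusive(a, b):
    """Mutually exclusive exactly when P(a n b) = 0."""
    return prob(a & b) == 0

odd = {1, 3, 5}
six = {6}
even = {2, 4, 6}

print(are_mutually_exclusive(odd, six))             # no shared outcomes
print(prob(odd | six) == prob(odd) + prob(six))     # addition rule holds
print(are_mutually_exclusive(even, six))            # both contain 6
print(prob(even | six) == prob(even) + prob(six))   # addition rule fails
```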
What is the difference between a population, a sample, and a sampling frame?
A population is the whole group, while a sample is a selected group from the population. A sampling frame is a list of all the members of the population
What are the different sampling methods?
- Census
- Simple random sampling
- Systematic sampling
- Stratified sampling
- Quota sampling
- Opportunity sampling
What is a census? Advantages? Disadvantages?
Collects data about all the members of a population.
Gives accurate, unbiased results, but is time-consuming and expensive, and can use up the entire population if testing destroys the items (for example, consumables)
What are the advantages and disadvantages of using sampling over census?
Sampling is quicker and cheaper, and there is less data to analyse, but the sample might not represent the population accurately and could introduce bias
What is simple random sampling and how would you carry it out?
A sample of size n is taken where every member of the population has an equal probability of being selected.
Give every member of the population a unique number, then use a random number generator to select n different numbers
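Worked example (a minimal sketch; the function name and the `seed` parameter are illustrative additions): Python's `random.sample` picks n different numbers without replacement, which matches the method described.

```python
import random

def simple_random_sample(population_size, n, seed=None):
    """Pick n different member numbers from 1..population_size."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    return rng.sample(range(1, population_size + 1), n)

# Hypothetical population of 200 numbered members, sample of 10
chosen = simple_random_sample(200, 10, seed=1)
print(sorted(chosen))
```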
When should simple random sampling be used? Advantages? Disadvantages?
Should be used when you want a random sample to avoid bias.
Is unbiased and useful in a small population, but inconvenient for very large or spread out populations
What is systematic sampling and how would you carry it out?
A sample is formed by choosing members of a population at regular intervals using a list.
Calculate the size of the interval, k = population size / sample size, choose a random start point among the first k members, then select every kth member from there.
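Worked example (a sketch assuming the population is held in an ordered list and that the interval is rounded down to a whole number; the names are illustrative):

```python
import random

def systematic_sample(members, n, seed=None):
    """Take every k-th member after a random start within the first k."""
    k = len(members) // n      # interval = population size / sample size
    rng = random.Random(seed)
    start = rng.randrange(k)   # random start index from 0 to k - 1
    return [members[start + i * k] for i in range(n)]

# Hypothetical list of 100 numbered members, sample of 10, so k = 10
members = list(range(1, 101))
sample = systematic_sample(members, 10, seed=3)
print(sample)
```

Whatever the start point, consecutive members of the sample are exactly k places apart in the list.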
When should systematic sampling be used? Advantages? Disadvantages?
Should be used when you want a random sample from a large population.
Useful when there is a natural order, but can’t be used if it isn’t possible to list all members of the population. The sample is only random if the sampling frame is in random order
What is stratified sampling and how would you carry it out?
The population is divided into groups called strata, and a random sample is taken from each group.
Population could be split into strata by defining characteristics. Then the number of members to be sampled from a stratum = (size of sample / size of population) x number of members in the stratum
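Worked example (a sketch of the allocation formula only; the strata names and sizes are made up): each stratum gets (size of sample / size of population) x number of members in the stratum.

```python
def stratum_sample_sizes(strata_sizes, sample_size):
    """Number to sample from each stratum, in proportion to its size."""
    population = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / population)
        for name, size in strata_sizes.items()
    }

# Hypothetical school: 120 students in Year 12, 80 in Year 13, sample of 50
sizes = stratum_sample_sizes({"Year 12": 120, "Year 13": 80}, 50)
print(sizes)
```

Here 50/200 x 120 = 30 and 50/200 x 80 = 20. When the results are not whole numbers they have to be rounded, so the totals may need adjusting by hand to match the sample size.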
When should stratified sampling be used? Advantages? Disadvantages?
Should be used when the population can be split into obvious groups of members.
Useful when there are very different groups of members within a population. The sample will be representative of the population structure, and the sample from each group is random. It can’t be used if the population can’t be divided into discrete groups
What is quota sampling and how would you carry it out?
The population is split into groups and members of the population are selected until each quota is filled.
If a member doesn’t want to be included, another member is chosen instead, and the members don’t need to be selected randomly
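Worked example (a rough sketch of the idea, not a standard procedure; the people, groups and quotas are invented): members are taken in whatever order they are met, unwilling members are skipped, and selection stops once every quota is full.

```python
def quota_sample(people, quotas):
    """people: list of (name, group, willing); quotas: dict of group -> count."""
    remaining = dict(quotas)
    sample = []
    for name, group, willing in people:
        if not willing:
            continue  # move on and choose another member instead
        if remaining.get(group, 0) > 0:
            sample.append(name)
            remaining[group] -= 1
        if all(count == 0 for count in remaining.values()):
            break  # every quota has been filled
    return sample

# Invented data: people in the order the interviewer happens to meet them
people = [
    ("Ann", "adult", True),
    ("Bob", "child", False),
    ("Cat", "child", True),
    ("Dan", "adult", True),
    ("Eve", "adult", True),
]
picked = quota_sample(people, {"adult": 2, "child": 1})
print(picked)
```

Bob declines, so Cat fills the child quota instead, and Eve is never asked because both quotas are already full.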
When should quota sampling be used? Advantages? Disadvantages?
Should be used when a small sample is needed to be representative of the population structure.
Useful when a sampling frame is not available, but can introduce bias as some members may choose to not be included
What is opportunity sampling?
A sample is formed using the first available members of a population who fit the criteria
When should opportunity sampling be used? Advantages? Disadvantages?
Should be used when a sample is needed quickly.
Useful when it is not possible to list the population, but unlikely to be representative of the population structure
What is variance?
The standard deviation squared. A measure of the spread within a set of data.
What is standard deviation?
The square root of the variance. It measures roughly how far, on average, each data point is from the mean
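Worked example (a sketch assuming the "divide by n" form of variance, i.e. the mean of the squared deviations; the data are made up):

```python
import math

def variance(data):
    """Mean of the squared deviations from the mean (divides by n)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data) / len(data)

def standard_deviation(data):
    """Square root of the variance."""
    return math.sqrt(variance(data))

# Made-up data with mean 5
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data), standard_deviation(data))
```

The squared deviations sum to 32, so the variance is 32 / 8 = 4 and the standard deviation is 2.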