Stats Flashcards
(10 cards)
Sample
Information obtained from a small group in order to draw valid conclusions about a population. The greater the sample size, the better the sample represents the population.
What are the four sampling methods?
Stratified, clustered, systematic, and random sampling
What is stratified sampling plus pros and cons?
Uses prior knowledge of the population to split it into strata, then sizes the sample taken from each stratum to match that stratum's share of the population.
+ get more precise estimates of population parameters
- every member of the population must fall into exactly one stratum
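A minimal Python sketch of proportional stratified sampling, assuming the population is stored as a dict mapping each stratum name to a list of its members. The function name `stratified_sample`, the strata, and the data are all hypothetical and only illustrate the idea of matching the sample to the population's make-up.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Draw a proportional stratified sample.

    strata: dict mapping stratum name -> list of members (assumed layout).
    Each stratum contributes in proportion to its share of the population.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; with awkward proportions the rounded
        # sizes may not add up exactly to sample_size.
        k = round(sample_size * len(members) / total)
        # Random selection within each stratum keeps the sample unbiased.
        sample[name] = rng.sample(members, k)
    return sample

# Made-up population: 60 juveniles and 40 adults.
population = {
    "juvenile": [f"j{i}" for i in range(60)],
    "adult": [f"a{i}" for i in range(40)],
}
picked = stratified_sample(population, sample_size=10, seed=1)
print({name: len(members) for name, members in picked.items()})
# prints {'juvenile': 6, 'adult': 4}
```

A 60/40 population gives a 6/4 split in a sample of 10, so the sample mirrors the known structure of the population.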
Clustered? + & -?
Divides the population into groups and randomly selects a number of those for sampling
+ can save time and cost of obtaining data
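- only suitable if each cluster is internally heterogeneous (a small-scale version of the whole population); not suitable if the members within a cluster are homogeneous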
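A short Python sketch of cluster sampling, assuming the population is already grouped into named clusters held in a dict. The name `cluster_sample` and the site data are hypothetical. Whole clusters are chosen at random and every member of a chosen cluster is sampled.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and return all members of each."""
    rng = random.Random(seed)
    # Sort the names first so a given seed always gives the same choice.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Made-up population: 5 sites, each holding 4 individuals.
sites = {f"site{i}": [f"s{i}_ind{j}" for j in range(4)] for i in range(5)}
chosen_sites = cluster_sample(sites, n_clusters=2, seed=0)
print(sorted(chosen_sites))
```

Only two of the five sites need to be visited, which is where the saving in time and cost comes from.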
Systematic? + & -?
Numbers the individuals from 1 to n then selects the individuals at regular intervals
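+ easy to carry out, and the sample is spread through the whole population
- if the population has a regular pattern of variation, the sampling interval can line up with it, so the sample may be biased (dampening or amplifying the pattern) and any confidence intervals (CIs) may be invalid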
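A minimal Python sketch of systematic sampling, assuming the individuals are already numbered and held in a list. The name `systematic_sample` is hypothetical, and the random starting point is an added assumption so that the selection still contains an element of randomness.

```python
import random

def systematic_sample(individuals, interval, seed=None):
    """Pick a random start within the first interval, then every
    interval-th individual after it."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # 0 <= start < interval
    return individuals[start::interval]

numbered = list(range(1, 21))  # individuals numbered 1 to 20
sample = systematic_sample(numbered, interval=5, seed=3)
print(sample)
```

If the population itself repeated every 5 individuals, each selected individual would sit at the same point in the cycle, which is the source of the bias noted on this card.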
Random Sampling?
Every member of the population has an equal chance of being included in the sample. Without random selection, any confidence intervals (CIs) calculated may not be valid, so it is critical to include randomness in every sampling method.
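A minimal Python sketch of simple random sampling, assuming the population can be listed in full. The population of 100 numbered individuals and the seed are made up for illustration.

```python
import random

population = list(range(1, 101))  # individuals numbered 1 to 100
rng = random.Random(42)           # fixed seed so the draw is repeatable

# random.sample draws without replacement, and every member has the
# same chance of being selected.
sample = rng.sample(population, 10)
print(sorted(sample))
```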
Sampling Frame
A limited group from which we select our sample, used because it is impractical to sample from the whole population. Inferences drawn from the sample may therefore not be strictly valid for the whole population.
What do we look for in preliminary data analysis?
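Descriptive stats: mean ≈ median suggests a symmetric distribution, which is consistent with (but does not prove) a normal distribution
- Stdev = tells us the variability in the data (a small value means the data are tightly clustered)
- Equal variances = if the variances differ, parametric tests that assume equal variances can't be used (F-test: Fcalc = larger variance / smaller variance, with df = n-1 for each sample; if Fcalc > Fcrit the variances are different)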
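These checks can be sketched in Python with the standard `statistics` module. The helper names `describe` and `f_ratio` and both data sets are hypothetical. The sketch only computes the F ratio and its degrees of freedom; the critical value is assumed to be looked up separately in an F table.

```python
import statistics

def describe(sample):
    """Mean, median, and sample standard deviation of one data set."""
    return {
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "stdev": statistics.stdev(sample),  # uses n - 1 in the denominator
    }

def f_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger
    variance on top, so F is always at least 1."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Made-up measurements from two sites.
site_a = [4.1, 4.5, 4.8, 5.0, 5.2, 5.6]
site_b = [3.0, 4.2, 5.1, 5.9, 6.8, 7.4]

print(describe(site_a))
f_calc, df_num, df_den = f_ratio(site_a, site_b)
print(f_calc, df_num, df_den)
```

Compare `f_calc` with the tabulated critical value for `df_num` and `df_den` degrees of freedom; a larger `f_calc` indicates unequal variances.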
PDA: What does a dotplot show?
Shows the shape of a distribution (symmetric or skewed), its spread, and any outliers. Best used when looking at a single data set.
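A rough text-only dotplot in Python, assuming whole-number data. The function name `text_dotplot` and the values are hypothetical, and a plotting library would normally be used instead.

```python
from collections import Counter

def text_dotplot(values):
    """Return one row per integer value, with one dot per observation."""
    counts = Counter(values)
    rows = []
    for value in range(min(values), max(values) + 1):
        rows.append(f"{value:>3} | " + "." * counts[value])
    return "\n".join(rows)

# Made-up data: clustered around 3 with one outlying value at 9.
values = [1, 2, 2, 3, 3, 3, 4, 4, 9]
plot = text_dotplot(values)
print(plot)
```

The stack of dots shows where the data cluster, and the isolated dot at 9 stands out as a possible outlier.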
What does a boxplot show?
Used when we want to find differences between distributions
- Shows a box spanning the lower and upper quartiles (25th and 75th percentiles), along with the median and the sample range. Boxplots are used to assess and compare sample distributions between populations or locations
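The values a boxplot draws can be computed with Python's standard `statistics` module. This is a sketch: the function name `five_number_summary` and the data are hypothetical, and the "inclusive" quartile method is an assumption, since software packages use slightly different quartile conventions.

```python
import statistics

def five_number_summary(sample):
    """Minimum, lower quartile, median, upper quartile, and maximum."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, q2, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    return {
        "min": min(sample),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(sample),
    }

# Made-up measurements from one location.
lengths = [2, 4, 4, 5, 6, 7, 8, 9, 12]
summary = five_number_summary(lengths)
print(summary)
```

Computing the same summary for each population or location gives the numbers that side-by-side boxplots compare.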