Lecture 18 ARM Flashcards
Probability in Anthropological Research - 5/7 (19 cards)
Boxplot as a 5 number summary
- Minimum value (whisker)
- First quartile (Q1)
- Median (middle line)
- Third quartile (Q3)
- Maximum value (whisker)
The min and max here exclude outliers - those are plotted beyond the whiskers as individual dots
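A minimal sketch of how the five numbers could be computed. The notes do not say how outliers are identified, so the 1.5 x IQR fence used here is an assumption (a common plotting convention), and the `ages` data and the function name are hypothetical.

```python
import statistics

def five_number_summary(values, whisker_factor=1.5):
    """Return the five-number summary, with whiskers that exclude outliers.

    Assumption: a point is an outlier if it lies more than
    whisker_factor * IQR beyond the box (not specified in the lecture).
    """
    data = sorted(values)
    # Quartiles: cut points that split the sorted data into four parts.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {
        "min": inside[0],    # lower whisker: smallest non-outlier
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": inside[-1],   # upper whisker: largest non-outlier
        "outliers": outliers,
    }

ages = [21, 23, 24, 25, 27, 29, 30, 32, 35, 78]  # hypothetical data
summary = five_number_summary(ages)
print(summary)
```

With this data the value 78 falls outside the upper fence, so the upper whisker stops at 35 and 78 is reported separately as an outlier.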
Probability in research
- Science and uncertainty: all scientific claims are probabilistic
- Probability basics: predicting likelihood of events
- Anthropology context: probability helps quantify uncertainty in our findings
- From probability to sampling: Enables us to generalise from sample to population
- Goal: Use probability to make informed inferences, NOT absolute proofs - researchers never claim a finding is 100% accurate
Why do we sample?
- It is impossible to study everyone: Too large/dispersed populations
- Resource limits: Time, money, personnel constraints
- Access and ethics: Some are hard to reach, or sensitive to involve
- Less intrusive: Sampling can reduce burden on the community that is studied - not everyone is studied or involved
- Speed and feasibility: Enables timely research results by focusing on fewer people
Representative sample
Reflects population’s diversity and key characteristics
Eg roughly half men and half women, plus a spread of ages, jobs etc
Sampling bias
Systematic over- or under-representation of some group or trait - usually arises from how you choose to sample
Eg surveying only market-goers.
Consequences: Biased sample = skewed findings that do not hold for the population
Anthropology impact: Misrepresentation of a community can lead to flawed or even harmful conclusions, eg stereotypes
Probability sampling (random selection)
Every member has a known, non-zero chance of selection
Simple-random sampling
Pure-lottery method - eg random number generator picks people
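A minimal sketch of the lottery idea using Python's standard `random` module. The `people` list and the function name are hypothetical; the seed is only there so the draw can be repeated.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k members without replacement; everyone has the same chance."""
    rng = random.Random(seed)
    return rng.sample(population, k)

people = [f"person_{i}" for i in range(1, 101)]  # hypothetical sampling frame
sample = simple_random_sample(people, 10, seed=42)
print(sample)
```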
Stratified sample
Divide the population into subgroups, then sample randomly within each group - ensures representation of key categories eg gender, age, ethnic groups
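A minimal sketch of stratified sampling, assuming proportional allocation (the same fraction is drawn from every subgroup), which the notes do not specify. The `people` records, the `gender` field, and the function name are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Randomly sample the same fraction from every stratum.

    Assumes 0 < fraction <= 1; each stratum contributes at least one member.
    """
    rng = random.Random(seed)
    # Step 1: divide the population into subgroups (strata).
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    # Step 2: draw a simple random sample within each stratum.
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 50 women and 50 men.
people = [{"id": i, "gender": "woman" if i % 2 else "man"} for i in range(100)]
sample = stratified_sample(people, lambda p: p["gender"], fraction=0.1, seed=1)
print(sample)
```

Because each stratum is sampled separately, the 10-person sample here always contains 5 women and 5 men, which a simple random draw would not guarantee.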
Cluster sampling
Sample groups or clusters (eg villages, households), then individuals within clusters
Usually used for national or urban surveys - results can be less precise if the areas vary a lot
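A minimal sketch of two-stage cluster sampling: first pick clusters at random, then pick individuals within the chosen clusters. The `villages` data, the cluster and resident names, and the function name are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Two-stage sample: random clusters, then random members within each."""
    rng = random.Random(seed)
    # Stage 1: randomly choose which clusters to visit.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: randomly choose individuals inside each chosen cluster.
    sample = {}
    for name in chosen:
        members = clusters[name]
        k = min(n_per_cluster, len(members))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical frame: 8 villages with 20 residents each.
villages = {
    f"village_{v}": [f"v{v}_resident_{i}" for i in range(20)]
    for v in range(8)
}
sample = cluster_sample(villages, n_clusters=3, n_per_cluster=5, seed=7)
print(sample)
```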
Systematic sampling
Select every n-th individual from a list after a random start.
Eg every second person, every 10th
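A minimal sketch of systematic sampling: a random starting position within the first `step` entries, then every `step`-th entry after that. The `register` list and the function name are hypothetical.

```python
import random

def systematic_sample(ordered_list, step, seed=None):
    """Pick every step-th entry after a random start."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random start among the first `step` entries
    return ordered_list[start::step]

register = list(range(1, 101))  # hypothetical list of 100 numbered people
sample = systematic_sample(register, step=10, seed=3)
print(sample)
```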
Non-probability sampling (when random is not possible) (!)
Used when random selection is not feasible or possible, for whatever reason
- Convenience sampling
- Snowball sampling
- Purposive sampling
Pros: Easier and sometimes the only option
Cons: Higher risk of bias and limited generalisability.
Anthro reality: Common in ethnography due to practical constraints and requires careful interpretation
Convenience sampling
Selecting whoever is readily available (ease over randomness)
Snowball sampling
Participants recruit others - useful for hard-to-reach groups
Can capture social networks, and lets you bounce off others' words in new interviews
Purposive sampling
Deliberately selecting individuals for their knowledge or characteristics
Example - Program Evaluation (from reading, Bennett & Hays 2023)
Study focus: Civic engagement program EYPC for youth in Illinois
Population = all youth in the program
Sample = participants who took surveys voluntarily
Sampling method: Non-probability (program-based)
Implication: Results apply to engaged youth in program - not necessarily all youth in general
Key finding: Participants showed increased teamwork, leadership etc after program
Importance of sample size
- General rule is that a larger sample size produces more reliable estimates (aka less random error)
- Stability: small samples can give wild results! Large samples smooth out the outliers
- Diminishing returns: beyond some point, doubling the sample yields a smaller gain in precision. Going from 10 to 100 makes a huge difference, but going from 100 to 200 makes less! At some point, increasing N might not help that much
Anthro context: often small-N studies - recognise that results may be tentative, cannot make broad generalisations
Key idea: Larger N reduces the influence of random luck in who was sampled
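A minimal sketch of the diminishing-returns point, using the standard error of a sample proportion, sqrt(p(1 - p) / n). This formula is not in the lecture notes; it is the textbook expression and assumes simple random sampling. The true proportion of 0.5 is a hypothetical choice.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

# Random error at the three sample sizes mentioned above (p = 0.5 assumed).
se = {n: standard_error(0.5, n) for n in (10, 100, 200)}
for n, value in se.items():
    print(f"N = {n:>3}: standard error = {value:.3f}")
```

Under these assumptions the standard error falls from about 0.16 at N = 10 to 0.05 at N = 100, but only to about 0.035 at N = 200: precision improves with the square root of N, so each extra participant buys less than the one before.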
Margin of error
Definition: The radius (half-width) of the confidence interval; an estimate of how far off the sample result might be from the true population value
Interpretation: "± x%" or "± y units" - the range around the sample statistic that is likely to include the population parameter
Driven by sample size: Larger sample = smaller margin of error aka more precision
Example: Survey result 60% + or - 5% means true population value could be 55% to 65%
Given the sample size, the true population percentage is likely within a percentage range. Eg “between 55-65% support this policy”
Use in anthro: Rare in ethnography, but common in surveys - conveys result reliability (important for presenting quantitative findings responsibly)
Key: this is where chance comes into play when we sample - which is why it is important to look at the sampling method and how the results are applied to the larger population
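A minimal sketch of how a margin of error could be computed for a survey percentage. The formula z * sqrt(p(1 - p) / n) with z = 1.96 for 95% confidence is the usual normal-approximation formula, not something given in the lecture, and it assumes a simple random sample. The sample size of 369 is hypothetical, chosen so that a 60% result comes out at roughly ± 5%.

```python
import math

Z_95 = 1.96  # normal critical value for a 95% confidence level

def margin_of_error(p_hat, n, z=Z_95):
    """Margin of error for a sample proportion (normal approximation)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def confidence_interval(p_hat, n, z=Z_95):
    """Sample estimate plus or minus the margin of error."""
    moe = margin_of_error(p_hat, n, z)
    return (p_hat - moe, p_hat + moe)

# Hypothetical survey: 60% support among 369 respondents.
moe = margin_of_error(0.60, 369)
low, high = confidence_interval(0.60, 369)
print(f"60% ± {moe:.1%} -> between {low:.1%} and {high:.1%}")

# Quadrupling the sample size only halves the margin of error.
print(f"With 4x the sample: ± {margin_of_error(0.60, 4 * 369):.1%}")
```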
Confidence interval, simply put
The sample estimate plus or minus the margin of error
Eg sample estimate: 80%
Margin of error: +/- 10 %
Confidence interval = 70% - 90%
Confidence Intervals elaborated
Confidence interval (CI): A range of values, derived from the sample, that is likely to contain the TRUE population value
Confidence level: typically 95% in social sciences. This means that if we sampled 100 times, about 95 of those 100 CIs would contain the true value
Interpretation: “We are 95% sure that the true mean/proportion lies between x and y.”
Significance: If a CI range is narrow = precise knowledge. If a CI range is wide = estimate is uncertain.
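A minimal simulation sketch of what the 95% confidence level means: draw many samples from a population whose true value is known, build a CI from each, and count how often the CI contains the true value. The true proportion (0.6), the sample size (400), and the number of repeated samples (2000) are hypothetical, and the interval uses the normal-approximation formula, which is an assumption not stated in the notes.

```python
import math
import random

def ci_coverage(true_p, n, n_samples, z=1.96, seed=None):
    """Share of simulated 95% CIs that contain the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Draw one sample of n people; each says "yes" with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
        # Does this sample's CI contain the true population value?
        if p_hat - moe <= true_p <= p_hat + moe:
            hits += 1
    return hits / n_samples

coverage = ci_coverage(true_p=0.6, n=400, n_samples=2000, seed=0)
print(f"Share of CIs containing the true value: {coverage:.1%}")
```

The printed share should land close to 95%, which is the sense in which the confidence level describes the method across repeated samples rather than any single interval.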