Statistics Data Collection Flashcards
(26 cards)
Census
Observes/measures every member of the population.
✅completely accurate result
❎time consuming + expensive.
Sample
Observations taken from a subset of the population, used to find out info about the population as a whole.
✅less time consuming + cheaper as less data to process than a census.
❎sample may not be large enough to give info about small sub-groups.
Simple random sample of N
Every sample of size N has an equal chance of selection.
Sampling units
Individual units of population
Often individually named/ numbered to form sampling frame.
Types of random sampling
Simple random, systematic, stratified
Simple random sample
Of size N is where every sample of size N has equal chance of selection.
Need a sampling frame (list of people/things).
Each item allocated number + selected at random.
Can generate random number e.g. using calculator or lottery sampling (items written on tickets + put into hat).
✅each sampling unit has equal chance of selection.
❎unsuitable when population size large as time consuming + expensive.
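A minimal Python sketch of the number-allocation method above. The frame of 50 numbered units, the function name and the seed are made up for illustration; `random.sample` picks without replacement, so every possible sample of size N is equally likely.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Pick n distinct sampling units; every size-n subset is equally likely."""
    rng = random.Random(seed)  # seed only makes the illustration repeatable
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: 50 units numbered 1 to 50.
frame = list(range(1, 51))
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```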
Systematic sampling
Required elements chosen at regular intervals from an ordered list.
First person to be chosen should be chosen at random.
✅suitable for large samples + large populations.
❎bias if sampling frame isn’t random.
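A rough Python sketch of the idea, assuming the interval is taken as population size divided by sample size (rounded down) and the starting point is chosen at random from the first interval. The function name and the list of 100 units are hypothetical.

```python
import random

def systematic_sample(ordered_list, n, seed=None):
    """Take every k-th element after a random start within the first k."""
    if n <= 0 or n > len(ordered_list):
        raise ValueError("sample size must be between 1 and the list length")
    k = len(ordered_list) // n      # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)        # random position among the first k elements
    return [ordered_list[start + i * k] for i in range(n)]

# Hypothetical ordered list of 100 units: interval k = 10.
frame = list(range(1, 101))
chosen = systematic_sample(frame, 10, seed=0)
print(chosen)
```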
Stratified sampling
Population divided into mutually exclusive strata (e.g. males + females) + random sample taken from each.
Number sampled from each stratum = (number in stratum / number in population) x overall sample size
✅proportional representation of groups in population
❎population must be clearly classified into distinct strata.
Population must be divided into strata, which can be expensive.
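A short Python sketch of the proportional allocation formula for this card. The strata sizes (60 males, 40 females) and the sample size of 10 are invented numbers; results are rounded to whole units, so in general the rounded sizes may not add up exactly to the overall sample size.

```python
import random

def stratum_sample_sizes(strata_sizes, sample_size):
    """(number in stratum / number in population) x overall sample size."""
    population = sum(strata_sizes.values())
    return {name: round(size * sample_size / population)
            for name, size in strata_sizes.items()}

def stratified_sample(strata, sample_size, seed=None):
    """Take a simple random sample of the proportional size from each stratum."""
    rng = random.Random(seed)
    sizes = stratum_sample_sizes({k: len(v) for k, v in strata.items()}, sample_size)
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

# Hypothetical population: 60 males and 40 females, overall sample of 10.
strata = {
    "males": [f"M{i}" for i in range(60)],
    "females": [f"F{i}" for i in range(40)],
}
sizes = stratum_sample_sizes({"males": 60, "females": 40}, 10)
picked = stratified_sample(strata, 10, seed=0)
print(sizes)  # {'males': 6, 'females': 4}
```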
Quota sampling
Select sample that reflects characteristics of whole population.
✅quick- no sampling frame.
❎must divide population into groups, which can be expensive.
Opportunity sampling
Sample taken from whoever is available at the time the study is carried out.
✅easy to carry out
❎depends on when the researcher is available
Unlikely to be representative
Continuous
Any value in a given range e.g. time.
Discrete
Specific values e.g. number of apples.
Describe type of data presented by daily total rainfall
Continuous quantitative
Explain why Alison’s process may not generate a sample of size 5
Some data values are (n/a)
Daily maximum relative humidity
As a % of air saturation with water vapour.
Relative humidities above 95% give rise to misty + foggy conditions.
Daily max gust
Knots
Highest instantaneous windspeed recorded; the direction from which it's blowing is also recorded.
Daily mean temp
Average of hourly temp readings over a 24-hour period.
Daily total rainfall
Includes snow and hail, melted before measuring. Amounts less than 0.05 mm are recorded as 'tr' (trace).
Daily mean wind direction + windspeed
Windspeed in knots (kn).
Averaged over 24 hours midnight to midnight.
Mean wind direction given as bearings and compass directions.
Data categorised according to Beaufort scale.
Okta
Unit of cloud cover, in eighths of the sky covered. Maximum figure for cloud cover is 8.
How to clean data that contains TR
Replace TR with a numerical value from 0 to 0.05 e.g. 0.025
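A minimal Python sketch of this cleaning step, assuming the rainfall readings arrive as text and using 0.025 as the replacement value. The function name and sample readings are made up; other non-numeric entries such as n/a are not handled here and would need their own rule.

```python
def clean_rainfall(values, trace_value=0.025):
    """Replace 'tr' entries with a small number and convert the rest to floats."""
    cleaned = []
    for value in values:
        text = str(value).strip().lower()
        if text == "tr":
            cleaned.append(trace_value)   # trace: less than 0.05 mm
        else:
            cleaned.append(float(text))
    return cleaned

# Hypothetical daily total rainfall readings in mm.
raw = ["0.0", "tr", "4.2", "TR", "1.6"]
cleaned = clean_rainfall(raw)
print(cleaned)  # [0.0, 0.025, 4.2, 0.025, 1.6]
```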
What does the data cover and why is this a limitation
Data only covers May-October, so it is not representative of the whole year.
What months are you missing and what impact does this have
Winter months are missing.
We would expect the mean rainfall to be larger if they were included.
Sampling frame
LIST OF..