Data collection Flashcards
(42 cards)
population
In statistics, a population is the whole set of individuals or objects you are interested in for a particular investigation,
e.g. all 6-year-old girls in the UK, all items manufactured by a factory or all the trees in a public park.
census
A census observes or measures every member of a population.
sample
A sample is a selection of observations taken from a subset of the population which is used to find out information about the population as a whole. We then assume that the results for this sample are representative of the whole population.
Census pros
Should give a completely accurate result
Census cons
Time consuming and expensive
Can’t be used when testing process destroys the item
Hard to process a large amount of data
sample pros
- Less time consuming and expensive than a census
- Fewer people have to respond
- Less data to process than in a census
sample cons
- The data may not be as accurate
- The sample may not be large enough to give information about small sub-groups of the population
sampling units
Individual units of a population
sampling frame
A list (or other representation) of the items available to be sampled
sampling fraction
The proportion of the available items that are actually sampled is called the sampling fraction. A 100% sample is called a census.
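As a quick illustration of the sampling fraction, here is a minimal Python sketch. The function name sampling_fraction and the figures 50 and 400 are made up for the example and are not from the cards.

```python
def sampling_fraction(sample_size: int, frame_size: int) -> float:
    """Proportion of the available items that are actually sampled."""
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    if not 0 <= sample_size <= frame_size:
        raise ValueError("sample_size must be between 0 and frame_size")
    return sample_size / frame_size


# Hypothetical figures: 50 items sampled from a frame of 400.
fraction = sampling_fraction(50, 400)   # 0.125, i.e. a 12.5% sample
census = sampling_fraction(400, 400)    # 1.0, i.e. a 100% sample (a census)
```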
sampling error
The difference between an estimate of a parameter (e.g. mean) derived from sample data and its true value. To reduce the sampling error, you want your sample to be as representative of the parent population as you can make it.
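A small worked example may help here. It is only a sketch: the population values are invented, and taking the error as "estimate minus true value" is an assumed sign convention (the card only says "difference").

```python
from statistics import mean

# Hypothetical population and one sample drawn from it.
population = [2, 4, 6, 8, 10]
sample = [2, 4, 6]

true_mean = mean(population)   # parameter: 6
estimate = mean(sample)        # estimate from the sample: 4

# Assumed convention: sampling error = estimate - true value.
sampling_error = estimate - true_mean   # -2
```

Here the sample happens to contain only the smaller values, so it is not representative and the estimate is too low.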
bias
Different types of people should be represented in the sample that is chosen. If the sample contains disproportionately more of a certain group of people than the population does, it is said to be biased. To make good use of a sample we want to avoid bias.
representative sample
A sample that is typical of the whole population.
Random Sampling Techniques
- Simple random sampling
- Systematic sampling
- Stratified sampling
Simple random sampling
A simple random sample of size n is one where every possible sample of size n has an equal chance of being selected. This can be achieved by ensuring every member of a finite population has an equal chance of being selected as long as sampling is without replacement and selections are independent of each other.
two methods of choosing the numbers in Simple random sampling
- Using a random number generator (using a calculator, computer or random number table).
- Lottery sampling – e.g. writing members of the sampling frame on tickets and drawing them out of a bag.
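The random number generator method can be sketched in Python using the standard library's random.sample, which draws without replacement. The helper name simple_random_sample, the seed parameter and the student labels are assumptions made for this example.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members of the sampling frame, without replacement."""
    rng = random.Random(seed)          # seed only makes the draw repeatable
    return rng.sample(list(frame), n)  # every subset of size n is equally likely


# Hypothetical sampling frame of 100 students.
frame = [f"student_{i}" for i in range(1, 101)]
chosen = simple_random_sample(frame, 10, seed=1)
```

Leaving seed as None gives a different sample each run, which is what you would normally want in a real investigation.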
Simple random sampling - pros
- Free of bias
- Easy and cheap to implement for small populations and small samples
- Each sampling unit has a known and equal chance of selection
Simple random sampling - cons
- Not suitable when the population size or the sample size is large
- A sampling frame is needed
Stratified sampling
In stratified sampling, the population is divided into mutually exclusive strata and a random sample is taken from each.
Divide the population into sub-groups or strata, e.g. by income (low, middle, high) or by sex (male, female).
proportional stratified sampling
If we randomly sample from each group in proportion to the size of the group then it is called proportional stratified sampling.
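The number sampled from each stratum is (stratum size ÷ population size) × total sample size. The sketch below shows this in Python. The function names, the income strata and their sizes are all hypothetical, and simple rounding is used, so with awkward numbers the rounded allocations may not add up exactly to the total.

```python
import random


def proportional_allocation(stratum_sizes, total_sample):
    """Sample size per stratum, in proportion to the stratum's size."""
    population = sum(stratum_sizes.values())
    return {
        name: round(total_sample * size / population)
        for name, size in stratum_sizes.items()
    }


def proportional_stratified_sample(strata, total_sample, seed=None):
    """Take a simple random sample of the allocated size from each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportional_allocation(sizes, total_sample)
    return {name: rng.sample(strata[name], allocation[name]) for name in strata}


# Hypothetical population of 1000 people split into three income strata.
strata = {
    "low": [f"low_{i}" for i in range(300)],
    "middle": [f"middle_{i}" for i in range(500)],
    "high": [f"high_{i}" for i in range(200)],
}
allocation = proportional_allocation({k: len(v) for k, v in strata.items()}, 50)
# allocation is {"low": 15, "middle": 25, "high": 10}
sample = proportional_stratified_sample(strata, 50, seed=1)
```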
stratified sampling - pros
- Sample accurately reflects the population structure
- Guarantees proportional representation of groups within a population
stratified sampling - cons
- Population must be clearly classified into distinct strata
- Selection within each stratum suffers from the same disadvantages as simple random sampling
Systematic sampling
In systematic sampling, the required elements are chosen at regular intervals from an ordered list.
From a list, choose a random starting item, then sample, for example, every 5th item.
To determine the interval required, divide the population size by the required sample size.
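A minimal Python sketch of these steps follows. The function name systematic_sample is made up for the example, and choosing the random start from the first k items (where k is the interval) is an assumption about how the starting item is picked.

```python
import random


def systematic_sample(ordered_list, n, seed=None):
    """Pick every k-th item from an ordered list, starting at a random point."""
    k = len(ordered_list) // n          # interval = population size / sample size
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    start = rng.randrange(k)            # assumed: random start among the first k items
    return ordered_list[start::k][:n]   # trim in case the list is not a multiple of k


# Hypothetical ordered list of 100 items, sample of 20, so the interval is 5.
frame = list(range(1, 101))
picked = systematic_sample(frame, 20, seed=3)
```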
Systematic sampling - pros
- Simple and quick to use
- Suitable for large samples and large populations