Year 1 - Statistics Flashcards
(69 cards)
1.1 What is a census?
A census observes or measures every member of a population
1.1 What is a sample?
A selection of observations taken from a subset of the population which is used to find out information about the population as a whole.
1.1 What are the advantages and disadvantages of a census?
Adv- Completely accurate result
Disadv- Time consuming, expensive, cannot be used when testing process destroys item, hard to process as large quantities of data
1.1 What are the advantages and disadvantages of a sample?
Adv- time-efficient, fewer people have to respond, less data to process
Disadv- Not as accurate, sample may not be large enough to give information about sub-groups of the population
1.2 What are the three methods of random sampling?
Simple random sampling - every member has an equal chance of being selected
Systematic sampling - required elements chosen at regular intervals from an ordered list
Stratified sampling - population is divided into mutually exclusive groups (e.g. males & females) and a random sample is taken from each
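The three methods above can be sketched in Python. This is only an illustration: the function names, the `rng` argument, and the choice to round each stratum's share are assumptions, not part of the course material.

```python
import random

def simple_random_sample(population, n, rng):
    # Every member has an equal chance of being selected.
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    # Take every k-th element of the ordered list,
    # starting from a random point among the first k elements.
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n, rng):
    # strata maps a group name to the list of its members.
    # Each group contributes in proportion to its size; the share is
    # rounded, so the total may differ slightly from n in awkward cases.
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        share = round(n * len(members) / total)
        sample.extend(rng.sample(members, share))
    return sample
```

For example, with a population of 100 and n = 10, the systematic sample takes every 10th element, and a 60/40 split of strata gives 6 and 4 members respectively.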
1.2 What are the advantages and disadvantages of simple random sampling?
Adv- No bias, easy & cheap to do for small samples, each sampling unit has an equal chance
Disadv - Not suitable when population is large as time-consuming, sampling frame is needed
1.2 What are the advantages and disadvantages of systematic sampling?
Adv- Simple & quick, suitable for large samples & populations
Disadv- Sampling frame needed, can introduce bias if sampling frame is not random
1.2 What are the advantages and disadvantages of stratified sampling?
Adv- Accurately reflects population structure, guarantees proportional representation of groups
Disadv- Population must be clearly classified into distinct groups (strata), selection within each stratum suffers from the same disadvantages as simple random sampling
1.3 What is quota sampling?
A researcher selects a sample that reflects the characteristics of the whole population
1.3 What is opportunity sampling?
Taking the sample from people who are available at the time of the study and who fit the criteria of the study
1.3 What are the advantages and disadvantages of quota sampling?
Adv- Allows a small sample to be representative of the population, no sampling frame, quick, easy, cheap, easy comparison between different groups
Disadv- Non-random sampling can introduce bias, dividing the population into groups can be costly or inaccurate, increasing the scope of the study increases the no. of groups which is time-consuming
1.3 What are the advantages and disadvantages of opportunity sampling?
Adv- Easy, cheap
Disadv - Unlikely to be representative, highly dependent on individual researcher
1.4 What is the difference between qualitative and quantitative data?
Qualitative - non-numerical observations
Quantitative - numerical observations
1.4 What is the difference between discrete and continuous data?
Discrete - A variable that can only take specific values in a range e.g. shoe size
Continuous - A variable that can take any value in a range e.g. time
1.5 What are the 8 locations in the large data set?
Leuchars, Leeming, Heathrow, Hurn, Camborne, Beijing, Jacksonville, Perth
1.5 What are the following measured in? Daily mean temp, daily total rainfall, daily total sunshine, daily mean wind direction and windspeed, daily max gust, daily max relative humidity, daily cloud cover, daily mean visibility, daily mean pressure
Daily mean temp - degrees Celsius (1dp)
Daily total rainfall - mm (1dp)
Daily total sunshine - hours (to the nearest tenth of an hour)
Daily mean wind direction - Cardinal directions
Daily mean windspeed - Knots (1kn = 1.15mph)
Daily max gust - knots
Daily max relative humidity - percentage of air saturation (%)
Daily cloud cover - oktas (eighths of the sky covered)
Daily mean visibility - Decametres (Dm)
Daily mean pressure - Hectopascals (hPa)
1.5 What time periods are used in the Large Data Set?
May-October 1987 & 2015
2.1 What is the formula you can use to calculate the mean from a set of data?
x̄ = (Σx)/n where x bar is the mean, x is each data value, and n is the number of data values
2.1 What is the formula you can use to calculate the mean from a frequency table?
x̄ = (Σxf)/(Σf) where x bar is the mean, x is each data value, and f is each frequency
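As a quick check of the two mean formulas, here is a minimal Python sketch. The function names are illustrative assumptions.

```python
def mean(values):
    # x-bar = (sum of x) / n
    return sum(values) / len(values)

def mean_from_frequency_table(values, frequencies):
    # x-bar = (sum of x*f) / (sum of f)
    total_xf = sum(x * f for x, f in zip(values, frequencies))
    return total_xf / sum(frequencies)
```

For the values 1, 2, 3 with frequencies 2, 3, 5, the second function computes (2 + 6 + 15) / 10 = 2.3.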
2.2 How do you find the upper and lower quartiles for discrete data?
LQ: divide n by 4, if a whole number then the LQ is halfway between this data point and the one above, if a decimal then round up and take that data point
UQ: find 3/4 of n, if a whole number then the UQ is halfway between this data point and the one above, if a decimal then round up and take that data point
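The quartile rule for discrete data can be sketched as follows. The function name `quartile` and its argument `q` (1 for the LQ, 3 for the UQ) are assumptions made for illustration, and the sketch assumes at least four data values.

```python
import math

def quartile(data, q):
    # q = 1 gives the lower quartile, q = 3 the upper quartile.
    values = sorted(data)
    n = len(values)
    if (q * n) % 4 == 0:
        # Whole number: halfway between this data point and the one above.
        k = q * n // 4
        return (values[k - 1] + values[k]) / 2
    # Decimal: round up and take that data point (positions count from 1).
    k = math.ceil(q * n / 4)
    return values[k - 1]
```

With 8 values, n/4 = 2 is whole, so the LQ is halfway between the 2nd and 3rd values. With 7 values, n/4 = 1.75 rounds up, so the LQ is the 2nd value.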
2.2 What is interpolation used for and how do you do it?
Used to find the median, quartiles, or percentiles of a grouped frequency table, assuming data values are distributed evenly within each class
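Median = LB + ((n-a)/(b-a) x class width) where LB is the lower bound of the median class, n is the position of the median (half the total frequency), a is the cumulative frequency at the lower bound of the class and b is the cumulative frequency at its upper bound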
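A minimal Python sketch of interpolation on a grouped frequency table. It assumes each class is given as (lower bound, upper bound, frequency) in ascending order, and that values are spread evenly within each class. The function names are illustrative.

```python
def interpolate_position(classes, position):
    # classes: list of (lower_bound, upper_bound, frequency), ascending.
    # position: how far through the data to look, e.g. n/2 for the median.
    cumulative = 0  # cumulative frequency at the lower bound of the class
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= position:
            # Fraction of the way through this class, scaled by class width.
            return lower + (position - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("position lies beyond the total frequency")

def grouped_median(classes):
    n = sum(freq for _, _, freq in classes)
    return interpolate_position(classes, n / 2)
```

For classes 0-10, 10-20, 20-30 with frequencies 5, 10, 5, the median position is 10, which falls in the 10-20 class: 10 + ((10 - 5)/(15 - 5)) x 10 = 15. Quartiles and percentiles use the same function with a different position, such as n/4.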
2.3 What is the range, IQR, and interpercentile range?
Range - difference between largest and smallest values
IQR - difference between upper and lower quartiles
Interpercentile range - difference between two given percentiles
2.4 Give the formula for variance
((Σx^2)/n)-((Σx)/n)^2
2.4 Give the formula for standard deviation
sqrt(((Σx^2)/n)-((Σx)/n)^2)
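Both formulas can be checked with a short Python sketch. The function names are assumptions; the calculation follows the "mean of the squares minus the square of the mean" form given on the cards.

```python
import math

def variance(values):
    # (sum of x^2)/n - ((sum of x)/n)^2
    n = len(values)
    mean_of_squares = sum(x * x for x in values) / n
    square_of_mean = (sum(values) / n) ** 2
    return mean_of_squares - square_of_mean

def standard_deviation(values):
    # Square root of the variance.
    return math.sqrt(variance(values))
```

For the data 2, 4, 4, 4, 5, 5, 7, 9: Σx = 40 and Σx^2 = 232, so the variance is 232/8 - 5^2 = 4 and the standard deviation is 2.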