Statistics Flashcards
what are three reasons you would use a sample instead of the population data?
1. population data is not available
2. population data is available but is so large it would be very difficult to analyse
3. sample data is quicker to collect and analyse
variance symbol
σ²
sigma squared
discrete vs continuous variable
discrete is countable (amount in bank account), continuous is measurable (time)
e.g. age is a continuous variable because you are 25 years, 40 days, 4 hours, 3 minutes, 2 seconds, and 23 picoseconds old. You can make age discrete by limiting it to your age in whole years.
variance definition
average of the squared differences from the mean
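a minimal sketch of this definition in Python. The function name and the data values are made up for illustration. It divides by the number of values, which matches the population variance σ²; the sample variance divides by n - 1 instead.

```python
def population_variance(values):
    """Average of the squared differences from the mean (divides by N)."""
    mean = sum(values) / len(values)
    # Square each difference from the mean so every term is non-negative.
    squared_diffs = [(x - mean) ** 2 for x in values]
    return sum(squared_diffs) / len(squared_diffs)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative values, mean is 5
variance = population_variance(data)
print(variance)  # 4.0
```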
vector (in R)
a basic data structure that stores an ordered sequence of values of the same type (e.g. all numbers or all text)
parameter
a characteristic of a population
statistic
a characteristic of a sample
vs a parameter, which is a characteristic of a population
4 levels of measurement
- nominal : categories, not ordered e.g. race 
- ordinal : ordered but differences are meaningless e.g. rank in a race, 1st, 2nd, 3rd
- interval : ordered and differences are meaningful but there is no natural zero e.g. temperature in °C or °F
- ratio : interval measurements where there is a natural zero e.g. money
why do you square every difference from the mean when calculating variance?
to make them all non-negative, so that positive and negative differences cannot cancel out
e.g. differences of 5 and -5 would cancel each other out, so the variance would be 0.
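a small sketch of the cancellation, using two made-up values whose differences from the mean are -5 and 5. The variable names are illustrative only.

```python
values = [0, 10]  # made-up data; the mean is 5
mean = sum(values) / len(values)

# Differences from the mean: one negative, one positive.
deviations = [x - mean for x in values]

# Averaging the raw differences lets them cancel out.
raw_average = sum(deviations) / len(deviations)

# Squaring first keeps every term non-negative, so nothing cancels.
squared_average = sum(d ** 2 for d in deviations) / len(deviations)

print(deviations)       # [-5.0, 5.0]
print(raw_average)      # 0.0
print(squared_average)  # 25.0
```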