Midterm Flashcards
(56 cards)
Descriptive statistics
just a way to describe the data - charts and graphs
Inferential statistics
allows you to make predictions from the data
- raw data put through statistical tests to come up with a conclusion about a population
- allows generalizations from a sample group
nominal level of measurement
named categories
-yes or no, race, gender, country, ethnicity, hair color
ordinal level of measurement
categorical data that’s ranked or ordered - has innate order
example: good, fair, poor; strongly agree to strongly disagree
* Weight could also be ordinal like boxes with >50kg, <50kg, etc.
interval level of measurement
equal distance b/w each value
Difference b/w 30 and 35 degrees Celsius is the same measurement as b/w 40-45 degrees Celsius
ratio level of measurement
same attributes as interval measurement, but has absolute 0 and no negative values
*example: length in cm - can be zero but can't be negative
Interval and ratio can be difficult to distinguish b/w (she will not make us do this)
Weight could be ordinal and ratio
Data classified as discrete
Can take on one value out of a limited number of options
• Number of kids (1,2,3, etc. but can’t have 1.2)
• Heart rate, number of pregnancies, number of hospital admissions, number of students in a class, shoe sizes, number of questions answered correctly
Dichotomous – a specific discrete variable where there are only two values
• Gender (M or F) Limited number of options (only 2 options)
Under 65 or over 65 would also be dichotomous discrete and ordinal…wouldn’t be interval or ratio because there is not an equal distance b/w each
Yes or no (dichotomous, nominal)
*not all dichotomous data is nominal: under 65 / over 65 has an order (ordinal), yes / no does not (nominal)
Data classified as continuous
Can take on any number w/in a range
Only limited by precision of measurement tool used
Look at height – only cm marks 158cm tall, 159 cm tall but height really could be more precise
o Some tests you can only use discrete or continuous – be able to pick this out in a study
*ex. stadiometer can measure height to the 1/10 or the 1/100
central tendency
measured with what level of data?
gives you typical value (average), and three ways you can determine this are mean, median, and mode. Gives us a point value, one number that represents the whole data set.
Level of data depends on the measure: mean needs interval or ratio level data, median also works with ordinal, mode works with any level
Mean can only be calculated with?
mean is the average - doesn’t work with categories, and can only be calculated with interval or ratio level data
Median can be calculated with?
middle
calculated with ordinal, interval, ratio
Mode can be calculated with?
any type of data
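A minimal sketch of the three measures using Python's built-in statistics module. The numbers and hair colors are made-up illustration data, not from the course.

```python
from statistics import mean, median, multimode

# Hypothetical ratio-level data (illustration only).
scores = [9, 3, 2, 6, 7, 8, 7, 5]

score_mean = mean(scores)        # mean: interval or ratio data only
score_median = median(scores)    # median: ordinal, interval, or ratio
score_modes = multimode(scores)  # mode: any level of data

# Mode also works on nominal data, because it only counts categories.
hair = ["brown", "black", "brown", "blond"]  # hypothetical categories
hair_modes = multimode(hair)

print(score_mean, score_median, score_modes, hair_modes)
```

multimode returns a list because a data set can have more than one mode.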
Dispersion/variability
how closely the numbers cluster around the mean, median, and mode
aka variability - the range and spread of the data around the center
Gives information about the spread of scores and indicates how well a measure of central tendency represents the “middle/average” value in the data set
So, you will often see a mean reported and then a standard deviation (so this is using central tendency & dispersion).
Variance, Standard deviation - Under DISPERSION
What level of data do you typically see these with?
What does a small SD tell you?
Variance and standard deviation you will often see with interval or ratio level data. Don’t need to be able to calculate a standard deviation. She wants us to be able to look at one and figure out what it is telling us. So, if you have an article that tells you the standard deviation of something and it’s really small, that tells you that all the data points cluster closely together and are all close to the mean.
Variance and standard deviation are related. The square root of variance is standard deviation.
Variance
the average squared difference b/w the data values and the mean of a data set
*describes how far each point sits from the mean (the average of all data points), in squared units
SD
standard deviation = the avg amount that data values will vary from the mean - how closely values are clustered
*example: SD small - data is close together and variance is small; if SD is large then there is more variability in the data, so it’s more spread out
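A rough sketch of how variance and SD relate, using two made-up data sets that share the same mean. It assumes the population formulas (divide by n); statistics.variance and statistics.stdev divide by n - 1 instead, which gives slightly larger numbers.

```python
from math import isclose, sqrt
from statistics import pstdev, pvariance

# Hypothetical data sets, both with a mean of 50.
tight = [49, 50, 50, 51]    # values cluster near the mean
spread = [20, 40, 60, 80]   # values sit far from the mean

# Population variance: average of the squared deviations from the mean.
tight_var = pvariance(tight)
spread_var = pvariance(spread)

# Population SD: square root of the variance.
tight_sd = pstdev(tight)
spread_sd = pstdev(spread)

print(f"tight:  variance={tight_var:.2f}, SD={tight_sd:.2f}")
print(f"spread: variance={spread_var:.2f}, SD={spread_sd:.2f}")
print("SD is sqrt(variance):", isclose(tight_sd, sqrt(tight_var)))
```

The tight set has the smaller SD, which is the "data points cluster closely together" reading of a small SD.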
Range
Very simple measure of dispersion
Calculate by taking the max value in the data set and subtracting the minimum value from it = range. The smaller the range, the closer together the data set is, and the more clustered and less variable.
Ex: 9,3,2,6,7,8,7,5 so 9-2 = 7 = RANGE
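The same example as a quick Python check (variable names are just for illustration):

```python
# Data from the example on this card.
values = [9, 3, 2, 6, 7, 8, 7, 5]

# Range = maximum value minus minimum value.
data_range = max(values) - min(values)

print(data_range)
```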
Interquartile range
difference b/w the 75th percentile and the 25th percentile
ex. 1,1,2,2,2,3,3,3,4,4,5: median is 3, 25th percentile is 2 and 75th percentile is 4, so 4-2 = 2, which is the interquartile range
The range and interquartile range provide a rough estimate of the variability of a data set but don’t use all of the data values
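A sketch of the interquartile range example using statistics.quantiles. This assumes the module's default ("exclusive") quartile method; other software can use slightly different percentile rules and give slightly different quartiles for the same data.

```python
from statistics import median, quantiles

# Data from the interquartile range example above.
values = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(values, n=4)

iqr = q3 - q1                  # 75th percentile minus 25th percentile
median_value = median(values)  # the middle value, same as q2 here

print(q1, q3, iqr, median_value)
```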
Frequency distribution - kurtosis of curve leptokurtic
thin - peaked curve
shows what continuous data looks like
Frequency distribution - kurtosis of curve mesokurtic
more normal curve
*shows what continuous data looks like
Frequency distribution - kurtosis of curve platykurtic
flat curve
*shows what continuous data looks like
Describe the normal attributes of a normal distribution - bell curve
frequency distribution of data in which the data values are symmetrically distributed around the center; normal bell curve; mean, median, and mode equal; symmetrical - not skewed
68% - within 1 SD of mean
95% - within 2 SD of mean
99.7% - within 3 SD of mean
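These percentages can be checked against a theoretical normal curve. A small sketch using statistics.NormalDist; the helper name share_within is made up for illustration. It should give roughly 68%, 95%, and 99.7%.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of mean: {share_within(k):.1%}")
```

The result is the same for any mean and SD, because the rule is stated in SD units.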
Kurtosis
measure of how peaked/flat a distribution is
Skewness
measure of whether the set is symmetrical or off center
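A rough sketch of how skewness and kurtosis can be put into numbers, assuming the simple moment-based formulas (statistics packages often apply small-sample corrections, so their output can differ a little). The function names and data sets are made up for illustration.

```python
from statistics import fmean

def central_moment(values, order):
    """Average of (value - mean) raised to the given power."""
    center = fmean(values)
    return fmean([(x - center) ** order for x in values])

def skewness(values):
    """About 0 = symmetrical, positive = tail to the right, negative = tail to the left."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """About 0 = mesokurtic, positive = leptokurtic, negative = platykurtic."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]      # hypothetical symmetrical data
right_skewed = [1, 1, 1, 2, 10]  # hypothetical data with one high value

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```

The evenly spread set comes out with a skewness of about 0 and a negative excess kurtosis (flat), while the set with one high value has a positive skewness.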