Lecture 1-4 Flashcards
What do you have be to double-check when reading values in histogram and bar graphs(clustered, stacked, etc)?
If the values are row, column totals, or sample totals(n)
What are charts that display bivariate relationships?
Panelled pie charts, panelled histograms, stacked/ clustered bar graphs,
What characteristics do descriptive data focus on?
Centre pf data-main
Spread of data-average
Shape of data-ratio level variables only
What characteristics do descriptive data focus on?
Centre pf data-median
Spread of data- mean average
Shape of data-ratio level variables only
What does descriptive stats depend on?
Level of measurement for variables
Descriptive stats- define and describe the difference between mode, median and mean
Mode: most common.
Median: the middle of the data. “50th percentile”
Mean: average of data.
Which level of measurement is used for which description, and why?
4 leavels of measurement from least to most accurate:
Dictonomous: Mode
Nominal: Mode
Ordinal: Mode, Median
Ratio: Mode, Median, Mean
Ordinal variables: can’t do Mean because can’t apply mathmatical formula
Ratio: can do Mean beacuse can do math formula: total of values/ total of sample
What does cell frequency mean?
“Hard counts” of a cell
What does IAP mean?
Not applicable
In frequency percentage, What is the difference between percent vs. valid percent vs. culumulative?
Valid percent excludes missing data.
Culumulative percent: culumulating previous valid percent and helps to find Median of data
What is the equation for mean? What does each of the symbol mean?
Mean= Sigma X/ n
How to find median in ordinal variables?
By stacking values in hierachy. culumulative percent. Cannot locate the exact value, but only rough
What does measures of variation tell us about data? How is it different from central tendencies?
Measures of variation tell us - how spread out the data is. It is important because
What is a range?
Range: difference b/w max and min
What is the Inter-Quartile Range and why is it important?
-Distance between the 25th and 75th percentile, excluding the top and bottom 25 percentile
-Important because it is an effective way to avoid calculating “outliers” if there are a lot present
What is the level of measurement data that range has the best use for?
Ratio level variables. Sometimes also ordinal if it has a large range.
What is a boxplot?
In SPSS, a visual graph that displays the RANGE- median, minimum, maximum, IQR, and flags extreme values
What is Standard Variation and why is it important?
- Standard Variation(SD) is a mathmatical formula calculating how spread out the data is.
- SD relies on histogram graph curve to show how spread out the data is
- SD is ONLY calculated for ratio level data.
- SD is compared in relation with the Mean, which has a value of 0
- SD values include +/-1,2,3
What you must do when crafting a histogram?
Add the overlaying CURVE
What is the equation for sample SD?
S = √∑ (X - M) 2 / n - 1
What does Sigma mean?
culumative or sum of something
What is a Normal Distribution on histogram and why is it important? What are the important characteristics of ND?
- A set of Data that has a symmetrical curve on a histogram
- Mean, median and mode should be roughly the same
- “unimodal”- one value of mode
- A lot of types of statstics need the Normal Distribution in order to be used.
What are the Standard Deviations in a normal distribution?
- 68% with +-1 SDs
- 95% within +-2 SDs
- 99% within +-3 SDs
How do you interpret and explain SD
i.e. 50% fall between +-1 SD AWAY from the mean
What is a Z Score and what is the mathmatical formula calculation
Standard Deviation expressed in Units of whole number values
Value indicates how Far away from the Mean a particular “case” is
basically Positive/Negative Z score indicate if case is above or below Mean
Z =(Case Value - Mean)/ Standard Deviation Value
Skewed Distrbution or “SKEW”
WHy does it happen?
- Data set has too many positive or negative Outliers “skew” the distribution
- Positive or negative outliers mean outliers that either have positive or negative Standard Deviation Value
What is a Kurtosis? How is it different from SKEW?
What are the descriptive termsof Kurtosis?
Kurtosis shows on the Y axis of the histogram. Skew shows on the X axis.
In another words, How “Long/peaky or flat/plateaued” the shape is
-Long/peaky: Leptokurtic
-Flattened/Plataeued: Platylkurtic
-
How are Standard Deviation and Z Scores different?
Essentially
Std. Dev measures the deviation Score of the WHOLE sample or population
Z score measures the deviation Score of a SPECIFIC data (like an individual)
Standard Deviation is a measurement that describes the Overall Shape of the Dataset. (fat, skinny, skewed, kurtosis), as opposed to a Specific data value.
Z score is a Unit of Measurement like the metric system. It is used to show a Specific Data Value’s standard deviation on the distribution curve.