Descriptive Statistics Flashcards
Which comes first? Descriptive or Inferential?
- Descriptive
- Inferential
Descriptive Statistics?
Descriptive statistics is a means of describing features of a data set by generating summaries about data samples.
Inferential Statistics?
Inferential statistics use measurements from the sample of subjects in the experiment to compare the treatment groups and make generalizations about the larger population of subjects.
What are the measures of central tendency?
- Arithmetic Mean
- Median
- Mode
Arithmetic Mean?
Average Value
What is arithmetic mean suitable for?
Suitable for symmetrical distributions
Symmetrical Graph Shape and distribution?
Bell-shaped & Normal Distribution
Asymmetrical Graph Shape and distribution?
Skewed distribution
Example: distribution skewed to the right (positively skewed)
What affects the mean of asymmetrical graphs?
The outliers
Median?
The median is the value in the middle of a data set, meaning that 50% of data points have a value smaller or equal to the median and 50% of data points have a value higher or equal to the median.
Why is median not affected by outliers?
The median is a robust measure: it depends only on the middle position of the ordered data, so it is not affected by outliers.
What is median ideal for?
Asymmetrical Distribution
Why is median ideal for asymmetrical data?
It is a robust measurement and not affected by outliers
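A minimal sketch of the outlier effect described in the cards above, using Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical sample values, chosen only for illustration
data = [2, 3, 4, 5, 6]
with_outlier = data + [100]  # same data plus one extreme value

mean_before = mean(data)              # 20 / 5 = 4
median_before = median(data)          # middle value = 4

mean_after = mean(with_outlier)       # 120 / 6 = 20
median_after = median(with_outlier)   # (4 + 5) / 2 = 4.5

print(mean_before, median_before)
print(mean_after, median_after)
```

In this sample, one outlier moves the mean from 4 to 20, while the median only moves from 4 to 4.5.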
Measures of central tendency?
Central tendency is a descriptive summary of a dataset through a single value that reflects the center of the data distribution. Along with the variability (dispersion) of a dataset, central tendency is a branch of descriptive statistics. It is one of the most fundamental concepts in statistics.
Which measure of central tendency do you use for a/symmetrical data?
Symmetrical data: arithmetic mean
Asymmetrical data: median
Mode?
Most frequent value
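The three measures of central tendency can be computed as sketched below, assuming Python's standard `statistics` module. The list `scores` is a hypothetical sample, not data from the course.

```python
from statistics import mean, median, mode

# Hypothetical sample, chosen only for illustration
scores = [1, 2, 2, 3, 7]

avg = mean(scores)     # sum / count = 15 / 5 = 3
mid = median(scores)   # middle value of the sorted data = 2
most = mode(scores)    # most frequent value = 2

print(avg, mid, most)
```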
Why is mode not affected by outliers?
It is a robust measurement: it depends only on which value occurs most often, not on extreme values
Robust measurement?
Robust measures of scale are methods that quantify the statistical dispersion in a sample of numerical data while resisting outliers. The most common such robust statistics are the interquartile range (IQR) and the median absolute deviation (MAD).
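As a rough illustration of the two robust measures named above, the sketch below computes the IQR and the MAD for a hypothetical sample containing one outlier. It assumes Python 3.8+ (`statistics.quantiles`) and that function's default quartile method; other quartile conventions can give slightly different IQR values.

```python
from statistics import median, quantiles

# Hypothetical sample with one extreme value
data = [1, 2, 3, 4, 5, 6, 7, 8, 100]

# Quartiles using the default method of statistics.quantiles
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1                       # 7.5 - 2.5 = 5.0

# Median absolute deviation: median of the distances from the median
med = median(data)                  # 5
mad = median([abs(x - med) for x in data])  # 2

value_range = max(data) - min(data)  # 99, dominated by the outlier

print(iqr, mad, value_range)
```

For this sample the range is 99 because of the single outlier, while the IQR (5.0) and MAD (2) stay close to the spread of the other values.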
How can we know that a graph is symmetrically distributed?
In a symmetrical distribution, the measures of central tendency will be close together.
In an asymmetrical distribution, the measures of central tendency are spread apart.
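A quick check of this rule of thumb, as a sketch with two hypothetical samples (one symmetrical, one skewed to the right) and Python's standard `statistics` module:

```python
from statistics import mean, median, mode

# Hypothetical samples, chosen only for illustration
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 8, 14]

# Symmetrical sample: mean, median, and mode coincide
sym_stats = (mean(symmetric), median(symmetric), mode(symmetric))

# Right-skewed sample: mean > median > mode
skew_stats = (mean(right_skewed), median(right_skewed), mode(right_skewed))

print(sym_stats)   # (3, 3, 3)
print(skew_stats)  # (4, 2, 1)
```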
Measures of spread?
- Variance and standard deviation
- Range
- Interquartile Range
Variance?
Average squared distance from the mean
Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value.
Why don’t we use variance?
Variance is expressed in squared units, so it is not easy to interpret and apply
Standard Deviation?
Square root of the variance
The standard deviation is the average amount of variability in your data set. It tells you, on average, how far each score lies from the mean.
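The two definitions above can be sketched as follows. The sample `heights_cm` is hypothetical. "Average squared distance from the mean" is taken here to mean the population variance (dividing by n), which is an assumption; the sample variance divides by n - 1 instead (`statistics.variance`).

```python
from statistics import mean, pstdev, pvariance

# Hypothetical measurements in centimetres
heights_cm = [2, 4, 4, 4, 5, 5, 7, 9]

m = mean(heights_cm)  # 40 / 8 = 5

# Variance by hand: average squared distance from the mean (units: cm^2)
manual_var = sum((x - m) ** 2 for x in heights_cm) / len(heights_cm)  # 32 / 8 = 4.0

# Same quantity from the standard library
var = pvariance(heights_cm)  # 4

# Standard deviation: square root of the variance (units: cm)
sd = pstdev(heights_cm)      # 2.0

print(manual_var, var, sd)
```

The variance here is 4 cm², while the standard deviation is 2 cm, in the same units as the data.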
Why is standard deviation ideal to use?
- Is expressed in the same units as the data (no squared units)
- Is affected by outliers (a limitation, since it is not a robust measure)