Lecture Three: Descriptive Data Flashcards
(16 cards)
What are descriptive statistics?
They summarise and organise data so it's easier to understand.
What do descriptive statistics measure?
1. Center of data: averages such as the mean, median, and mode.
2. Spread of data: measures of dispersion such as range, variance, and standard deviation.
3. Shape of distribution: how the data is distributed (symmetry, skewness, kurtosis).
Describe what symmetrical, skewed, and kurtosis mean.
1. Symmetrical: no skew; the data is equal on both sides of the center.
2. Skewed: more data on one side, with a long tail on the other side.
3. Kurtosis: how sharp or flat the peak of a distribution is.
3 reasons why descriptive data is useful:
Helps summarise large datasets.
Makes it easy to identify patterns and trends.
Provides a quick snapshot of data without complex calculations.
What is a distribution?
A distribution shows how the values in a dataset are spread out or organised.
Why is it useful?
Helps visualise trends in data.
Shows if data is skewed.
Helps in predicting future patterns.
1. Key measures in descriptive statistics
Center of a distribution:
Measures the average value in a dataset.
Examples: mean, median, mode.
2. Key measures in descriptive statistics
Spread of a distribution:
Shows how data is spread out or varies.
Examples: range, variance, standard deviation.
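The three spread measures above can be checked on a small dataset. This is a minimal sketch using Python's standard `statistics` module; the numbers are made up for illustration, and it assumes the dataset is treated as a whole population (dividing by N) unless stated otherwise.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data

# Range: largest value minus smallest value
data_range = max(values) - min(values)

# Population variance: average squared distance from the mean
pop_variance = statistics.pvariance(values)

# Population standard deviation: square root of the variance
pop_std_dev = statistics.pstdev(values)

# Sample variance divides by (N - 1) instead of N, so it is slightly larger
sample_variance = statistics.variance(values)

print("range:", data_range)
print("population variance:", pop_variance)
print("population standard deviation:", pop_std_dev)
print("sample variance:", sample_variance)
```

For this data the mean is 5, so the range is 7, the population variance is 4, and the population standard deviation is 2. Which variance to use (population or sample) depends on whether the data is the whole group or a sample from it.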
3. Key measures in descriptive statistics
Shape of a distribution:
Describes whether the data is symmetrical or skewed.
Includes skewness (asymmetry) and kurtosis (peak sharpness).
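Skewness and kurtosis can also be computed by hand. The sketch below assumes one common definition (the average of cubed or fourth-power z-scores, using the population standard deviation); other textbooks and software use slightly different formulas. The helper names and datasets are made up for illustration.

```python
import statistics

def z_scores(values):
    """Distance of each value from the mean, in standard deviations."""
    mean = statistics.fmean(values)
    std_dev = statistics.pstdev(values)
    return [(v - mean) / std_dev for v in values]

def skewness(values):
    """Average cubed z-score: about 0 if symmetric, positive if the long tail is on the right."""
    return sum(z ** 3 for z in z_scores(values)) / len(values)

def excess_kurtosis(values):
    """Average fourth-power z-score minus 3 (a normal distribution scores about 0)."""
    return sum(z ** 4 for z in z_scores(values)) / len(values) - 3

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 20]

print("symmetric skewness:", skewness(symmetric))        # very close to 0
print("right-skewed skewness:", skewness(right_skewed))  # positive
print("symmetric excess kurtosis:", excess_kurtosis(symmetric))  # negative: flatter than normal
```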
1. Central tendency (average): Mean
Mean (arithmetic average)
Calculated as:
mean = sum of all values / total number of values
2. Central tendency: Median
Median (middle value)
The middle value when the data is arranged in ascending order. With an even number of values, it is the average of the two middle values.
3. Central tendency: Mode
Mode (most frequent value)
The value that appears most often in a dataset.
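A quick worked example of the three averages, as a minimal sketch with Python's standard `statistics` module. The scores are made up for illustration.

```python
import statistics

scores = [3, 5, 5, 6, 8, 9]  # made-up example data, already in ascending order

# Mean: sum of all values / total number of values -> 36 / 6
mean = sum(scores) / len(scores)

# Median: six values, so it is the average of the two middle values (5 and 6)
median = statistics.median(scores)

# Mode: the value that appears most often (5 appears twice)
mode = statistics.mode(scores)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

Here the mean is 6.0, the median is 5.5, and the mode is 5, so the three averages do not have to agree even on a small dataset.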
Choosing the right average
Depends on the shape of the distribution:
Symmetrical data: the mean is best.
Skewed data: the median is better, because extreme values pull the mean toward the tail.
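To see why the median holds up better with skewed data, here is a small sketch comparing both averages on a dataset with one extreme value. The salary figures are hypothetical (in thousands) and chosen only to make the effect obvious.

```python
import statistics

# Hypothetical salaries in thousands; one very high value skews the data to the right
salaries = [28, 30, 31, 33, 35, 250]

mean = statistics.mean(salaries)      # pulled up by the single value of 250
median = statistics.median(salaries)  # average of the two middle values, 31 and 33

print("mean:", round(mean, 1))
print("median:", median)
```

The mean comes out near 67.8, which is higher than five of the six salaries, while the median of 32 still describes a typical value.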
What are percentiles and quartiles?
They divide an ordered dataset into equal parts to help understand how values are spread.
Percentiles
The ordered dataset is divided into 100 equal sections. The nth percentile is the value below which n% of the data falls.
P1 = 1%, up to P100 = 100%
Quartiles
Split ordered data into 4 equal parts.
Q1 (25%): lower quartile
Q2 (50%): median
Q3 (75%): upper quartile
Q4 (100%): maximum value
Helps understand how data is spread.
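Quartiles and percentiles can be computed with `statistics.quantiles` (Python 3.8 or later). This is a sketch on made-up data; it assumes the "inclusive" method, and other methods or software can give slightly different cut points for the same data.

```python
import statistics

values = [2, 4, 4, 5, 7, 8, 9, 11, 12]  # made-up example data

# n=4 returns the three cut points Q1, Q2 (median), Q3
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Interquartile range: spread of the middle 50% of the data
iqr = q3 - q1

# n=100 returns 99 cut points; index 89 is the 90th percentile
percentiles = statistics.quantiles(values, n=100, method="inclusive")
p90 = percentiles[89]

print("Q1:", q1, "Q2:", q2, "Q3:", q3)
print("IQR:", iqr)
print("90th percentile:", p90)
```

Q2 always matches the median, and the maximum value (here 12) marks the 100% point.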