01. Flow Metrics Basics Flashcards
(54 cards)
What are the different types of Averages?
- Mean
- Median
- Mode
What is an average?
- It is the value that is most representative/typical of the dataset.
How do you calculate the mean?
Add all the numbers together, and then divide by how many numbers there are.
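A minimal Python sketch of the calculation described above. The function name and the flow-time values are illustrative assumptions, not part of the original card.

```python
def mean(values):
    """Add all the numbers together, then divide by how many there are."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


# Hypothetical flow times in days (made-up values for illustration).
flow_times = [2, 4, 4, 5, 10]
print(mean(flow_times))  # (2 + 4 + 4 + 5 + 10) / 5 = 5.0
```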
What are outliers?
Extremely high or low values that stand out from the rest of the dataset.
How do outliers change the representation of the dataset?
- They skew the representation in the direction of the outliers.
- High outliers “pull” the data to the right; low outliers “pull” it to the left.
How do outliers affect the mean?
Outliers pull the mean higher or lower, skewing the data to the left or right.
When outliers are present, the mean may not give the best representation of what a typical value is.
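To illustrate, here is a small Python sketch using made-up flow times (an assumption for demonstration only). Adding a single high outlier moves the mean well away from where most of the values sit.

```python
from statistics import mean

# Hypothetical flow times in days; most items take 3 to 6 days.
typical = [3, 4, 4, 5, 6]
# The same data with one extremely high value added.
with_outlier = typical + [40]

print(mean(typical))       # 22 / 5 = 4.4
print(mean(with_outlier))  # 62 / 6, about 10.3, higher than every typical value
```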
What does it mean if the data is skewed to the right?
Data that is skewed to the right has a “tail” of high outliers that trail off to the right.
What does it mean if the data is skewed to the left?
Data that is skewed to the left has a “tail” of low outliers that trail off to the left.
What does it mean if the data is symmetric?
If the data is symmetric, the Mean, Median and Mode are in the middle.
No outliers pull the mean in either direction, and the data has the same shape on either side of the centre.
Which averages are not affected as much by outliers?
- Median
- Mode
How is the Median found?
- Line up all the values in ascending order.
- If there are an odd number of values, the median is the one in the middle.
- If there are an even number of values, add the two middle ones together and divide by two.
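The steps above can be sketched in Python as follows. The function name and sample values are assumptions chosen for illustration.

```python
def median(values):
    """Find the median by sorting and taking the middle value(s)."""
    if not values:
        raise ValueError("the median of an empty dataset is undefined")
    ordered = sorted(values)  # line up the values in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: the median is the one in the middle.
        return ordered[mid]
    # Even number of values: add the two middle ones and divide by two.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # sorted: 1, 5, 7 -> 5
print(median([7, 1, 5, 3]))  # sorted: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```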
Where are the mean and median in a right-skewed dataset?
If the data is skewed to the right, the mean is to the right of the median (higher).
What is the Mode?
- The mode of a set of data is the most popular value, the value with the highest frequency
- The mode has to be in the data set.
- It’s the only average that works with categorical data.
Where are the mean and median in a left-skewed dataset?
If the data is skewed to the left, the mean is to the left of the median (lower).
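A short Python check of how the mean sits relative to the median in skewed data, using two made-up datasets (both are assumptions for illustration). The right-skewed set has one high outlier and the left-skewed set has one low outlier.

```python
from statistics import mean, median

# Hypothetical data with a tail of high values (skewed to the right).
right_skewed = [1, 2, 2, 3, 3, 4, 20]
# Hypothetical data with a tail of low values (skewed to the left).
left_skewed = [1, 17, 18, 18, 19, 19, 20]

# Right skew: the mean (35 / 7 = 5) is higher than the median (3).
print(mean(right_skewed) > median(right_skewed))

# Left skew: the mean (112 / 7 = 16) is lower than the median (18).
print(mean(left_skewed) < median(left_skewed))
```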
Can a dataset have more than one Mode?
- If there is more than one value with the highest frequency, then each one of these values is a mode.
- If the data appears to represent more than one trend or set of data, we can provide a mode for each set.
- A dataset with two modes is called bimodal.
What are the three steps for finding the mode?
- Find all the distinct categories or values in your data set.
- Write down the frequency of each value or category.
- Pick the one(s) with the highest frequency to get the mode.
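The three steps can be sketched in Python with `collections.Counter`. The function name `modes` and the sample data are assumptions for illustration. The function returns a list because a dataset can have more than one mode, and it also works with categorical data.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency."""
    if not values:
        return []
    # Steps 1 and 2: find the distinct values and count the frequency of each.
    counts = Counter(values)
    # Step 3: pick the one(s) with the highest frequency.
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


# Categorical data (hypothetical work item types).
print(modes(["bug", "feature", "bug", "chore"]))  # ['bug']
# Two values tie for the highest frequency, so both are modes.
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```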
Flow metrics
What metrics can be used when analysing flow?
- Range
- Quartiles
- Percentiles
- MAD - Median of the absolute deviations (Median Absolute Deviation)
Define Range.
- We can use the range to measure variability / spread.
- The range lets us know how the data varies
- The more variability, the less predictable the source of the data is
What is the use of range in flow analysis?
It shows the full spread / width of flow times.
How do we measure the range?
- The range measures the spread/width of a dataset.
- It’s given by: range = upper bound - lower bound.
- Where the upper bound is the highest value, and the lower bound is the lowest value.
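A minimal Python sketch of the range calculation. The name `value_range` and the flow-time values are assumptions for illustration.

```python
def value_range(values):
    """Range = upper bound (highest value) - lower bound (lowest value)."""
    if not values:
        raise ValueError("the range of an empty dataset is undefined")
    return max(values) - min(values)


# Hypothetical flow times in days.
flow_times = [2, 4, 4, 5, 10]
print(value_range(flow_times))  # 10 - 2 = 8
```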
What are the advantages of range?
- Simple to calculate
- Highlights extreme variation
What are the disadvantages of range?
- Sensitive to outliers
- It doesn’t tell whether the values are clustered or spread out. It ignores whether values are bunched near the median or scattered all over.
- It can be misleading if you assume it reflects “overall variation.”
- It only uses two data points (the smallest and the largest) and doesn’t care what happens in between.
- A wide range caused by a single extreme value can make the process look less predictable than it really is.
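To see why the range alone can mislead, here is a Python sketch with two made-up datasets that share the same range but are spread very differently. The helper `count_near_median` and its `tolerance` parameter are hypothetical, introduced only for this comparison.

```python
from statistics import median


def value_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)


def count_near_median(values, tolerance=1):
    """Count how many values lie within `tolerance` of the median."""
    centre = median(values)
    return sum(1 for v in values if abs(v - centre) <= tolerance)


# Hypothetical flow times: same smallest and largest value in both sets.
clustered = [1, 5, 5, 5, 5, 5, 9]  # bunched near the median
scattered = [1, 2, 4, 5, 6, 8, 9]  # spread all over

print(value_range(clustered), value_range(scattered))              # 8 8
print(count_near_median(clustered), count_near_median(scattered))  # 5 3
```

Both datasets report a range of 8, yet five of the seven clustered values sit within one day of the median compared with three of the scattered ones.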
When is the range used?
When you want a quick view of variation.