Ch.4 Numerical Descriptive Techniques Flashcards

(20 cards)

1
Q

Measures of Central Tendency

A

summarize large sets of data with just one number, like an average. This could be the mean (the average you’re most familiar with), the median (the middle number), or the mode (the most frequently occurring number). These measures help us understand and communicate large sets of data more easily.

2
Q

The Mean

A

often referred to as the average, is a single score that provides a typical value for all of the scores in a data set. It is calculated by summing all the values in the data set and then dividing by the number of values.

3
Q

Mean Population Calculation

A

µ = ∑x / N

4
Q

Mean Sample Calculation

A

x̄ or M = ∑x / n
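
A minimal Python sketch of the sample-mean formula. The scores and the variable names are made up for illustration and are not part of the deck.

```python
from statistics import mean

# Hypothetical sample of n = 5 scores (illustrative data only)
scores = [4, 8, 6, 5, 7]

# x̄ = ∑x / n, computed by hand
x_bar = sum(scores) / len(scores)

# The standard library's mean() should agree with the hand calculation
library_mean = mean(scores)

print(x_bar)  # 6.0
```

The population mean µ uses the same arithmetic; only the symbols change (N for the population size instead of n).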

5
Q

The Median

A

is the measure of central tendency that represents the midpoint of a distribution. It is particularly useful in skewed data sets or when there are a lot of outliers.

6
Q

To Find the Median,

A
  1. Put the data in order
  2. Find the middle score.
  3. If the data set has an even number of scores, the median is the average of the two middle scores.
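
The three steps above can be sketched in Python as follows. `find_median` is a hypothetical helper name and the scores are invented for illustration.

```python
from statistics import median

def find_median(scores):
    """Median via the three steps on this card (hypothetical helper)."""
    ordered = sorted(scores)      # 1. put the data in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # 2. odd count: take the middle score
        return ordered[mid]
    # 3. even count: average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_scores = [7, 3, 9, 1, 5]        # ordered: 1, 3, 5, 7, 9
even_scores = [7, 3, 9, 1, 5, 11]   # ordered: 1, 3, 5, 7, 9, 11

print(find_median(odd_scores))    # 5
print(find_median(even_scores))   # 6.0
```

Python's built-in `statistics.median` follows the same rule, so it can be used to check the helper.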
7
Q

The Mode

A

is the most common value in a data set and is easily identifiable as the value with the highest frequency in a frequency distribution table, or the tallest bar in a histogram. Unlike the mean and the median (which can fall between scores), the mode must be a score in your data set.
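
As a small illustration, the mode can be found in Python with `statistics.multimode`, which returns every value tied for the highest frequency. The data below are invented for the example.

```python
from statistics import multimode

# Hypothetical scores: 7 occurs three times, more than any other value
scores = [2, 3, 3, 5, 7, 7, 7, 9]
modes = multimode(scores)

# A data set can have more than one mode when frequencies tie
tied_scores = [1, 1, 2, 2, 3]
tied_modes = multimode(tied_scores)

print(modes)       # [7]
print(tied_modes)  # [1, 2]
```

In both cases every mode returned is a value that actually appears in the data.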

8
Q

The median is the preferred measure of central tendency in two main scenarios:

A
  1. Data containing extreme outliers or heavily skewed data
    With extreme outliers or heavily skewed data, the median provides a more accurate representation of the data set than the mean.
  2. Ordinal data, meaning the values can be ranked in order but the distances between them are not numerically meaningful.
    With ordinal data, the median allows for a measure of central tendency when a numerical sum (and therefore the mean) cannot be calculated.
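
To illustrate the first scenario, here is a short Python sketch with made-up incomes (in thousands). The figures are assumptions chosen only to show how one extreme value moves the mean but barely moves the median.

```python
from statistics import mean, median

# Hypothetical incomes in thousands (illustrative data only)
incomes = [30, 32, 35, 38, 40]
with_outlier = incomes + [500]   # add one extreme value

print(mean(incomes), median(incomes))            # 35 35
print(mean(with_outlier), median(with_outlier))  # 112.5 36.5
```

The single outlier more than triples the mean, while the median shifts only from 35 to 36.5.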
9
Q

The Geometric Mean

A

is a type of average that is calculated by multiplying all the numbers in a set together, then taking the nth root, where ‘n’ is the total number of values. It’s especially useful in situations involving proportional growth or rates of return, but can only be used with positive numbers.

10
Q

Geometric Mean Formula

A

μ_geometric = [(1 + R1)(1 + R2)…(1 + Rn)]^(1/n) − 1
where R1…Rn are the returns of an asset (or other observations for averaging).
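
A minimal Python sketch of the geometric mean of returns, compared with the arithmetic mean. The three yearly returns are hypothetical values chosen for illustration.

```python
import math

# Hypothetical yearly returns: +10%, -5%, +20% (illustrative data only)
returns = [0.10, -0.05, 0.20]
n = len(returns)

# Multiply the growth factors (1 + R1)(1 + R2)...(1 + Rn)
growth = math.prod(1 + r for r in returns)

# Take the nth root and subtract 1
geometric = growth ** (1 / n) - 1

# Arithmetic mean of the same returns, for comparison
arithmetic = sum(returns) / n

print(round(geometric, 4), round(arithmetic, 4))
```

Compounding the geometric mean for n periods reproduces the total growth, which is why it suits average rates of return; for these returns it comes out below the arithmetic mean.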

11
Q

The Geometric Mean may be more accurate than the arithmetic mean for calculating average rates of return, growth rates, or compounding interest rates. However,

A

for estimating future rates of return, the arithmetic mean is more appropriate.

12
Q

Variability measures

A

such as range, variance, and standard deviation provide additional insight into how spread out the data are. Variance and standard deviation measure that spread around the mean.

13
Q

The Range

A

is the simplest measure of variability: the difference between the maximum and minimum values in the data. That simplicity is also its limitation, as it considers only the two extreme values and ignores the rest of the data.

14
Q

Range formula

A

range = largest observation - smallest observation

  • does not consider all data in its calculation
  • very sensitive to extreme values in data
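
A short Python sketch of the range formula and its sensitivity to one extreme value. The observations are invented for illustration.

```python
# Hypothetical observations (illustrative data only)
scores = [12, 15, 11, 18, 14]
data_range = max(scores) - min(scores)   # 18 - 11

# One extreme value changes the range dramatically
with_outlier = scores + [90]
outlier_range = max(with_outlier) - min(with_outlier)   # 90 - 11

print(data_range, outlier_range)  # 7 79
```

Only the largest and smallest observations enter the calculation, so the other values could change freely without affecting the result.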
15
Q

The Empirical Rule

A

also known as the 68-95-99.7 rule, states that for a normal distribution, nearly all data will fall within three standard deviations of the mean. Specifically, about 68% of the data fall within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations.
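
The three percentages can be checked against the standard normal distribution. This is a sketch using Python's `statistics.NormalDist`; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# prints 0.6827, 0.9545 and 0.9973 for k = 1, 2 and 3
```

The rule's 68, 95 and 99.7 are rounded versions of these shares, and they hold only for normally distributed data.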

16
Q

Deviations

A

A deviation is the distance of a score from the mean, (x − µ). The deviations must always sum to zero: ∑(x − µ) = 0
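
A quick Python check of this property on made-up scores (the data are an assumption for illustration):

```python
# Hypothetical population of scores (illustrative data only)
scores = [2, 4, 6, 8, 10]
mu = sum(scores) / len(scores)           # 6.0

# Deviation of each score from the mean
deviations = [x - mu for x in scores]    # [-4.0, -2.0, 0.0, 2.0, 4.0]

# Negative and positive deviations cancel out
total = sum(deviations)
print(total)  # 0.0
```

With less tidy data, floating-point rounding can leave a tiny nonzero remainder, so comparing against zero with a small tolerance is safer. This cancellation is why variance squares the deviations before averaging them.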

17
Q

Variance (avg squared deviation)

A

Square each deviation, (x − µ)², then average the squared deviations:
Population: σ² = ∑(x − µ)² / N
Sample: s² = ∑(x − x̄)² / (n − 1)
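
A minimal Python sketch of both variance calculations on invented scores, checked against the standard library's `pvariance` (population) and `variance` (sample) functions:

```python
from statistics import pvariance, variance

# Hypothetical scores (illustrative data only)
scores = [2, 4, 6, 8, 10]
n = len(scores)
center = sum(scores) / n                              # mean = 6.0

# Sum of squared deviations: 16 + 4 + 0 + 4 + 16 = 40
sum_squares = sum((x - center) ** 2 for x in scores)

population_variance = sum_squares / n        # divide by N
sample_variance = sum_squares / (n - 1)      # divide by n - 1

print(population_variance, sample_variance)  # 8.0 10.0
```

Dividing by n − 1 instead of N makes the sample figure slightly larger, which corrects for a sample's tendency to understate the spread of its population.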

18
Q

The Empirical Rule, also known as the 68-95-99.7 rule, states that

A

for a normal distribution, nearly all data will fall within three standard deviations of the mean. Specifically, about 68% of the data fall within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations.

19
Q

Chebysheff’s Theorem:

A

The proportion of observations in any sample or population that lie within k standard deviations of the mean is at least

1 − 1/k²

where k = the number of standard deviations (k must be greater than 1).

Note: this applies to a distribution of any shape, not just a bell-shaped one.
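
The bound is easy to tabulate. In this Python sketch, `chebysheff_lower_bound` is a hypothetical helper name, not something from the deck.

```python
def chebysheff_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

for k in (1.5, 2, 3):
    print(k, round(chebysheff_lower_bound(k), 4))
# prints 0.5556, 0.75 and 0.8889 for k = 1.5, 2 and 3
```

These are guaranteed minimums for any distribution, which is why they are lower than the Empirical Rule's 95% and 99.7% for k = 2 and k = 3 under a normal distribution.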

20
Q

Coefficient-of-Variation