Module 3 Notes - Numerical Descriptive Measures Flashcards

1
Q

The _______ ________ is the extent to which the values of a numerical variable group around a typical or central value.

A

central tendency

2
Q

The _________ is the amount of dispersion or scattering away from a central value that the values of a numerical variable show.

A

variation

3
Q

The _____ is the pattern of a distribution of values from the lowest to the highest value.

A

shape

4
Q

Arithmetic Mean

A

A= \frac {1}{n} \sum \limits_{i=1}^n a_i
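As a quick check of the formula, here is a minimal Python sketch that adds up the values and divides by n. The data values are hypothetical and chosen only for illustration.

```python
# Hypothetical sample values (not from the notes).
values = [3, 7, 8, 10, 12]

n = len(values)                      # sample size n
arithmetic_mean = sum(values) / n    # (1/n) * sum of a_i for i = 1..n

print(arithmetic_mean)
```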

5
Q

Middle value in the ordered array

A

Median

6
Q

Most frequently observed value

A

Mode

7
Q

The __________ ____ (often just called “mean”) is the most common measure of central tendency.
*For a sample of size n (lowercase n): \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i

A

arithmetic mean

8
Q

*The most common measure of _______ ________.
*____ = sum of values divided by the number of values
*Affected by extreme values (outliers).

A

Mean

9
Q

*In an ordered array, the ______ is the “middle” number (50% above, 50% below).
*Less sensitive than the mean to extreme values.

A

median
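To see why the median is less sensitive than the mean, the sketch below replaces the largest value of a small sample with an extreme one and recomputes both measures. The numbers are made up for illustration.

```python
from statistics import mean, median

# Hypothetical sample, and the same sample with its largest value
# replaced by an extreme value (an outlier).
values = [2, 4, 6, 8, 10]
with_outlier = [2, 4, 6, 8, 1000]

mean_before, mean_after = mean(values), mean(with_outlier)
median_before, median_after = median(values), median(with_outlier)

print("mean:", mean_before, "->", mean_after)        # the mean shifts a lot
print("median:", median_before, "->", median_after)  # the median stays put
```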

10
Q

Locating the Median
*The location of the median when the values are in numerical order (smallest to largest):
*If the number of values is odd, the median is the middle number
*If the number of values is even, the median is the average of the two middle numbers

A

Median position = (n+1)/2 position in the ordered data
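The position rule can be sketched in Python as follows. The function name `median_by_position` is hypothetical, and the sketch assumes a non-empty list of numbers.

```python
def median_by_position(values):
    """Locate the median using the (n + 1) / 2 position rule."""
    ordered = sorted(values)          # smallest to largest
    n = len(ordered)
    position = (n + 1) / 2            # 1-based position in the ordered data

    if position.is_integer():         # odd n: one middle value
        return ordered[int(position) - 1]

    # Even n: the position ends in .5, so average the two middle values.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


print(median_by_position([7, 1, 5]))      # odd number of values
print(median_by_position([1, 3, 5, 7]))   # even number of values
```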

11
Q

*Value that occurs most often
*Not affected by extreme values.

A

Mode
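A small Python sketch of the mode, using `statistics.multimode` from the standard library (it returns every value tied for most frequent). The data are hypothetical.

```python
from statistics import multimode

values = [1, 2, 2, 3, 3, 3, 4]       # hypothetical data
with_outlier = values + [500]        # add one extreme value

modes = multimode(values)
modes_with_outlier = multimode(with_outlier)   # the outlier does not change the mode
two_modes = multimode([1, 1, 2, 2, 3])         # a data set can have more than one mode

print(modes, modes_with_outlier, two_modes)
```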

12
Q

Range, Variance, Standard Deviation, Coefficient of Variation
-Measures of _________ give information on the spread or variability or dispersion of the data values

A

Measures of Variation

13
Q

*Simplest measure of variation.
*Difference between the largest and smallest value

A

Range.

14
Q

*Does not account for how the data are distributed.
*Sensitive to outliers

“Why the _____ can be misleading”

A

range
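As an illustration of the first point, the two hypothetical data sets below have the same range even though one is evenly spread and the other is clustered around its center. The helper name `value_range` is an assumption for this sketch.

```python
from statistics import stdev

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

spread_out = [1, 3, 5, 7, 9, 11, 13]   # evenly spread
clustered = [1, 7, 7, 7, 7, 7, 13]     # mostly at the center

# Same range, but the standard deviation reveals the difference.
print(value_range(spread_out), value_range(clustered))
print(stdev(spread_out), stdev(clustered))
```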

15
Q

*Approximately the average of the squared deviations of values from the mean.

A

Sample Variance

16
Q

*Most commonly used measure of variation.
*Shows variation about the mean.
*Is the square root of the variance.
*Has the same units as the original data.

A

Sample standard deviation

17
Q

Steps for computing the sample _________ _________
1. Compute the difference between each value and the mean.
2. Square each difference.
3. Add the squared differences.
4. Divide this total by n-1 to get the sample variance.
5. Take the square root of the sample variance to get the sample ________ _________.

A

standard deviation.
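The five steps can be sketched in Python as shown below. The function name and the data are hypothetical; the result is compared against `statistics.stdev`, which also divides by n-1.

```python
import math
from statistics import stdev

def sample_standard_deviation(values):
    n = len(values)
    mean = sum(values) / n
    differences = [x - mean for x in values]   # step 1: value minus mean
    squared = [d ** 2 for d in differences]    # step 2: square each difference
    total = sum(squared)                       # step 3: add the squared differences
    sample_variance = total / (n - 1)          # step 4: divide by n - 1
    return math.sqrt(sample_variance)          # step 5: square root

values = [10, 12, 14, 16, 18]                  # hypothetical sample
s = sample_standard_deviation(values)
print(s, stdev(values))                        # the two results should agree
```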

18
Q

*Measures relative variation.
*Always in percentage (%)
*Shows variation relative to mean.
*Can be used to compare the variability of two or more sets of data measured in different units.

A

The coefficient of variation: CV = (Standard Deviation / Mean) * 100%
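A minimal sketch of the coefficient of variation, comparing two hypothetical data sets measured in different units. The variable names and values are assumptions for illustration only.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation relative to the mean, as a percentage."""
    return stdev(values) / mean(values) * 100

heights_cm = [150, 160, 170, 180, 190]   # hypothetical heights
weights_kg = [50, 60, 70, 80, 90]        # hypothetical weights

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

# Both sets have the same standard deviation, but relative to their
# means the weights vary more than the heights.
print(round(cv_heights, 1), round(cv_weights, 1))
```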

19
Q

Locating Extreme Outliers: _-_____
Z = (X - x̄)/S
Where X represents the data value
x̄ is the sample mean
S is the sample standard deviation

A

Z-score

20
Q

*Suppose the mean math SAT score is 490, with a standard deviation of 100.
*Compute the Z-score for a test score of 620.
Z = (X - x̄)/S = (620 - 490)/100 = 130/100 = 1.3
-A score of 620 is 1.3 standard deviations above the mean and would not be considered an outlier.
*A data value is considered an extreme outlier if its Z-score is less than -3.0 or greater than +3.0

A

Z-score
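The SAT example can be reproduced with a short Python sketch. The function names are hypothetical; the ±3.0 cutoff is the rule stated in the card.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

def is_extreme_outlier(z, threshold=3.0):
    """Extreme outlier if Z is below -threshold or above +threshold."""
    return z < -threshold or z > threshold

z = z_score(620, mean=490, std_dev=100)
print(z, is_extreme_outlier(z))
```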

21
Q

The more data are spread out, the greater the _____, ________, and ________ __________.

A

range, variance, standard deviation

22
Q

The more data are concentrated, the smaller the _____, ________, and ________ _________.

A

range, variance, and standard deviation

23
Q

If the values are all the same (no variation), all these measures will be zero.

A

range, variance, and standard deviation

24
Q

None of these measures is ever negative.

A

range, variance, and standard deviation

25
The larger the absolute value of the _-_____, the farther the data value is from the mean.
Z-score
26
*Measures the extent to which data values are not symmetrical.
Skewness
27
*Measures the peakedness of the curve of the distribution, that is, how sharply the curve rises approaching the center of the distribution.
Kurtosis
28
Measures the extent to which data is not symmetrical.
Skewness
29
Mean < Median
Left-Skewed
30
Mean = Median
Symmetric
31
Median < Mean
Right-Skewed
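The mean-versus-median comparison in the three cards above can be sketched as a small classifier. This is only the rule of thumb from the cards, not a formal skewness statistic; the function name is hypothetical, and the exact equality check would need a tolerance for real, non-integer data.

```python
from statistics import mean, median

def skew_direction(values):
    """Classify shape by comparing the mean with the median."""
    m, med = mean(values), median(values)
    if m < med:
        return "left-skewed"
    if m > med:
        return "right-skewed"
    return "symmetric"

print(skew_direction([1, 17, 18, 19, 20]))   # low outlier pulls the mean down
print(skew_direction([1, 2, 3, 4, 5]))       # mean equals median
print(skew_direction([1, 2, 3, 4, 20]))      # high outlier pulls the mean up
```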
32
Sharper peak than bell-shaped (Kurtosis > 0)
Leptokurtic
33
Bell-shaped (Kurtosis = 0)
Mesokurtic
34
Flatter than bell-shaped (Kurtosis < 0)
Platykurtic
35
*You can visualize the distribution of the values for a numerical variable by: *Computing the _________. *Computing the five-number _______. *Constructing a _______.
quartiles, summary, boxplot
36
_________ split the data into 4 segments with an equal number of values per segment.
Quartiles
37
*The first ________, Q1, is the value for which 25% of the values are smaller and 75% are larger.
quartile
38
*Q_ is the same as the median (50% of the values are smaller and 50% are larger).
Q2
39
*Only 25% of the values are greater than the third quartile.
40
Find a ________ by determining the value in the appropriate position in the ranked data.
quartile
41
_____ quartile position: Q1 = (n+1)/4 where n is the number of observed values
First
42
______ quartile position: Q2 = (n+1)/2 where n is the number of observed values
Second
43
_____ quartile position: Q3 = 3(n+1)/4 where n is the number of observed values
Third
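The three position formulas can be checked with a short Python sketch. The data are hypothetical and have n = 11, so every position is a whole number; the cards do not say how to handle fractional positions, so that case is left out. For comparison, `statistics.quantiles` with its default "exclusive" method is also based on (n+1) positions and interpolates linearly when a position is fractional.

```python
from statistics import quantiles

# Hypothetical ranked data with n = 11 observed values.
ranked = sorted([11, 12, 13, 16, 16, 17, 18, 21, 22, 23, 25])
n = len(ranked)

q1_position = (n + 1) / 4        # first quartile position
q2_position = (n + 1) / 2        # second quartile (median) position
q3_position = 3 * (n + 1) / 4    # third quartile position

# Positions are 1-based, Python indexes are 0-based.
q1 = ranked[int(q1_position) - 1]
q2 = ranked[int(q2_position) - 1]
q3 = ranked[int(q3_position) - 1]

print(q1_position, q2_position, q3_position)
print(q1, q2, q3)
print(quantiles(ranked, n=4))    # standard-library result for comparison
```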
44
The ___ is Q3-Q1 and measures the spread in the middle 50% of the data.
IQR
45
The ___ is also called the midspread because it covers the middle 50% of the data.
IQR
46
-measure of variability that is not influenced by outliers or extreme values.
IQR
47
Measures like Q1, Q3, and IQR that are not influenced by outliers are called _________ _______.
resistant measures
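To illustrate what "resistant" means, the sketch below replaces the largest value of a hypothetical data set with an extreme one. The range changes drastically while the IQR does not. The helper names are assumptions, and the quartiles come from `statistics.quantiles` (default "exclusive" method, which uses (n+1)-based positions).

```python
from statistics import quantiles

def value_range(values):
    return max(values) - min(values)

def iqr(values):
    q1, _, q3 = quantiles(values, n=4)   # Q1, Q2, Q3
    return q3 - q1

original = [11, 12, 13, 16, 16, 17, 18, 21, 22, 23, 25]
with_outlier = [11, 12, 13, 16, 16, 17, 18, 21, 22, 23, 250]

print(value_range(original), value_range(with_outlier))  # range is affected
print(iqr(original), iqr(with_outlier))                  # IQR is unchanged
```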
48
*Range is the difference between the largest and smallest values *IQR is
Q3-Q1
49
The five numbers that describe center, spread, and shape of data are: *Xsmallest *First Quartile (Q1) *Median (Q2) *Third Quartile (Q3) *Xlargest
Five-number summary
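A minimal sketch of the five-number summary in Python. The function name and data are hypothetical; quartiles are taken from `statistics.quantiles`, whose default method may differ slightly from hand-calculation rules when a quartile position is fractional.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (Xsmallest, Q1, median, Q3, Xlargest)."""
    q1, q2, q3 = quantiles(values, n=4)
    return (min(values), q1, q2, q3, max(values))

data = [11, 12, 13, 16, 16, 17, 18, 21, 22, 23, 25]   # hypothetical data
summary = five_number_summary(data)
print(summary)
```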
50
The _______: A graphical display of the data based on the five-number summary
Boxplot, Xsmallest -- Q1 -- Median -- Q3 -- Xlargest
51
*If the data are symmetric around the median, then the box and central line are centered between the endpoints. *A _______ can be shown in either a vertical or horizontal orientation.
Boxplot
52
*The __________ mean is the sum of the values in the population (not the sample) divided by the population size, N (not the sample size)
Population mean
53
μ
population mean
54
Population mean equation: N
Population size (Capital N)
55
Population mean equation: Xi
ith value of the variable X
56
Average of squared deviations of values from the population mean.
Population variance.
57
*Most commonly used measure of variation. *Shows variation about the mean. *Is the square root of the population variance. *Has the same units as the original data.
The population standard deviation, σ
58
Mean: μ Variance: σ^2 Standard Deviation: σ
Population Parameter Measure
59
Mean: X̄ Variance: S^2 Standard Deviation: S
Sample Statistic Measure
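The difference between population parameters and sample statistics shows up in the divisor: N for the population variance, n-1 for the sample variance. A short sketch with the standard library, using hypothetical data treated first as a whole population and then as a sample:

```python
from statistics import pvariance, pstdev, variance, stdev

values = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical data, mean = 5

# Treating the data as an entire population (divide by N).
population_variance = pvariance(values)   # sigma squared
population_sd = pstdev(values)            # sigma

# Treating the same data as a sample (divide by n - 1).
sample_variance = variance(values)        # S squared
sample_sd = stdev(values)                 # S

print(population_variance, population_sd)
print(sample_variance, sample_sd)
```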
60
*The _________ ____ approximates the variation of data in a symmetric mound-shaped distribution. *Approximately __% of the data in a symmetric mound-shaped distribution is within 1 standard deviation of the mean, or μ ± 1σ
The Empirical Rule, 68
61
Approximately __% of the data in a symmetric mound-shaped distribution lies within two standard deviations of the mean, or μ ± 2σ
95
62
Approximately __% of the data in a symmetric mound-shaped distribution lies within three standard deviations of the mean, or μ ± 3σ
99.7
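The Empirical Rule percentages can be checked against a normal distribution, which is used here as an assumed model of a symmetric mound-shaped distribution. The sketch relies on `statistics.NormalDist` from the standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Assumption: a standard normal curve stands in for "symmetric mound-shaped".
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of values within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, share_within(k))   # roughly 0.68, 0.95 and 0.997
```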
63
______ plots allow you to visually examine the relationship between two numerical variables. Two quantitative measures of such relationships are: *The Covariance *The Coefficient of Correlation.
Scatter plots
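The cards name the two measures without giving formulas, so the sketch below uses the standard textbook definitions as an assumption: the sample covariance is the sum of (X - x̄)(Y - ȳ) divided by n-1, and the coefficient of correlation is the covariance divided by the product of the two sample standard deviations. The function names and data are hypothetical.

```python
import math

def sample_covariance(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    s_x = math.sqrt(sample_covariance(xs, xs))   # sample std. dev. of X
    s_y = math.sqrt(sample_covariance(ys, ys))   # sample std. dev. of Y
    return sample_covariance(xs, ys) / (s_x * s_y)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]            # perfectly linear in xs

print(sample_covariance(xs, ys))
print(correlation(xs, ys))       # close to +1 for a perfect positive line
```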
64
*The ___________ measures the strength of the linear relationship between two numerical variables (X&Y)