Week 3 Chapter 4 Flashcards
(58 cards)
sample variance
s^2 = SS/(n-1)
population variance
σ^2 = SS/N
population standard deviation
σ = √(σ^2) = √((SS)/N)
sample standard deviation
s = √(s^2) = √((SS)/(n-1))
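The four formulas above can be checked with a short Python sketch. It assumes SS means the sum of squared deviations from the mean, which these cards do not define; the function names and the example scores are illustrative only.

```python
import math

def sum_of_squares(scores):
    """SS: sum of squared deviations from the mean (assumed definition)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def population_variance(scores):
    # sigma^2 = SS / N
    return sum_of_squares(scores) / len(scores)

def sample_variance(scores):
    # s^2 = SS / (n - 1)
    return sum_of_squares(scores) / (len(scores) - 1)

def population_sd(scores):
    # sigma = sqrt(SS / N)
    return math.sqrt(population_variance(scores))

def sample_sd(scores):
    # s = sqrt(SS / (n - 1))
    return math.sqrt(sample_variance(scores))

scores = [1, 9, 5, 8, 7]            # hypothetical scores, mean = 6
print(sum_of_squares(scores))       # 40.0
print(population_variance(scores))  # 8.0
print(sample_variance(scores))      # 10.0
```

Dividing by n - 1 instead of N is the only difference between the sample and population versions, so the sample values are always slightly larger for the same scores.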
central tendency
statistical measure to determine a single score that defines the center of a distribution
median
midpoint in a list of scores listed in order from smallest to largest
mode
score or category that has the greatest frequency in a frequency distribution
bimodal
a distribution with two scores with greatest frequency
multimodal
a distribution with more than two scores with greatest frequency
major mode
the taller peak when the two scores with greatest frequency have unequal frequencies
minor mode
the shorter peak when the two scores with greatest frequency have unequal frequencies
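As a rough illustration of the median and mode cards, the Python sketch below finds the median and every score tied for the greatest frequency. The helper name `modes` and the scores are hypothetical. It only detects exact ties, so it would not report a minor mode whose peak is shorter than the major mode.

```python
from collections import Counter
from statistics import median

def modes(scores):
    """Return every score tied for the greatest frequency, in sorted order."""
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(score for score, freq in counts.items() if freq == top)

scores = [1, 2, 2, 3, 3, 4]   # hypothetical scores
print(median(scores))         # 2.5, the midpoint of the ordered list
print(modes(scores))          # [2, 3], two modes, so the distribution is bimodal
```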
line graph
diagram used when values on horizontal axis are measured on an interval or ratio scale
variability
provides a quantitative measure of the differences between scores in a distribution and describes the degree to which the scores are spread out or clustered together. It also helps us determine which outcomes are likely and which are very unlikely to be obtained. This aspect of variability will play an important role in inferential statistics. Variability can also be viewed as measuring predictability, consistency, or even diversity.
if the scores in a distribution are all the same, then there is
no variability. If there are small differences between scores, then the variability is small, and if there are large differences between scores, then the variability is large.
predictability, consistency, and diversity are all concerned with
the differences between scores or between individuals, which is exactly what is measured by variability.
a good measure of variability serves two purposes
- Variability describes the distribution of scores. Specifically, it tells whether the scores are clustered close together or are spread out over a large distance. Usually, variability is defined in terms of distance. It tells how much distance to expect between one score and another, or how much distance to expect between an individual score and the mean. For example, we know that the heights for most adult males are clustered close together, within 5 or 6 inches of the average. Although more extreme heights exist, they are relatively rare.
- Variability measures how well an individual score (or group of scores) represents the entire distribution. This aspect of variability is very important for inferential statistics, in which relatively small samples are used to answer questions about populations. For example, suppose that you selected a sample of one adult male to represent the entire population. Because most men have heights that are within a few inches of the population average (the distances are small), there is a very good chance that you would select someone whose height is within 6 inches of the population mean. For men’s weights, on the other hand, there are relatively large differences from one individual to another. For example, it would not be unusual to select an individual whose weight differs from the population average by more than 30 pounds. Thus, variability provides information about how much error to expect if you are using a sample to represent a population.
three different measures of variability:
the range, standard deviation, and the variance.
range
the distance covered by the scores in a distribution, from the smallest score to the largest score.
One commonly used definition of the range simply measures the difference between the largest score (Xmax)
and the smallest score (Xmin). Range = Xmax - Xmin. By this definition, scores having values from 1 to 5 cover a range of 4 points.
the complete set of proportions is bounded by 0 at one end and
by 1 at the other. The proportions cover a range of 1 point. This definition works well for variables with precisely defined upper and lower boundaries. For example, if you are measuring proportions of an object, like pieces of a pizza, you can obtain values such as (1/8), (1/4), (1/2), (3/4). Expressed as decimal values, the proportions range from 0 to 1. You can never have a value less than 0 (none of the pizza) and you can never have a value greater than 1 (all of the pizza).
An alternative definition of the range is often used when the scores are measurements of a continuous variable. In this case, the range can be defined as the
difference between the upper real limit (URL) for the largest score (Xmax) and the lower real limit (LRL) for the smallest score (Xmin). Range = URL for Xmax - LRL for Xmin. According to this definition, scores having values from 1 to 5 cover a range of 5.5 - 0.5 = 5 points.
When the scores are whole numbers, the range can also be defined as the
number of measurement categories. If every individual is classified as either 1, 2, or 3 then there are three measurement categories and the range is 3 points.
Defining the range as the number of measurement categories also works for discrete variables that are measured with
numerical scores. For example, if you are measuring the number of children in a family and the data produce values from 0 to 4, then there are five measurement categories (0, 1, 2, 3, and 4) and the range is 5 points. By this definition, when the scores are all whole numbers, the range can be obtained by: Range = Xmax - Xmin + 1.
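The three definitions of the range can be compared side by side in a short Python sketch. The function names are illustrative, and the `unit` parameter is an assumption: it stands for the width of one measurement unit, so the real limits lie half a unit beyond the extreme scores.

```python
def range_simple(scores):
    # Range = Xmax - Xmin
    return max(scores) - min(scores)

def range_real_limits(scores, unit=1):
    # Range = URL for Xmax - LRL for Xmin, with real limits half a unit out
    return (max(scores) + unit / 2) - (min(scores) - unit / 2)

def range_categories(scores):
    # Whole-number scores only: Range = Xmax - Xmin + 1
    return max(scores) - min(scores) + 1

scores = [1, 2, 3, 4, 5]
print(range_simple(scores))       # 4
print(range_real_limits(scores))  # 5.0
print(range_categories(scores))   # 5

children = [0, 1, 2, 3, 4]        # number of children in a family
print(range_categories(children)) # 5
```

For whole-number scores the real-limits definition and the category-count definition give the same value, while the simple difference is one point smaller.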
The problem with using the range as a measure of variability is that
it is completely determined by the two extreme values and ignores the other scores in the distribution. Thus, a distribution with one unusually large (or small) score will have a large range even if the other scores are all clustered close together. Because the range does not consider all the scores in the distribution, it often does not give an accurate description of the variability for the entire distribution. For this reason, the range is considered to be a crude and unreliable measure of variability. Therefore, in most situations, it does not matter which definition you use to determine the range.
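A small Python sketch, using made-up scores, shows how a single extreme score controls the range even when every other score is clustered close together.

```python
def score_range(scores):
    # Simple definition: Range = Xmax - Xmin
    return max(scores) - min(scores)

clustered = [4, 5, 5, 6, 6, 7]     # hypothetical scores, all close together
with_outlier = clustered + [40]    # same scores plus one unusually large score

print(score_range(clustered))      # 3
print(score_range(with_outlier))   # 36
```

Six of the seven scores are identical in both lists, yet the range grows from 3 to 36 because it depends only on the two extreme values.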