Review of 2300 Flashcards
(23 cards)
Measures of Variability
It is often desirable to consider measures of variability (dispersion), as well as measures of location.
For example, in choosing supplier A or supplier B we might consider not only the average delivery time for each, but also the variability in delivery time for each.
Measures of Variability
Range
Variance
Standard Deviation
Coefficient of variation
Range
The range of a data set is the difference between the largest and smallest data values.
It is the simplest measure of variability.
It is very sensitive to the smallest and largest data values.
Range = largest value - smallest value
Variance
The variance is a measure of variability that utilizes all the data.
It is based on the difference between the value of each observation (xi) and the mean (x bar for a sample, mu for a population).
The variance is the average of the squared differences between each data value and the mean.
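Written out, the two versions of the variance look like the formulas below. The notation (N for population size, n for sample size, s^2 and sigma^2 for the two variances) is the conventional one and is assumed here, not taken from the cards. Note that the sample version divides by n - 1 instead of n, so it is only approximately an "average" of the squared differences.

```latex
% Population variance (all N values, population mean \mu)
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}

% Sample variance (n values, sample mean \bar{x})
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
```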
Standard Deviation
The standard deviation of a data set is the positive square root of the variance.
It is measured in the same units as the data, making it more easily interpreted than the variance.
In statistics and probability theory, standard deviation (represented by the symbol sigma, σ) shows how much variation or dispersion exists from the average (mean), or expected value. A low standard deviation indicates that the data points tend to be very close to the mean; high standard deviation indicates that the data points are spread out over a large range of values.
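As a quick illustration of the range, variance, and standard deviation, here is a minimal Python sketch using the standard library. The delivery times are made-up numbers for one hypothetical supplier, not data from the course.

```python
import statistics

# Hypothetical delivery times (in days) for one supplier.
delivery_days = [10, 9, 11, 10, 13, 7]

# Range = largest value - smallest value
data_range = max(delivery_days) - min(delivery_days)

# Sample versions divide by n - 1; population versions divide by n.
sample_variance = statistics.variance(delivery_days)
population_variance = statistics.pvariance(delivery_days)

# The standard deviation is the positive square root of the variance,
# so it is in the same units as the data (days).
sample_std = statistics.stdev(delivery_days)

print("range:", data_range)
print("sample variance:", sample_variance)
print("population variance:", population_variance)
print("sample standard deviation:", sample_std)
```

Here the mean is 10 days and the squared deviations sum to 20, so the sample variance is 20 / 5 = 4 and the sample standard deviation is 2 days.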
Relative Frequency
Relative Frequency Distribution
The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.
A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.
Percent Frequency
Percent Frequency Distribution
The percent frequency of a class is the relative frequency multiplied by 100.
A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.
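The relative and percent frequency definitions can be sketched in a few lines of Python. The purchase data below are invented for illustration, with each distinct label treated as a class.

```python
from collections import Counter

# Hypothetical categorical data: ten soft drink purchases.
purchases = ["Cola", "Diet", "Cola", "Lemon", "Cola",
             "Diet", "Cola", "Lemon", "Diet", "Cola"]

frequency = Counter(purchases)   # number of items in each class
n = len(purchases)               # total number of data items

# Relative frequency = class frequency / total number of items
relative_frequency = {label: count / n for label, count in frequency.items()}

# Percent frequency = relative frequency * 100
percent_frequency = {label: 100 * rel for label, rel in relative_frequency.items()}

for label in frequency:
    print(label, frequency[label], relative_frequency[label], percent_frequency[label])
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a handy check on a hand-built table.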
Frequency Distribution
Guidelines for Selecting Width of Classes
Approximate Class Width = (Largest data value - Smallest data value) / Number of classes
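A small sketch of the class width guideline, assuming a made-up data set and a choice of five classes. Rounding the result up to a convenient whole number is a common practice assumed here, not something stated on the card.

```python
import math

# Hypothetical data: number of days needed to complete 20 audits.
audit_days = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
              22, 23, 22, 21, 33, 28, 14, 18, 16, 13]
number_of_classes = 5

# Subtract first, then divide by the number of classes.
approx_width = (max(audit_days) - min(audit_days)) / number_of_classes

# Round up to a convenient class width.
class_width = math.ceil(approx_width)

print("approximate width:", approx_width)
print("class width used:", class_width)
```

With a largest value of 33 and a smallest value of 12, the approximate width is 21 / 5 = 4.2, which rounds up to a width of 5.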
Cumulative Distributions
Cumulative frequency distribution: shows the number of items with values less than or equal to the upper limit of each class.
Cumulative relative frequency distribution: shows the proportion of items with values less than or equal to the upper limit of each class.
Cumulative percent frequency distribution: shows the percentage of items with values less than or equal to the upper limit of each class.
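The three cumulative distributions are running totals of the ordinary frequency distribution. A minimal sketch, assuming hypothetical class upper limits and class frequencies:

```python
from itertools import accumulate

# Hypothetical classes 10-14, 15-19, 20-24, 25-29, 30-34 and their frequencies.
upper_limits = [14, 19, 24, 29, 34]
frequencies = [4, 8, 5, 2, 1]
n = sum(frequencies)

# Running total: number of items <= the upper limit of each class.
cumulative = list(accumulate(frequencies))

# Divide by n for proportions, then multiply by 100 for percentages.
cumulative_relative = [c / n for c in cumulative]
cumulative_percent = [100 * r for r in cumulative_relative]

for limit, c, r, p in zip(upper_limits, cumulative, cumulative_relative, cumulative_percent):
    print(f"<= {limit}: {c} items, proportion {r:.2f}, {p:.0f}%")
```

The last entry of each cumulative column must equal n, 1, and 100 respectively.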
Measures of Location
If the measures are computed for data from a sample, they are called sample statistics.
If the measures are computed for data from a population, they are called population parameters.
A sample statistic is referred to as the point estimator of the corresponding population parameter.
Mean
The mean of a data set is the average of all the data values.
The sample mean (xbar) is the point estimator of the population mean (mu).
Forms of measures of variability
Range
Variance
Standard Deviation
Coefficient of variation
Interquartile range
Coefficient of variation
The coefficient of variation indicates how large the standard deviation is in relation to the mean.
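The coefficient of variation is usually reported as a percentage: (standard deviation / mean) x 100. A minimal sketch with made-up delivery times, using the sample standard deviation:

```python
import statistics

# Hypothetical delivery times (in days).
delivery_days = [10, 9, 11, 10, 13, 7]

mean = statistics.mean(delivery_days)
std = statistics.stdev(delivery_days)   # sample standard deviation

# Standard deviation expressed as a percentage of the mean.
cv_percent = std / mean * 100

print(f"coefficient of variation: {cv_percent:.1f}%")
```

Here the standard deviation (2 days) is 20% of the mean (10 days). Because it has no units, the coefficient of variation can be used to compare variability between data sets measured on different scales.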
Distribution Shape: Skewness
Symmetric: skewness = 0; the mean and the median are equal (a perfect bell curve is one example)
Moderately skewed to the left
mean will be less than the median
Moderately skewed to the right
mean will be more than the median
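The relationship between the mean and the median under skewness can be seen with two small invented data sets, one with a long right tail and one with a long left tail:

```python
import statistics

# Hypothetical data with one large value pulling the right tail out.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]

# Hypothetical data with one small value pulling the left tail out.
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{name}: mean = {mean}, median = {median}")
```

The extreme value drags the mean toward the tail while the median barely moves, so the mean ends up above the median in the right-skewed set and below it in the left-skewed set.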
Z-Score
The z-score is often called the standardized value.
It denotes the number of standard deviations a data value xi is from the mean.
An observation’s z-score is a measure of the relative location of the observation in a data set.
A data value less than the sample mean will have a z-score less than zero.
A data value greater than the sample mean will have a z-score greater than zero.
A data value equal to the sample mean will have a z-score of zero.
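For a sample, the z-score of a value xi is (xi - x bar) / s, where s is the sample standard deviation. A minimal sketch with made-up data:

```python
import statistics

# Hypothetical sample data.
data = [10, 9, 11, 10, 13, 7]

mean = statistics.mean(data)
std = statistics.stdev(data)   # sample standard deviation

# z-score: how many standard deviations each value lies from the mean.
z_scores = [(x - mean) / std for x in data]

for x, z in zip(data, z_scores):
    print(f"value {x}: z = {z:+.2f}")
```

Values below the mean of 10 get negative z-scores, values above it get positive ones, and the two values equal to the mean get a z-score of zero.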
Empirical Rule
For data having a bell-shaped distribution:
- About 68.26% of the values of a normal random variable are within +/- 1 standard deviation of the mean
- About 95.44% of the values of a normal random variable are within +/- 2 standard deviations of the mean
- About 99.73% of the values of a normal random variable are within +/- 3 standard deviations of the mean
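The empirical rule percentages can be checked against the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/- {k} standard deviations: {100 * share_within(k):.2f}%")
```

The results are roughly 68%, 95%, and 99.7%, which is why the rule is often quoted with those rounded figures.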
Covariance
The covariance is a measure of the linear association between two variables.
Positive values indicate a positive relationship
Negative values indicate a negative relationship
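The sample covariance is the sum of the products of the paired deviations from each mean, divided by n - 1. A minimal sketch, assuming two small made-up variables x and y:

```python
import statistics

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar = statistics.mean(x)
y_bar = statistics.mean(y)
n = len(x)

# Sample covariance: sum of (xi - x_bar) * (yi - y_bar), divided by n - 1.
covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

print("sample covariance:", covariance)
```

The result is positive (1.5), which indicates that y tends to increase as x increases. The size of the covariance depends on the units of x and y, so it is hard to judge strength from it alone.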
Correlation Coefficient
Correlation is a measure of linear association and not necessarily causation
Just because two variables are highly correlated, it does not mean that one variable is the cause of the other
The coefficient can take on values between -1 and +1
Values near -1 indicate a strong negative linear relationship
Values near +1 indicate a strong positive linear relationship
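The sample correlation coefficient rescales the covariance by the two standard deviations: r = s_xy / (s_x * s_y). A minimal sketch with made-up paired data:

```python
import statistics

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar = statistics.mean(x)
y_bar = statistics.mean(y)
n = len(x)

# Sample covariance of x and y.
covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Divide by both sample standard deviations to remove the units.
r = covariance / (statistics.stdev(x) * statistics.stdev(y))

print(f"correlation coefficient: {r:.4f}")
```

Here r is about 0.77, a fairly strong positive linear relationship. Unlike the covariance, r does not change if x or y is converted to different units.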
Steps of Hypothesis Testing:
Step 1. Develop the null and alternative hypotheses
Step 2. Specify the level of significance
Step 3. Collect the sample data and compute the value of the test statistic
p-value approach:
Step 4. Use the value of the test statistic to compute the p-value
Step 5. Reject the null hypothesis if the p-value is less than or equal to alpha (the level of significance)
Critical Value Approach:
Step 4. Use the level of significance to determine the critical value and the rejection rule
Step 5. Use the value of the test statistic and the rejection rule to determine whether to reject the null hypothesis.
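The steps above can be sketched for one specific case: a two-tailed z-test of a population mean with a known population standard deviation. All of the numbers (hypothesized mean, standard deviation, sample size, sample mean, and level of significance) are invented for illustration, and other tests use different test statistics.

```python
from math import sqrt
from statistics import NormalDist

# Step 1. Hypotheses: H0: mu = 100, Ha: mu != 100 (hypothetical values).
mu_0 = 100

# Step 2. Level of significance.
alpha = 0.05

# Step 3. Sample data and test statistic (population sigma assumed known).
sigma = 15
n = 36
sample_mean = 106
z = (sample_mean - mu_0) / (sigma / sqrt(n))

standard_normal = NormalDist()

# p-value approach, steps 4 and 5: reject H0 if p-value <= alpha.
p_value = 2 * (1 - standard_normal.cdf(abs(z)))
reject_by_p_value = p_value <= alpha

# Critical value approach, steps 4 and 5: reject H0 if |z| >= critical value.
critical_value = standard_normal.inv_cdf(1 - alpha / 2)
reject_by_critical_value = abs(z) >= critical_value

print(f"z = {z:.2f}, p-value = {p_value:.4f}, critical value = {critical_value:.2f}")
print("reject H0 (p-value approach):", reject_by_p_value)
print("reject H0 (critical value approach):", reject_by_critical_value)
```

Both approaches always reach the same conclusion, because they compare the same test statistic against the same level of significance in two different ways.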