Chapter 2: Organizing and Summarizing Data Flashcards
(33 cards)
What is a frequency distribution?
A frequency distribution lists each category of data and the number of occurrences for each category.
What does a frequency distribution table look like?
What is relative frequency?
Relative frequency is the proportion (or percent) of observations within a category. It is found using the formula:
relative frequency = frequency / sum of all frequencies
How is a relative frequency table organized?
A relative frequency distribution table lists each category of data with the relative frequency.
What is a bar graph?
A bar graph is constructed by labeling each category of data on either the horizontal or vertical axis and the frequency or relative frequency of the category on the other axis.
Rectangles of equal width are drawn for each category. The height of each rectangle represents the category’s frequency or relative frequency.
What is a Pareto Chart?
A Pareto chart is a bar graph whose bars are drawn in decreasing order of frequency or relative frequency.
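The ordering step behind a Pareto chart can be sketched in Python. This is only an illustration: the category names and counts in `counts` are made up and do not come from the chapter.

```python
# Hypothetical category frequencies, for illustration only.
counts = {
    "Late delivery": 12,
    "Damaged item": 30,
    "Wrong item": 7,
    "Billing error": 18,
}

# A Pareto chart draws its bars in decreasing order of frequency.
pareto_order = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Print a rough text version of the chart, one bar per category.
for category, frequency in pareto_order:
    print(f"{category:<15}{'#' * frequency} {frequency}")
```

Sorting by relative frequency instead would give the same order, because every count is divided by the same total.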
What graph should you use to compare data?
A side-by-side bar graph.
Data sets should be compared using relative frequencies, because different sample or population sizes make comparisons using frequencies difficult or misleading.
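A small Python sketch of why relative frequencies are the safer basis for comparison. The two samples and the helper `relative_frequencies` are hypothetical, chosen only so that the sample sizes differ.

```python
# Hypothetical survey results from two samples of different sizes.
sample_a = {"yes": 40, "no": 60}    # 100 respondents
sample_b = {"yes": 60, "no": 340}   # 400 respondents


def relative_frequencies(freqs):
    """Divide each frequency by the sum of all frequencies."""
    total = sum(freqs.values())
    return {category: count / total for category, count in freqs.items()}


rel_a = relative_frequencies(sample_a)
rel_b = relative_frequencies(sample_b)

# Raw frequencies suggest "yes" is more common in sample B (60 > 40),
# but the relative frequencies show the opposite (0.40 versus 0.15).
print(rel_a["yes"], rel_b["yes"])
```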
Can Bar Graphs be drawn horizontally?
Yes, horizontal bars are preferred when category names are lengthy.
What is a Pie Chart?
A pie chart is a circle divided into sectors. Each sector represents a category of data. The area of each sector is proportional to the frequency of the category.
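One way to size the sectors, sketched in Python: a sector's area is proportional to its central angle, so each angle can be taken as the relative frequency times 360 degrees. The frequencies in `freqs` are hypothetical.

```python
# Hypothetical category frequencies, for illustration only.
freqs = {"A": 10, "B": 25, "C": 15}
total = sum(freqs.values())

# Central angle of each sector = relative frequency * 360 degrees,
# which keeps each sector's area proportional to its frequency.
angles = {category: 360 * count / total for category, count in freqs.items()}

for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```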
What is the first step to summarizing quantitative data on a display?
The first step in summarizing quantitative data is to determine whether the data are discrete or continuous.
*Discrete: can be counted.
*Continuous: is measured.
Discrete data with few different values of the variable - the categories of data (classes) will be the observations themselves (as in qualitative data).
Continuous data - the categories of data (classes) must be created using intervals of numbers. This also applies to discrete data with too many different values.
What is a histogram?
A histogram is constructed by drawing rectangles for each class of data.
The height of each rectangle is the frequency or relative frequency of the class.
The width of each rectangle is the same and the rectangles touch each other.
How do you construct a frequency and relative frequency distribution from discrete data?
EXAMPLE: The following data represent the number of available cars in a household based on a random sample of 50 households. Construct a frequency and relative frequency distribution.
3 0 1 2 1 1 1 2 0 2
4 2 2 2 1 2 2 0 2 4
1 1 3 2 4 1 2 1 2 2
3 3 2 1 2 2 0 3 2 2
2 3 2 1 2 2 1 1 3 5
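The tally for this example can be sketched in Python using the 50 values above. The variable names are illustrative, and `collections.Counter` is just one convenient way to count occurrences.

```python
from collections import Counter

# Number of available cars in each of the 50 sampled households.
cars = [
    3, 0, 1, 2, 1, 1, 1, 2, 0, 2,
    4, 2, 2, 2, 1, 2, 2, 0, 2, 4,
    1, 1, 3, 2, 4, 1, 2, 1, 2, 2,
    3, 3, 2, 1, 2, 2, 0, 3, 2, 2,
    2, 3, 2, 1, 2, 2, 1, 1, 3, 5,
]

# Frequency: how many times each value occurs.
frequency = Counter(cars)
n = len(cars)

# Relative frequency: frequency divided by the sum of all frequencies.
relative_frequency = {value: frequency[value] / n for value in sorted(frequency)}

print("Cars  Frequency  Relative frequency")
for value in sorted(frequency):
    print(f"{value:<6}{frequency[value]:<11}{relative_frequency[value]:.2f}")
```

Because the data are discrete with only six different values, each value (0 through 5) serves as its own class.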
What are classes?
Classes are categories into which data are grouped.
*When a data set consists of a large number of different discrete data values or when a data set consists of continuous data, we must create classes by using intervals of numbers.
When reading a data table:
What is a lower class limit?
What is an upper class limit?
What is the class width?
The lower class limit of a class is the smallest value within the class.
The upper class limit of a class is the largest value within the class.
The class width is the difference between consecutive lower class limits.
Example from chart:
Lower class limit of 1st class = 25
Upper class limit of 1st class = 34
Class width = 35 − 25 = 10 (lower class limit of the 2nd class minus lower class limit of the 1st class)
What does an example of organized continuous data in a table look like?
Example of continuous data in a table:
The following data represent the time between eruptions (in seconds) for a random sample of 45 eruptions.
The smallest data value is 672 and the largest data value is 738. We create the classes so that the lower class limit of the first class is 670 and the class width is 10, which gives the following classes:
670 − 679
680 − 689 etc.
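The full list of classes can be generated with a short Python sketch. The helper `build_classes` is hypothetical, and it assumes the times are recorded in whole seconds, so each upper class limit is one less than the next lower class limit.

```python
def build_classes(lower_limit, class_width, largest_value):
    """Return (lower, upper) class limits until largest_value is covered.

    Assumes whole-number data, so upper = next lower limit - 1.
    """
    classes = []
    lower = lower_limit
    while lower <= largest_value:
        classes.append((lower, lower + class_width - 1))
        lower += class_width
    return classes


# Values from the eruption example: first lower limit 670, width 10, largest value 738.
classes = build_classes(lower_limit=670, class_width=10, largest_value=738)
for lower, upper in classes:
    print(f"{lower} - {upper}")
```

With these inputs the last class is 730 - 739, which contains the largest data value, 738.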
What are the guidelines for determining the Lower Class Limit of the First Class and Class Width?
Choosing the Lower Class Limit of the First Class:
Choose the smallest observation in the data set or a convenient number slightly lower than the smallest observation in the data set.
Determining the Class Width
Determine the class width by computing:
class width ≈ (largest data value − smallest data value) / number of classes
*Round this value up to a convenient number.
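A rough Python sketch of the class width guideline, using the smallest and largest eruption times from the earlier example (672 and 738). The choice of 7 classes is an assumption made only for illustration, and `class_width` is a hypothetical helper that rounds up to the next whole number. Picking a "convenient" number beyond that remains a judgment call.

```python
import math


def class_width(largest_value, smallest_value, number_of_classes):
    """Range of the data divided by the number of classes, rounded up."""
    raw_width = (largest_value - smallest_value) / number_of_classes
    return math.ceil(raw_width)


# (738 - 672) / 7 is about 9.43, which rounds up to 10.
print(class_width(738, 672, 7))
```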
EXAMPLE Constructing a Frequency and Relative Frequency Histogram for Continuous Data
What is a stem-and-leaf plot?
A stem-and-leaf plot uses digits to the left of the rightmost digit to form the stem. Each rightmost digit forms a leaf.
For example, a data value of 147 would have 14 as the stem and 7 as the leaf.
For data with one decimal place, let the stem represent the integer portion of the number and the leaf the decimal portion. For example, the stem of Alabama (2.8) will be 2 and the leaf will be 8.
*Within each stem, rearrange the leaves in ascending order, title the plot, and include a legend to indicate what the values represent.*
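The construction can be sketched in Python for non-negative whole numbers. The data values and the helper `stem_and_leaf` are hypothetical. The rightmost digit is the leaf and the remaining digits are the stem.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative integers by stem, with leaves in ascending order."""
    plot = defaultdict(list)
    for value in sorted(values):          # sorting puts the leaves in ascending order
        stem, leaf = divmod(value, 10)    # e.g. 147 -> stem 14, leaf 7
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical data, for illustration only.
data = [147, 152, 149, 163, 151, 147, 160]
plot = stem_and_leaf(data)

print("Legend: 14 | 7 represents 147")
for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
```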
What does a split stem and leaf plot look like?
What is an Advantage of Stem-and-Leaf Diagrams over Histograms?
Once a frequency distribution or histogram of continuous data is created, the raw data are lost (unless reported with the frequency distribution). The raw data can, however, be retrieved from a stem-and-leaf plot.
What is a dot plot?
A dot plot is drawn by placing each observation horizontally in increasing order and placing a dot above the observation each time it is observed.
EXAMPLE Drawing a Dot Plot
The following data represent the number of available cars in a household based on a random sample of 50 households. Draw a dot plot of the data.
3 0 1 2 1 1 1 2 0 2
4 2 2 2 1 2 2 0 2 4
1 1 3 2 4 1 2 1 2 2
3 3 2 1 2 2 0 3 2 2
2 3 2 1 2 2 1 1 3 5
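A text-only dot plot can be sketched in Python using the 50 household values from the frequency distribution example in this set. The helper `dot_plot` is hypothetical and assumes single-digit whole-number observations so that the columns line up.

```python
from collections import Counter

# Number of available cars in each of the 50 sampled households.
cars = [
    3, 0, 1, 2, 1, 1, 1, 2, 0, 2,
    4, 2, 2, 2, 1, 2, 2, 0, 2, 4,
    1, 1, 3, 2, 4, 1, 2, 1, 2, 2,
    3, 3, 2, 1, 2, 2, 0, 3, 2, 2,
    2, 3, 2, 1, 2, 2, 1, 1, 3, 5,
]


def dot_plot(values):
    """Stack one '*' above each observation every time it occurs."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)   # observations in increasing order
    rows = []
    # Build the plot from the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("*" if counts[x] >= level else " " for x in axis)
        rows.append(row.rstrip())
    rows.append(" ".join(str(x) for x in axis))  # horizontal axis labels
    return "\n".join(rows)


print(dot_plot(cars))
```

The tallest stack sits above 2, the most frequently observed value.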
Identify the Shape of a Distribution
Uniform Distribution?
Bell-Shaped?
Skewed right?
Skewed Left?
Uniform distribution - the frequency of each value of the variable is evenly spread out across the values of the variable
Bell-shaped distribution - the highest frequency occurs in the middle and frequencies tail off to the left and right of the middle
Skewed right - the tail to the right of the peak is longer than the tail to the left of the peak
Skewed left - the tail to the left of the peak is longer than the tail to the right of the peak.
What is a Time-Series Plot?
A time-series plot is obtained by plotting the time in which a variable is measured on the horizontal axis and the corresponding value of the variable on the vertical axis.
Line segments are then drawn connecting the points.
*If the value of a variable is measured at different points in time, the data are referred to as time-series data.
What is descriptive statistics?
Descriptive statistics consists of organizing and summarizing data.