chapter 2: graphical descriptions of data Flashcards
frequency tables
-typically the different categories are different rows
-one column might be frequency counts
-sometimes we replace or add a relative frequency column
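A minimal sketch of building such a table in Python, assuming the raw data is a plain list of category labels. The sample data and the name `frequency_table` are made up for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Return one row per category: (category, frequency, relative frequency)."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(category, count, count / total) for category, count in counts.items()]

# Hypothetical sample data: one label per observation
colors = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]
table = frequency_table(colors)

# Each category is a row; the columns are frequency and relative frequency
for category, count, rel in table:
    print(f"{category:<8}{count:>4}{rel:>8.3f}")
```

The relative frequency column is just each count divided by the total number of observations, so it always adds up to 1.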
bar graphs or charts
-put the frequency on the vertical axis and the category on the horizontal
-excel and other tools make this easy
what are key features on a bar graph
- equal spacing on each axis
- bars are the same width
- there should be labels on each axis and a title for the graph
- the bars do not touch
- start the frequency axis at 0 unless there is a good reason not to
- there should be a scaling on the frequency axis and the categories should be listed on the category axis
relative frequency bar graphs
-you can draw a bar graph using relative frequency on the vertical axis
-useful for when you want to compare two samples with different sample sizes
-relative frequency and frequency graphs should look the same except for the scaling on the frequency axis
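A small sketch of why relative frequency helps when sample sizes differ. The two samples below are invented, and `relative_frequencies` is a hypothetical helper name.

```python
from collections import Counter

def relative_frequencies(values):
    """Map each category to its share of the sample (count / sample size)."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

# Hypothetical samples with different sizes but the same mix of answers
small_sample = ["yes"] * 6 + ["no"] * 4      # n = 10
large_sample = ["yes"] * 30 + ["no"] * 20    # n = 50

rel_small = relative_frequencies(small_sample)
rel_large = relative_frequencies(large_sample)

print(Counter(small_sample), Counter(large_sample))  # raw counts differ
print(rel_small, rel_large)                          # proportions match
```

The raw counts (6 vs 30) are hard to compare directly, but on the relative frequency scale both samples show the same proportions.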
pie chart
-these became popular when inexpensive computer graphics became available in spreadsheets
-the mechanics of converting frequencies to angles has been automated
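As a rough illustration of the conversion that spreadsheets automate, each slice's angle is its relative frequency times 360 degrees. The counts and the name `pie_angles` below are assumptions for the example.

```python
def pie_angles(frequencies):
    """Convert category frequencies to pie-slice angles in degrees."""
    total = sum(frequencies.values())
    return {category: 360 * count / total for category, count in frequencies.items()}

# Hypothetical frequency counts
angles = pie_angles({"A": 10, "B": 30, "C": 20})
for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```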
color and graphs
-be wary of using color. many presentations end up being printed in black and white
-avoid color legends that make it very hard to match up topics to their areas
pareto charts
-a graph for qualitative data: a bar chart with the bars sorted so the highest frequencies are on the left
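The sorting step can be sketched as follows, assuming the frequencies are already counted. The defect counts and the name `pareto_order` are hypothetical.

```python
def pareto_order(frequencies):
    """Sort (category, frequency) pairs from highest frequency to lowest."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Hypothetical frequency counts
defects = {"scratch": 5, "dent": 12, "crack": 2, "stain": 8}
ordered = pareto_order(defects)

# Drawing the bars in this order, left to right, gives the Pareto layout
for category, count in ordered:
    print(f"{category:<8}{'#' * count}")
```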
histogram
- start by making a frequency distribution (we divide up the range of data into frequency classes or bins)
-count how many fall into each bin
-draw a bar chart of these counts; one difference is that the bars touch
How to make a histogram
- Find the range = largest value - smallest value
- Pick the number of classes to use (usually between 5 and 20)
- Class width = range / number of classes (always round up to the next integer, even when the result is already a whole number)
- Create the classes
- Find the class boundaries: subtract 0.5 from the lower class limit and add 0.5 to the upper class limit
- If useful, find the class midpoint = (lower limit + upper limit)/2
- Figure out the number of data points that fall in each class
** for measuring continuous numbers we use “half-open intervals” to create bins
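The steps above can be sketched in Python for whole-number data. This is one possible reading of the procedure, not a definitive implementation: the data set and the name `build_classes` are made up, and the width is always bumped to the next integer, even when range / number of classes is already whole.

```python
def build_classes(data, num_classes):
    """Build histogram classes for whole-number data using the 0.5 boundary approach."""
    smallest, largest = min(data), max(data)
    data_range = largest - smallest
    # Round up to the next integer, even if the division comes out whole
    width = data_range // num_classes + 1
    classes = []
    for i in range(num_classes):
        lower = smallest + i * width   # start with the min, then add the class width
        upper = lower + width - 1      # last whole number before the next lower limit
        classes.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "midpoint": (lower + upper) / 2,
            "frequency": sum(lower <= x <= upper for x in data),
        })
    return classes

# Hypothetical whole-number data
data = [12, 15, 21, 22, 27, 30, 34, 35, 41, 48]
classes = build_classes(data, num_classes=4)
for c in classes:
    print(c["limits"], c["boundaries"], c["midpoint"], c["frequency"])
```

Here the range is 36, so 36 / 4 = 9 is bumped to a class width of 10, which guarantees the largest value still falls inside the last class.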
class limits
- start with the min, and add class width
- two approaches
- use the book’s 0.5 approach for whole numbers
-use half-open intervals
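A short sketch of the half-open approach, assuming each bin includes its lower edge and excludes its upper edge, written [lower, upper). The measurements and helper names are hypothetical.

```python
def half_open_bins(start, width, num_bins):
    """Create bins [lower, upper): lower edge included, upper edge excluded."""
    return [(start + i * width, start + (i + 1) * width) for i in range(num_bins)]

def count_in_bins(data, bins):
    """Count how many values fall in each half-open bin."""
    return [sum(lower <= x < upper for x in data) for lower, upper in bins]

# Hypothetical continuous measurements
measurements = [1.0, 1.9, 2.0, 2.5, 3.0, 3.99, 4.0]
bins = half_open_bins(start=1.0, width=1.0, num_bins=4)
counts = count_in_bins(measurements, bins)
print(bins)
print(counts)
```

A value that lands exactly on an edge, such as 2.0, is counted only in the bin that starts there, so no data point is counted twice.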
frequency histogram
-a bar graph that represents the frequency distribution
- the horizontal scale is quantitative and measures the data values
-the vertical scale measures the frequencies of classes
-consecutive bars must touch
outlier
-a data value that is far from the rest of the values
-may be an unusual value or a mistake
-should be investigated
cumulative frequency distribution
-count the number of data points that are below each upper class boundary, starting with the first class and working up to the top
-the last upper class boundary should have all the data points below it
-also include the number of data points below the lowest class boundary (which is 0)
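A minimal sketch of the counting, assuming the class boundaries are already known. The data, the boundaries, and the name `cumulative_frequencies` are invented for the example.

```python
def cumulative_frequencies(data, boundaries):
    """For each class boundary, count the data points that lie below it."""
    return [(boundary, sum(x < boundary for x in data)) for boundary in boundaries]

# Hypothetical whole-number data and its class boundaries
data = [12, 15, 21, 22, 27, 30, 34, 35, 41, 48]
boundaries = [11.5, 21.5, 31.5, 41.5, 51.5]

cumulative = cumulative_frequencies(data, boundaries)
for boundary, count in cumulative:
    print(f"below {boundary}: {count}")
```

The first entry is 0 because nothing lies below the lowest boundary, and the last entry equals the total number of data points.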
stem-and-leaf plots
-resulted from making histograms on a typewriter back in the day
time plot
- time on the x axis (horizontal)
- shows growth, periodicity
penultimate
-on a stem and leaf plot the class with no data
time-series plot
-shows the data measurements in chronological order, the data being quantitative
-time goes on the horizontal axis and the other variables on the vertical axis
-plot the ordered pairs and connect the dots
-purpose is to look for trends over time
dot-plot
-each data entry is plotted using a point above the horizontal axis