The Color and the Shape Flashcards
Higher dimensions
Four other common dimension treatments for scatter plots are:
- Color
- Size
- Transparency
- Shape
Higher dimensions
Four other common dimension treatments for line plots are:
- Color
- Thickness
- Transparency
- Line type (solid, dashes, dots)
Higher dimensions
Another way to separate out another dimension is to use p_____.
Panels
Small plots of the same type
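As a minimal sketch of the idea behind panels: the data is split by one categorical variable, and each group would then be drawn in its own small plot of the same type. The function name `split_into_panels` and the sample records (including the `species` field) are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def split_into_panels(records, key):
    """Group records by a categorical field; each group gets its own small plot."""
    panels = defaultdict(list)
    for record in records:
        panels[record[key]].append(record)
    return dict(panels)


# Hypothetical data: "species" is the extra dimension separated into panels.
records = [
    {"x": 1.0, "y": 2.0, "species": "a"},
    {"x": 2.0, "y": 1.5, "species": "b"},
    {"x": 3.0, "y": 3.5, "species": "a"},
]
panels = split_into_panels(records, "species")
```

Each value in `panels` holds the points for one panel, so every small plot can share the same x and y axes.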
Higher dimensions
3D plots are often a bad idea because …
Trying to show 3 dimensions in 2D often causes confusion and a lack of clarity.
Higher dimensions
Why might using size or transparency not work well for a scatter plot?
It’s often hard, if not impossible, to distinguish overlapping data points.
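A small sketch of why transparency in particular becomes ambiguous when points overlap. It assumes the renderer uses standard "over" alpha compositing and that the overlapping points share a color; the function name `stacked_opacity` is hypothetical.

```python
def stacked_opacity(alphas):
    """Combined opacity of same-colored points drawn on top of each other."""
    remaining = 1.0  # fraction of the background still showing through
    for alpha in alphas:
        remaining *= 1.0 - alpha
    return 1.0 - remaining


# Two faint points stacked together versus one stronger point on its own.
two_faint = stacked_opacity([0.3, 0.3])
one_strong = stacked_opacity([0.51])
```

Both results come out at about 0.51, so a reader cannot tell whether a mark is one point with a high value or two overlapping points with low values.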
Using color
What are the three main color spaces?
- RGB (Red, Green, Blue)
- CMYK (Cyan, Magenta, Yellow, Black)
- HCL (Hue, Chroma, Luminance)
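To make the relationship between two of these spaces concrete, here is a naive RGB to CMYK conversion. It is only a sketch: it assumes channel values in the range 0 to 1 and ignores ink behavior and device color profiles, which real print workflows depend on. The function name `rgb_to_cmyk` is hypothetical.

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB -> CMYK conversion for channel values in [0, 1]."""
    k = 1.0 - max(r, g, b)  # black is whatever the brightest channel leaves out
    if k == 1.0:  # pure black: avoid dividing by zero below
        return 0.0, 0.0, 0.0, 1.0
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return c, m, y, k


red = rgb_to_cmyk(1.0, 0.0, 0.0)    # full magenta and yellow, no cyan or black
black = rgb_to_cmyk(0.0, 0.0, 0.0)  # black ink only
```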
Using color
RGB stands for
Red, Green, Blue
Using color
CMYK stands for
Cyan, Magenta, Yellow, Black
Using color
HCL stands for
Hue, Chroma, Luminance
Using color
The color space best suited to data viz is:
HCL
(Hue, Chroma, Luminance)
Using color
The three types of color scale usage for data viz are:
- Qualitative
- Sequential
- Diverging
Using color
A Qualitative color scale uses …
hue to distinguish unordered categories.
Using color
A Sequential color scale uses …
a change in chroma or luminance to show ordering.
Using color
A Diverging color scale uses …
a change in chroma or luminance with two hues and is meant to show above or below a midpoint.
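The three scale types can be sketched with Python's standard `colorsys` module. One loud assumption: the standard library has no HCL conversion, so HLS (hue, lightness, saturation) stands in for it here. HLS is not perceptually uniform the way HCL is, so this shows the structure of each scale rather than a production palette. The function names and default values are hypothetical.

```python
import colorsys


def qualitative(n, lightness=0.6, saturation=0.6):
    """n colors that differ only in hue, for unordered categories."""
    return [colorsys.hls_to_rgb(i / n, lightness, saturation) for i in range(n)]


def sequential(n, hue=0.6, saturation=0.6):
    """n colors (n >= 2) in one hue, running light to dark to show order."""
    return [
        colorsys.hls_to_rgb(hue, 0.9 - 0.6 * i / (n - 1), saturation)
        for i in range(n)
    ]


def diverging(n_per_side, low_hue=0.6, high_hue=0.0, saturation=0.6):
    """Two hues (n_per_side >= 2 each) that fade toward a light neutral midpoint."""
    low = sequential(n_per_side, hue=low_hue, saturation=saturation)[::-1]  # dark -> light
    high = sequential(n_per_side, hue=high_hue, saturation=saturation)      # light -> dark
    midpoint = [colorsys.hls_to_rgb(0.0, 0.95, 0.0)]  # zero saturation gives a near-white gray
    return low + midpoint + high
```

Each function returns a list of `(r, g, b)` tuples with values between 0 and 1, which is the form most plotting libraries accept.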
Plotting many variables at once
If you have several variables that are a mix of continuous and categorical, what plot type might you use?
A pair plot.
A pair plot is a matrix of plots in rows and columns that are a combination of bar plots, histograms, scatter plots, and correlations.
Plotting many variables at once
Panels on the diagonal of a pair plot show _____________ of variables.
distributions
This could either be a bar plot when the variable is categorical or a histogram if the variable is continuous.
Plotting many variables at once
Panels off the diagonal show ___________ between pairs of variables.
relationships
When both variables are continuous you see scatter plots of each pair of variables and their correlation.
Plotting many variables at once
When should you use a pair plot?
- When you have up to 10 variables (either continuous, categorical, or a mix).
- When you want to see the distribution for each variable.
- When you want to see the relationship between each pair of variables.
Plotting many variables at once
In a pair plot, when comparing a categorical variable to a continuous variable you get what kind of plot(s)?
A box plot and a histogram of the continuous variable split by the categorical variable.
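The panel rules from the pair plot cards can be summarized as a small lookup. This is a sketch of the rules as stated in these cards, not of any particular library's behavior; the function name `panel_type` and the kind labels are hypothetical, and the categorical versus categorical case is left unspecified because the cards do not cover it.

```python
def panel_type(row_kind, col_kind, on_diagonal):
    """Return the plot the cards describe for one cell of a pair plot.

    Kinds are "continuous" or "categorical".
    """
    if on_diagonal:
        # Diagonal panels show the distribution of a single variable.
        return "histogram" if row_kind == "continuous" else "bar plot"
    kinds = {row_kind, col_kind}
    if kinds == {"continuous"}:
        return "scatter plot or correlation"
    if kinds == {"continuous", "categorical"}:
        return "box plot or histogram split by category"
    return "not specified in the cards"  # both variables categorical
```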
Plotting many variables at once
When you have lots of continuous variables and need a simple overview of their relationships, what plot type might you use?
A correlation heat map.
Plotting many variables at once
When should you use a correlation heat map?
- When you have lots of continuous variables.
- When you want a simple overview of how each pair of variables is related.
Plotting many variables at once
Each intersection of a correlation heat map shows what?
The correlation value between the continuous variables.
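A minimal sketch of the numbers behind a correlation heat map. It assumes Pearson correlation, which the cards do not specify, and uses made-up sample columns; the function names are hypothetical. Each entry of the resulting matrix is the value that one cell of the heat map would color.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Correlation for every pair of columns, keyed by column name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up data: weight rises with height, age falls as height rises.
columns = {
    "height": [1.0, 2.0, 3.0, 4.0],
    "weight": [2.0, 4.0, 6.0, 8.0],
    "age": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(columns)
```

The matrix is symmetric with values close to 1 on the diagonal, which is why heat maps often show only one triangle.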
Plotting many variables at once
When you have lots of continuous variables and want to find patterns, what plot type might you use?
A parallel coordinates plot.
Plotting many variables at once
When should you use a parallel coordinates plot?
- When you have lots of continuous variables.
- When you want to find patterns across these variables,
- Or when you want to visualize clusters of observations.
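A sketch of the data preparation behind a parallel coordinates plot. It assumes each variable is min-max rescaled to the range 0 to 1 so that all axes share one vertical range; that is a common choice, not something the cards state. Each observation then becomes one polyline across the axes. The function names and sample data are hypothetical.

```python
def rescale_columns(columns):
    """Min-max rescale each variable to [0, 1] so every axis has the same range."""
    scaled = {}
    for name, values in columns.items():
        lo, hi = min(values), max(values)
        span = hi - lo
        # A constant column has no spread, so place it mid-axis.
        scaled[name] = [(v - lo) / span if span else 0.5 for v in values]
    return scaled


def polylines(columns):
    """One list of axis heights per observation, ordered by variable."""
    scaled = rescale_columns(columns)
    return [list(point) for point in zip(*scaled.values())]


# Made-up data: three observations measured on two variables.
lines = polylines({"a": [1.0, 2.0, 3.0], "b": [10.0, 30.0, 20.0]})
```

Observations whose polylines follow similar paths across the axes form the clusters and patterns the plot is meant to reveal.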