r_ggplot Flashcards
What does the “gg” in “ggplot” stand for?
- Grammar of Graphics
How do you load ggplot?
- library(ggplot2)
What is one limitation of ggplot?
- Works exclusively with data tables
- In these data tables:
- rows have to be observations
- columns have to be variables
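A minimal sketch of what this layout means, using a small made-up table (the state names, years, and counts below are hypothetical and only illustrate the shape of the data):

```r
# Wide layout: one column per year, so the variable "year" is hidden in column names
wide <- data.frame(state = c("A", "B"), y2010 = c(5, 9), y2011 = c(6, 8))

# Layout ggplot expects: one row per observation (a state in a year),
# one column per variable (state, year, total)
tidy <- data.frame(
  state = rep(wide$state, times = 2),
  year  = rep(c(2010, 2011), each = 2),
  total = c(wide$y2010, wide$y2011)
)
print(tidy)
```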
What happens if you do the following?
- murders %>% ggplot()
- This renders a blank plot, since no geometry has been defined.
What is the Median Absolute Deviation (MAD)?
- mad(x)
- robust measure of spread (variability)
- not sensitive to the presence of outliers
- unlike the mean and the standard deviation
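A minimal sketch of this robustness, assuming a small made-up vector of heights (the values are illustrative and not taken from any dataset in these cards):

```r
# Illustrative data: ten plausible heights in inches (hypothetical values)
x <- c(60, 62, 63, 64, 65, 66, 67, 68, 70, 72)

# Copy with one data-entry error: the first value becomes an extreme outlier
x_err <- x
x_err[1] <- 6000

# The standard deviation is inflated by the single outlier
sd_clean <- sd(x)
sd_err   <- sd(x_err)

# The MAD changes only slightly
mad_clean <- mad(x)
mad_err   <- mad(x_err)

print(c(sd_clean = sd_clean, sd_err = sd_err))
print(c(mad_clean = mad_clean, mad_err = mad_err))
```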
How can you use exploratory data analysis to detect that an error was made?
- A boxplot, histogram, or qq-plot would reveal a clear outlier
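As a rough sketch, base R's boxplot.stats() reports the points a boxplot would draw beyond its whiskers. The vector below is made up for illustration, with one deliberately wrong entry:

```r
# Hypothetical heights in inches with one mistyped value (6000)
x <- c(60, 62, 63, 64, 65, 66, 67, 68, 70, 72, 6000)

# $out holds the values lying beyond the boxplot whiskers
outliers <- boxplot.stats(x)$out
print(outliers)
```

Calling boxplot(x), hist(x), or qqnorm(x) on the same vector should make the same point stand out visually.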
Given the following:
- x <- Galton$child
Do the following:
- Write a function called error_avg that takes a value k and returns the average of the vector x after the first entry is changed to k.
- Show the results for k=10000 and k=-10000.
- error_avg <- function(k){
  x[1] <- k  # changes a local copy; the global x is untouched
  return(mean(x))
}
error_avg(10000)
error_avg(-10000)
What are layers in ggplot?
- In ggplot, graphs are created by adding layers
- They are added component by component
- Layers can:
- define geometries
- compute summary statistics
- define what scales to use
- even change styles
- To add layers, we use the + symbol
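The layering idea can be sketched as follows. The built-in mtcars data frame is used here only as a stand-in so the example is self-contained; the column choices and labels are assumptions made for illustration:

```r
library(ggplot2)

# Start with a plot object tied to a data frame (no layers yet)
p <- ggplot(mtcars)

# Add a geometry layer: points with wt on the x-axis and mpg on the y-axis
p <- p + geom_point(aes(x = wt, y = mpg))

# Add a scale component: log10 scale on the x-axis
p <- p + scale_x_log10()

# Add labels and a style (theme) component
p <- p + xlab("Weight (1000 lbs)") + ylab("Miles per gallon") + theme_bw()
```

Each + adds one component to the same plot object; running print(p) draws the result.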
Create a ggplot scatter plot with the following:
- “murders” dataset
- x-axis: population/10^6
- y-axis: total
murders %>% ggplot() +
  geom_point(aes(x = population/10^6, y = total))
What are the functions to add labels to the x and y axes, and to add a title to the plot?
- x-axis label: xlab("<label>")
- y-axis label: ylab("<label>")
- plot title: ggtitle("<title>")
How do you color points by a category of a variable, such as region? Give an example.
- We have to use a mapping
- To map each point to a color, we need to use aes
- geom_point( aes( col=region), size = 3)
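A minimal sketch of the difference between mapping and setting. It assumes the built-in mtcars data as a stand-in, with cyl playing the role of region:

```r
library(ggplot2)

# mtcars stands in for the murders data; cyl plays the role of region
d <- mtcars
d$cyl <- factor(d$cyl)  # treat cylinder count as a category

# Inside aes(): col is mapped to a variable, so each category gets its own color
# Outside aes(): size = 3 is a fixed setting applied to every point
p <- ggplot(d, aes(x = wt, y = mpg)) +
  geom_point(aes(col = cyl), size = 3)
```

Because col sits inside aes(), ggplot also builds a legend for the categories automatically.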
How would you add a line to the plot with the following characteristics:
- Dashed
- Intercept of log10(r)
- Darkgrey color
- geom_abline(intercept = log10(r), lty = 2, color = "darkgrey")
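One way this line could come about is sketched below. It rests on two assumptions not stated in the cards: r is taken to be an overall rate per million people, and both axes are on a log10 scale. The data frame d is a hypothetical stand-in for the murders data:

```r
library(ggplot2)

# Hypothetical stand-in for the murders data: population and total counts
d <- data.frame(
  population = c(1e6, 5e6, 2e7),
  total      = c(30, 120, 700)
)

# Assumed definition of r: overall rate per million people
r <- sum(d$total) / sum(d$population) * 10^6

# On log10-log10 axes, y = r * x becomes log10(y) = log10(r) + log10(x),
# a line with slope 1 (the geom_abline default) and intercept log10(r)
p <- ggplot(d, aes(x = population / 10^6, y = total)) +
  geom_point(size = 3) +
  geom_abline(intercept = log10(r), lty = 2, color = "darkgrey") +
  scale_x_log10() +
  scale_y_log10()
```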
How do you do the following:
- Change the legend label from “region” to “Region”
- scale_color_discrete(name = "Region")
How would you do the following:
- Change the plot theme to “theme_economist”
- library(ggthemes)
- theme_economist()
Create a plot of the following:
- data: male heights from the heights dataset
- bin width: 1
- color: bars blue with black border
- label: (x-axis) “Male heights in inches”
- title: (plot) “Histogram”
p <- heights %>%
  filter(sex == "Male") %>%
  ggplot(aes(x = height)) +
  geom_histogram(binwidth = 1, fill = "blue", col = "black") +
  xlab("Male heights in inches") +
  ggtitle("Histogram")