Data wrangling Flashcards
(7 cards)
What are the rows and columns called in a data frame?
Observations and variables, respectively.
How can you see the number of rows and columns in a tibble when using the dplyr library?
They are displayed at the top of the tibble when it is printed.
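As a minimal sketch of the card above, the following builds a small tibble and checks its dimensions. The data and the columns country and pop are made up for illustration. dim(), nrow() and ncol() are base R functions that report the same numbers as the printed header.

```r
library(dplyr)

# Hypothetical example data; the values are invented for illustration.
my_tibble <- tibble(
  country = c("United States", "Canada", "Mexico"),
  pop     = c(331000000, 38000000, 126000000)
)

# Printing shows a header line with the dimensions (3 rows, 2 columns here).
print(my_tibble)

# The same information, retrieved programmatically.
dims   <- dim(my_tibble)   # rows first, then columns
n_rows <- nrow(my_tibble)
n_cols <- ncol(my_tibble)
```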
Which dplyr verb allows you to look at a subset of observations based on a specific condition?
filter()
Ex: filter(my_tibble, country == "United States")
What is the syntax and definition of a pipe?
%>%
A pipe takes whatever is before it and feeds it in as the first argument of the function in the next step.
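A short sketch of what that means in practice: the piped call and the ordinary nested call below should give the same result. The tibble and its columns are hypothetical, chosen only to make the example runnable.

```r
library(dplyr)

# Hypothetical example data.
my_tibble <- tibble(
  country = c("United States", "Canada", "United States"),
  year    = c(2002, 2007, 2007)
)

# Ordinary call: the tibble is passed as the first argument.
nested <- filter(my_tibble, country == "United States")

# Piped call: the tibble on the left becomes the first argument of filter().
piped <- my_tibble %>%
  filter(country == "United States")

# Both forms keep the same rows.
same_result <- isTRUE(all.equal(nested, piped))
```

The piped form reads left to right, which helps once several verbs are chained together.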
Which dplyr verb sorts the observations in a dataset in ascending or descending order based on one of its variables?
arrange()
Ex: arrange(my_tibble, gdpPerCap)
How can you arrange in descending order?
Add desc() around the variable you are arranging by.
Ex: my_tibble %>%
      arrange(desc(gdpPerCap))
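To compare the two sort orders side by side, here is a small sketch with invented data. The country labels and gdpPerCap values are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical example data.
my_tibble <- tibble(
  country   = c("A", "B", "C"),
  gdpPerCap = c(3000, 45000, 12000)
)

# Default: ascending order (smallest value first).
ascending <- my_tibble %>%
  arrange(gdpPerCap)

# Wrapping the variable in desc() reverses the order (largest value first).
descending <- my_tibble %>%
  arrange(desc(gdpPerCap))

ascending$country    # "A" "C" "B"
descending$country   # "B" "C" "A"
```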
Which dplyr verb allows you to change one of the variables in your dataset, or add a new variable?
mutate()
Ex: my_tibble %>%
      mutate(pop = pop / 1000000)
Ex: my_tibble %>%
      mutate(gdp = gdpPerCap * pop)
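The two uses of mutate() can be combined in one pipeline, as in the sketch below. The data is invented for illustration. Note that the order is a deliberate choice here: gdp is computed before pop is rescaled, because mutate() overwrites pop in place and a later gdpPerCap * pop would then use population in millions.

```r
library(dplyr)

# Hypothetical example data; pop is a raw head count.
my_tibble <- tibble(
  country   = c("A", "B"),
  pop       = c(2000000, 50000000),
  gdpPerCap = c(1000, 20000)
)

result <- my_tibble %>%
  mutate(gdp = gdpPerCap * pop) %>%   # add a new variable: total GDP
  mutate(pop = pop / 1000000)         # change an existing variable: pop in millions

result$gdp   # 2e+09 1e+12
result$pop   # 2 50
```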