r_basics Flashcards
Load the ‘nycflights’ data frame?
data(nycflights)
View the names of the variables for the ‘nycflights’ data frame?
names(nycflights)
View the names of variables AND data types for the ‘nycflights’ data frame?
str(nycflights)
What are the two ways to assign the carrier variable for the nycflights data frame to a variable ‘a’?
a <- nycflights$carrier
a <- nycflights[["carrier"]]
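The two extraction forms can be compared on a small stand-in data frame. This is a minimal sketch: flights_toy and its values are made up for illustration and are not part of nycflights.

```r
# Hypothetical toy data frame standing in for nycflights.
flights_toy <- data.frame(
  carrier   = c("AA", "UA", "DL"),
  dep_delay = c(5, -2, 30)
)

a <- flights_toy$carrier        # extract a column with $
b <- flights_toy[["carrier"]]   # extract the same column with [[ ]]

# [[ ]] also accepts a column name stored in a variable; $ does not.
col_name <- "carrier"
d <- flights_toy[[col_name]]

same_ab <- identical(a, b)      # both forms return the same vector
same_ad <- identical(a, d)
```

Both comparisons come out TRUE, so the choice between the two forms is mostly about whether the column name is typed literally or held in a variable.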
Assign a day of the week (“Monday” - “Friday”) to each element in the following vector:
- poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
Given the following vector:
- poker_vector <- c(140, -50, 20, -120, 240)
Assign the middle three values to a new variable “poker_midweek”
poker_midweek <- poker_vector[c(2,3,4)]
Given the following vector:
- roulette_vector <- c(-24, -50, 100, -350, 10)
Assign to “roulette_selection_vector” the roulette results from Tuesday through Friday (elements 2 to 5)
roulette_selection_vector <- roulette_vector[2:5]
Given the following:
- poker_vector <- c(140, -50, 20, -120, 240)
- days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
- names(poker_vector) <- days_vector
Select the first three elements in poker_vector by using their names: “Monday”, “Tuesday”, and “Wednesday”.
Assign the result of the selection to poker_start.
poker_start <- poker_vector[c("Monday", "Tuesday", "Wednesday")]
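The three selection styles from the cards above (explicit positions, a range, and names) can be run side by side. This is a short sketch using the same poker values; note that R code needs straight quotes, not curly ones.

```r
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Select by explicit positions.
poker_midweek <- poker_vector[c(2, 3, 4)]

# Select by a range of positions; 2:4 is shorthand for c(2, 3, 4).
poker_midweek_range <- poker_vector[2:4]

# Select by element names.
poker_start <- poker_vector[c("Monday", "Tuesday", "Wednesday")]
```

Each selection keeps the names of the chosen elements, so poker_start is still labeled Monday through Wednesday.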
Get the length of the carrier variable (column) in the nycflights data frame
carrier <- nycflights$carrier
length(carrier)
Check if the variables ‘a’ and ‘b’ are identical
identical(a,b)
In a nested way, determine the number of regions defined by this dataset and contained in murders$region.
length(levels(murders$region))
Use the table function in one line of code to create a table showing the number of states per region in the murders data set.
table(murders$region)
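The last two cards both work on a factor, which can be tried without the murders data. This is a minimal sketch: the region factor below is a made-up stand-in for murders$region, not the real data.

```r
# Hypothetical factor standing in for murders$region.
region <- factor(c("South", "West", "West", "Northeast", "South", "West"))

# Nested call: levels() returns the distinct categories, length() counts them.
n_regions <- length(levels(region))

# table() counts how many elements fall in each level.
region_counts <- table(region)
```

Here n_regions is 3, and region_counts holds one count per level, listed in level order (alphabetical by default).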
View the first five lines of the data frame?
head(nycflights, 5)
What are seven key functions in the ‘dplyr’ package?
- filter()
- arrange()
- select()
- distinct()
- mutate()
- summarise()
- sample_n()
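Each of the seven functions listed above can be tried once on a built-in data set. This is a rough sketch, assuming the dplyr package is installed; mtcars stands in for nycflights, and the column name hp_per_wt is made up for illustration.

```r
library(dplyr)  # assumes the dplyr package is installed

# mtcars ships with R; it stands in for nycflights here.
four_cyl   <- filter(mtcars, cyl == 4)                 # keep rows matching a condition
by_mpg     <- arrange(mtcars, desc(mpg))               # sort rows, highest mpg first
two_cols   <- select(mtcars, mpg, cyl)                 # keep only the named columns
cyl_values <- distinct(mtcars, cyl)                    # one row per unique value of cyl
with_ratio <- mutate(mtcars, hp_per_wt = hp / wt)      # add a derived column (hypothetical name)
avg_mpg    <- summarise(mtcars, mean_mpg = mean(mpg))  # collapse to a single summary row
five_rows  <- sample_n(mtcars, 5)                      # draw 5 rows at random
```

Every function takes a data frame as its first argument and returns a new data frame, which is why the calls are often chained with the %>% pipe.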
Use the ‘ggplot’ function to plot a histogram of the ‘dep_delay’ variable from the ‘nycflights’ data frame with a bin width of 150
ggplot(data = nycflights, aes(x = dep_delay)) + geom_histogram(binwidth = 150)