Tidyverse Flashcards

1
Q

Which Tidyverse package provides ways to ingest rectangular data? Which functions does it offer for this, and what class of object do they return?

A

The package is readr. The following functions all return a tibble:

read_delim(): general delimited files

read_csv(): comma separated (CSV) files

read_tsv(): tab separated files

read_fwf(): fixed width files

read_table(): tabular files where columns are separated by whitespace

read_log(): web log files
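
As a minimal sketch of the functions above, the snippet below parses a small inline CSV string with read_csv() and the same data with read_delim(). It assumes readr 2.0 or later, where text wrapped in I() is treated as literal data rather than a file path; the sample values are made up for illustration.

```r
library(readr)

# Literal CSV text; I() tells readr (>= 2.0) this is data, not a path.
csv_text <- I("name,age\nLucas,12\nJose,45\nTales,42\n")
people <- read_csv(csv_text, show_col_types = FALSE)

# The same data with a custom delimiter, read through read_delim().
pipe_text <- I("name|age\nLucas|12\nJose|45\nTales|42\n")
people_delim <- read_delim(pipe_text, delim = "|", show_col_types = FALSE)

# Both results are tibbles (their class vector includes "tbl_df").
inherits(people, "tbl_df")
inherits(people_delim, "tbl_df")
```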

2
Q

What are the 8 core packages in the Tidyverse, and what is the purpose of each?

A
  • ggplot2 - create graphics
  • dplyr - data manipulation
  • tidyr - helps to create tidy data
  • readr - read rectangular data
  • purrr - functional programming tools for working with functions and vectors
  • tibble - a modern, enhanced data frame
  • stringr - functions for string manipulation
  • forcats - tools for working with factors
  • Note: tidyverse 2.0.0 and later also attach lubridate (dates and times) as a ninth core package.
3
Q

How can I create a simple tibble like this one?

# A tibble: 3 x 2
  name    age
1 Lucas    12
2 Jose     45
3 Tales    42

A

The same way as a data frame: call tibble() and pass one vector per column:

df1 <- tibble(
  name = c("Lucas", "Jose", "Tales"),
  age = c(12, 45, 42)
)

4
Q

What is a tibble? What’s the class and type of a tibble object?

A

A tibble is an enhanced data frame that makes it easier to work with tidy, consistent data. It is the central data structure of the Tidyverse.

A tibble has type list and class tbl_df. tbl_df is a subclass of data.frame (the full class vector is tbl_df, tbl, data.frame) with different default behavior, such as more compact printing and stricter subsetting.
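
A quick way to check the type and class described above is to inspect a small tibble directly. The sketch below rebuilds a three-row tibble like the one from card 3 so that it runs on its own, and also shows one of the differing defaults: single-column subsetting with [.

```r
library(tibble)

df1 <- tibble(
  name = c("Lucas", "Jose", "Tales"),
  age = c(12, 45, 42)
)

typeof(df1)                   # "list"
class(df1)                    # "tbl_df" "tbl" "data.frame"
inherits(df1, "data.frame")   # TRUE

# Different default: [ on a tibble always returns a tibble,
# while a plain data frame drops a single column to a vector.
tbl_col <- df1[, "age"]
df_col <- as.data.frame(df1)[, "age"]
class(tbl_col)                # "tbl_df" "tbl" "data.frame"
class(df_col)                 # "numeric"
```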

5
Q

How can I convert a data frame into a tibble? What’s the package being used?

A

Use as_tibble() from the tibble package:

as_tibble(mtcars)

6
Q

How can I sort tibble df1 by column age? Which package is being used?

A

Use arrange() from the dplyr package:

df1 %>%
  arrange(age)

7
Q

How can I sort tibble df1 by column age in descending order? Which package is being used?

A

Use arrange() together with desc(), both from the dplyr package:

df1 %>%
  arrange(desc(age))
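
To try both sort orders end to end, here is a small self-contained sketch. It rebuilds a tibble like df1 from card 3; the variable names ascending and descending are only illustrative.

```r
library(dplyr)
library(tibble)

df1 <- tibble(
  name = c("Lucas", "Jose", "Tales"),
  age = c(12, 45, 42)
)

# Ascending is the default; desc() reverses the order for that column.
ascending <- df1 %>% arrange(age)
descending <- df1 %>% arrange(desc(age))

ascending$name    # "Lucas" "Tales" "Jose"
descending$name   # "Jose" "Tales" "Lucas"
```

Note that arrange() returns a new tibble and leaves df1 itself unchanged.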

8
Q

How can I select only the columns “mpg”, “cyl” and “gear” of tibble cars? What’s the package involved?

A

Use select() from the dplyr package:

cars %>%
  select(c("mpg", "cyl", "gear")) # with a character vector

or

cars %>%
  select(mpg, cyl, gear) # with bare column names, no vector or quotes
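
The two forms can be compared in a runnable sketch. It assumes the card's cars tibble is the built-in mtcars data converted with as_tibble() (mtcars is where the mpg, cyl and gear columns live), and it uses all_of() for a column vector stored in a variable, which is the form tidyselect recommends for external vectors. The name wanted is illustrative.

```r
library(dplyr)
library(tibble)

# Assumption: `cars` stands for mtcars as a tibble; this masks datasets::cars.
cars <- as_tibble(mtcars)

# Bare column names.
by_name <- cars %>% select(mpg, cyl, gear)

# Column names held in a character vector, wrapped in all_of().
wanted <- c("mpg", "cyl", "gear")
by_vector <- cars %>% select(all_of(wanted))

names(by_name)     # "mpg" "cyl" "gear"
names(by_vector)   # "mpg" "cyl" "gear"
```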

9
Q

How can I select rows from 2 to 5 in my tibble df1? What’s the required package?

A

Use slice() from the dplyr package:

df1 %>%
  slice(2:5)

10
Q

Is working with row names recommended in tibbles? Why or why not?

A

Generally, it is best to avoid row names, because they are essentially a character column with different semantics from every other column. Tibbles drop them when subsetting with the [ operator.

If needed, the tibble package provides functions to convert row names to an explicit column and back, such as rownames_to_column() and column_to_rownames().
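
A minimal sketch of that round trip, using the built-in mtcars data frame (whose row names are car models). The column name "model" and the variable names are illustrative choices.

```r
library(tibble)

has_rownames(mtcars)   # TRUE: mtcars stores the car models as row names

# Move the row names into an explicit column, then convert to a tibble.
cars_tbl <- as_tibble(rownames_to_column(mtcars, var = "model"))

# Going back: turn the explicit column into row names on a plain data frame.
back <- column_to_rownames(as.data.frame(cars_tbl), var = "model")

head(cars_tbl$model, 2)   # "Mazda RX4" "Mazda RX4 Wag"
```

as_tibble(mtcars, rownames = "model") is a shorter way to do the first step in a single call.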
