1.1.2 Flashcards

(20 cards)

1
Q

What does the %>% symbol do?

A

Takes the output of the left-hand side (LHS) and passes it as the first argument of the function on the right-hand side (RHS)

2
Q

What is an example of the %>% symbol in use?

A

library(tidyverse)
starwars2 %>%
  summary()

3
Q

Give 2 examples to show the difference between inside-out and left-to-right code

A

Inside-out
summary(as.factor(starwars2$homeworld))

Left-to-right
starwars2$homeworld %>%
  as.factor() %>%
  summary()
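To check that the two styles give the same result, here is a minimal sketch. It assumes dplyr is installed, and it uses a small made-up `homeworld` vector as a stand-in for `starwars2$homeworld`, which the card does not define.

```r
library(dplyr)  # loads the %>% pipe

# Hypothetical stand-in for starwars2$homeworld
homeworld <- c("Tatooine", "Naboo", "Tatooine", "Alderaan", "Naboo", "Tatooine")

# Inside-out: read from the innermost call outwards
inside_out <- summary(as.factor(homeworld))

# Left-to-right: each step feeds the next one
left_to_right <- homeworld %>%
  as.factor() %>%
  summary()

# Both styles run the same function calls, so the results match
identical(inside_out, left_to_right)
```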

4
Q

What does summary() provide for continuous variables?

A

Numeric descriptions of the distribution of values in each variable: minimum, 1st quartile, median, mean, 3rd quartile, and maximum

5
Q

What does the distribution of a variable show?

A

The frequency of different values

6
Q

What is a frequency distribution?

A

An overview of all values in some variable and how many times they occur

7
Q

How can we access the frequencies of different response levels of a variable in a dataframe?

A

dataframe %>%
  count(variable_name)

8
Q

What is the mode and what type of data is it used to measure central tendency for?

A

The most frequent value
Unordered categorical data
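R has no built-in function for the statistical mode, so one way to find it is to count the values and keep the most frequent one. This is only a sketch under assumptions: the `responses` data frame and its `colour` column are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data: six responses in three unordered categories
responses <- tibble(colour = c("red", "blue", "red", "green", "red", "blue"))

# Count each category, then keep the row(s) with the highest count.
# slice_max() keeps ties by default, so several modes may be returned.
mode_row <- responses %>%
  count(colour) %>%
  slice_max(n, n = 1)

mode_row$colour
```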

9
Q

What is a relative frequency distribution, what does it show, and how can it be written?

A

Percentage of respondents in each category
Proportion of times each value occurs
Decimals, fractions, percents

10
Q

What are two ways we can calculate the relative frequency distributions in a dataframe?

A

Make a frequency table, then divide each count by the total:
freq_table$n / sum(freq_table$n)
OR
freq_table <-
  dataframe %>%
  count(variable_name) %>%
  mutate(
    prop = n / sum(n)
  )
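Both approaches can be tried on a small example. The sketch below assumes dplyr is installed; the `responses` data frame and its `colour` column are hypothetical and stand in for `dataframe` and the variable name.

```r
library(dplyr)

# Hypothetical toy data: six responses in three categories
responses <- tibble(colour = c("red", "blue", "red", "green", "red", "blue"))

# Way 2: build the frequency table and add the proportions in one pipeline
freq_table <- responses %>%
  count(colour) %>%
  mutate(prop = n / sum(n))

# Way 1: compute the proportions from the finished table's n column
base_prop <- freq_table$n / sum(freq_table$n)

freq_table
```

The proportions from the two approaches should agree, and they should sum to 1.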

11
Q

How can we plot values in a bar chart?

A

ggplot(data = dataframe, aes(x = entry, y = entry)) +
  geom_col()

12
Q

How can we change the axis labels?

A

labs(title = "entry", x = "entry", y = "entry")

13
Q

How can we make a scatterplot?

A

ggplot(data = dataframe, aes(x = entry, y = entry)) +
  geom_point()

14
Q

How can we change the limits of the axis?

A

ylim(min, max) for the y-axis, or xlim(min, max) for the x-axis

15
Q

How can we remove the legend?

A

theme(legend.position = "none")
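The plotting pieces from the cards on bar charts, axis labels, axis limits, and legends can be combined into one plot by adding layers with `+`. This is a minimal sketch, assuming dplyr and ggplot2 are installed; the `responses` data, the labels, and the chosen limits are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data, summarised into a frequency table
responses <- tibble(colour = c("red", "blue", "red", "green", "red", "blue"))
freq_table <- responses %>%
  count(colour)

p <- ggplot(data = freq_table, aes(x = colour, y = n, fill = colour)) +
  geom_col() +                                   # bar heights taken from n
  labs(title = "Favourite colours",
       x = "Colour", y = "Count") +              # title and axis labels
  ylim(0, 5) +                                   # y-axis runs from 0 to 5
  theme(legend.position = "none")                # hide the fill legend

p  # printing the object draws the plot
```

The `fill = colour` mapping is what creates a legend in the first place, which is why removing it is useful here: the x-axis already names each bar.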

16
Q

How can we add the percentage and cumulative percentage of each response to a table?

A

dataframe %>%
  count(entry) %>%
  mutate(
    percent = n / sum(n) * 100,
    cumulative_percent = cumsum(percent)
  )
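A quick way to sanity-check this pipeline is to run it on a small example and confirm that the last cumulative percentage reaches 100. The sketch assumes dplyr is installed; `responses` and `colour` are hypothetical names.

```r
library(dplyr)

# Hypothetical toy data: six responses in three categories
responses <- tibble(colour = c("red", "blue", "red", "green", "red", "blue"))

pct_table <- responses %>%
  count(colour) %>%
  mutate(
    percent = n / sum(n) * 100,            # share of each category
    cumulative_percent = cumsum(percent)   # running total down the table
  )

pct_table
```

The arguments inside `mutate()` are separated by commas, and `cumulative_percent` can use `percent` because `mutate()` creates its columns in order.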

17
Q

What is the median?

A

The middle value when the values are sorted in order

18
Q

What does count() do?

A

Counts the number of occurrences of each unique value in a variable

19
Q

What does mutate() do?

A

Adds new variables/modifies existing variables in a dataframe

20
Q

What do min() and max() do?

A

Return the minimum/maximum value of a variable