R Flashcards

(444 cards)

1
Q

In R’s lattice, what argument makes panels fill left to right, top to bottom?

A

…, as.table = TRUE

2
Q

dplyr version of:

merge(x, y, all.x=T, all.y = T)

A

full_join(x, y)

3
Q

with stringr, return the 1st match for a regex?

A

str_extract(str, regex)

4
Q

with stringr, replace each vowel in x with "-"?

A

str_replace_all(x, "[aeiou]", "-")

5
Q

with stringr, replace 1 with one and 2 with two in x?

A

str_replace_all(x, c("1" = "one", "2" = "two"))

6
Q

with stringr, return all matches in a string for a regex?

A

str_extract_all(x, regex)

7
Q

dplyr version of:

merge(x, y)

A

inner_join(x, y)

8
Q

In R’s plot, set number size at tick marks?

A

plot(…, cex.axis = number)

9
Q

do with join operation:
flights %>%
filter(dest %in% top_dest$dest)

A

flights %>%

semi_join(top_dest)

10
Q

In R, after xaxt = "n", add ticks for the years 2008 and 2016?

A

axis.Date(1,
at = c(as.Date("2008-01-01"), as.Date("2016-01-01")),
labels = c("2008", "2016"))

11
Q

In R, set the outer margin to leave 2 lines for text on top and add "Title" there?

A

par(oma = c(0, 0, 2, 0))

mtext("Title", outer = TRUE)

12
Q

With stringr, treat NA as a string?

A

str_replace_na()

13
Q

With stringr, turn myVec (a vector) into one long string with no spaces?

A

str_c(myVec, collapse = "")

14
Q

In R’s plot(), set point type?

A

plot(…, pch = n), where n is a symbol code from 0 to 25 (a single character also works)

15
Q

In R, add a surrogate key to dat?

A

dat %>%

mutate(surrogate_key = row_number())

16
Q

Confirm tailnum is the primary key in planes in R?

A

planes %>%
count(tailnum) %>%
filter(n > 1)

17
Q

In base R, what function is critical to unique arrangements of plots?

A

layout()

18
Q

In R, add a line for a linear model with y & x?

A

abline(lm(y ~ x))

19
Q

In R’s plot(), set axis label size?

A

plot(…, cex.lab = #)

20
Q

stringr’s function to filter string matches?

A

str_subset()

21
Q

Make tidy with tidyr:
tablea

country ‘99’ ‘00’ ‘01’
A x, y, z
B …
C

A

tablea %>%

gather(`99`:`01`, key = "year", value = "measure")

22
Q

with stringr, return the 1st match in each sentence?

sentences %>%
_______("(a|the) ([^ ]+)")

A

str_match

23
Q

in base R, calculate the mean of variables in DAT at each level of FACTOR?

A

by(DAT, FACTOR, colMeans)

24
Q

In R, add "label" on the right side of an existing plot in the outer margin?

A

mtext("label", 4, outer = TRUE)

25
In R, create a function that returns hello or goodbye based on user's choice?
myFunc
26
In ggplot, how do you zoom in to the range 0-50 on the y-axis?
... + | coord_cartesian(ylim = c(0, 50))
27
In R, plot x as an overlay to an existing plot in the top right corner?
par(fig = c(0.5, 1, 0.5, 1), new = TRUE) | plot(x)
28
In R, how do you review this layout? nf
layout.show(nf)
29
In R, how can you define a number of regions within the current device that can be treated as separated graphics devices?
split.screen()
30
Why manually call regex() in stringr functions?
To set arguments, which include: ignore_case, multiline, comments, dotall
31
In R, create box plot from Y and FACTOR with notches?
plot(FACTOR, Y, notch = TRUE)
32
In R, transfer data from R to Excel?
1) write.table(data, "clipboard", sep = "\t", col.names = NA) | 2) paste in Excel
33
In R, set orientation of #s on tick marks to always horizontal?
par(las = 1)
34
In R, remove white space before and after string?
trimws(string)
35
In R, sum of each row of matrix X?
rowSums(x)
36
In R, get names in a factor variable?
levels(factor)
37
In R, get # of names in factor variable?
nlevels(factor)
38
In R, reorder names in a factor variable?
factor(factor, levels = c('name1', 'name2'))
39
In R, turn factor names into integers?
as.vector(unclass(factor))
40
In R, set # of digits to 5 for any output?
options("digits" = 5)
41
xyplot(root~week | plant): add a line for a regression?
xyplot(root ~ week | plant, | -> panel = function(x, y) { panel.xyplot(x, y); panel.abline(lm(y ~ x)) })
42
In R, generate a q-q plot?
qqnorm()
43
In R, given events A and B, and sample space S, calculate the probability that A or B (or both) occurs?
length(union(A, B)) / length(S)
44
Base R, return X where X is NA?
x[is.na(x)]
45
in R, reverse order of vector Y?
rev(Y)
46
In R, log to base n of x?
log(x, n)
47
In R, return max at each point of vector x?
cummax(x)
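A quick sketch of the running maximum, using a made-up vector (the values are illustrative, not from the card):

```r
# Hypothetical example vector
x <- c(1, 3, 2, 5, 4)

# Each position holds the largest value seen so far
running_max <- cummax(x)
print(running_max)  # 1 3 3 5 5
```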
48
In R, what is the square root of x?
sqrt(x)
49
In R, difftime() vs. as.difftime()?
difftime() calculates the interval between two dates or date-times; as.difftime() builds a difftime object from times (e.g., a number of minutes), not dates.
50
In R, calculate the 25th percentile of x?
quantile(x, 0.25)
51
In R, generate a list of 4 1s, 4 2s, and so on up to 10?
rep(1:10, each = 4)
52
In R, calculate the probability of A & B both occurring within sample space S?
length(intersect(A, B)) / length(S)
53
In R, what function is equivalent to IF() formula in excel?
ifelse()
54
In R, draw 5 items from vector Keys, allowing the same value to be drawn repeatedly?
sample(Keys, 5, replace = T)
55
Return df's column B?
df$B
56
In R, name m's columns AA, BB, CC?
colnames(m) <- c("AA", "BB", "CC")
57
In R, probability of only A, not B within sample space S?
length(setdiff(A, B)) / length(S)
58
In R, return a histogram of vector x and then add dashed density lines?
hist(x, freq = FALSE) | lines(density(x), lty = "dashed") | freq = FALSE puts the histogram on the density scale
59
In R, return the dimensions of vector v?
length(v) | dim(v) returns NULL for a plain vector
60
In R, return dat with each column's median subtracted?
sweep(dat, 2, apply(dat, 2, median))
61
In R, test for normality of dat & describe null hypothesis?
shapiro.test(dat) Null hypothesis = normally distributed
62
In R, X is weights and Y is heights. Create a scatter plot with X and Y labels and filled in dots?
plot(X, Y, xlab = "weight", ylab = "height", pch = 16)
63
Calculate the sum of the rows in m, by group?
rowsum(m, group)
64
What is the square root of x ?
sqrt(x)
65
In R, what is the working directory?
getwd()
66
In R, how do you create a list of words out of a string?
strsplit(str, " ")
67
In R, get product of all values in vector X?
prod(x)
68
In R: add function to remove NAs? newDat
na.omit(dat)
69
In R, x
x[which(abs(x - 50) == min(abs(x - 50)))]
70
In R, return just the means of Dat[,c(1, 2, 3)] by variables var5 and var6?
aggregate(Dat[,c(1,2, 3)], by=list(var5, var6), mean)
71
In R, how do you see data available in the loaded package "UsingR"?
data(package = "UsingR")
72
In R, is it daylight savings right now?
as.POSIXlt(Sys.time())$isdst
73
In R's legend, how do you set the fill color for symbols?
pt.bg = ...
74
In R, set tick marks to be on the inside by the default length?
tcl = 0.5
75
When using dplyr's arrange(), where do missing values end up?
the end
76
dates
strptime(dates, "%d%b%y")
77
With dplyr, return all columns from flights except year through day?
select(flights, -(year:day))
78
With dplyr, return columns from flights with "ijk" in the name?
select(flights, contains("ijk"))
79
dplyr command to reorder the rows?
arrange()
80
With dplyr, assign new names to specific columns while returning all columns?
rename()
81
With dplyr, put flights in descending order by distance?
arrange(flights, desc(distance))
82
With dplyr, put flights data in order by year, month and then day?
arrange(flights, year, month, day)
83
In R, find chi-square value for alpha, where x follows chi-square dist with 12 degrees of freedom?
qchisq(0.05, 12, lower.tail = FALSE) | lower.tail defaults to TRUE, so it must be set to FALSE for an upper-tail alpha
84
In R, command to find which package qplot is in?
find("qplot")
85
In R, what function is useful for mathematical notations inside the plots functions?
expression()
86
In R's lattice package, create a scatter plot for weight vs age given gender?
xyplot(weight ~ age | gender)
87
In ggplot, what grammar does size, shape, color and x/y locations relate to?
aesthetics... aes()
88
In ggplot, 2 ways to facet?
facet_wrap() | facet_grid()
89
What argument to jitter dots in geom_point?
position = "jitter"
90
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + -> geom_point() vs. ggplot(data = mpg) + -> geom_point(mapping = aes(x = displ, y = hwy))
Same graph, top uses global mapping and bottom uses local mapping
91
... + facet_grid(drv ~ .)
facets plot by drv along a column (up and down)
92
ggplot(data = diamonds) + -> geom_bar(aes(x = cut, fill = clarity)) 1) stacked bar chart? 2) 100% stacked bar chart? 3) Grouped bar chart?
position = ... 1) no argument 2) "fill" 3) "dodge"
93
To plot hwy ~ displ from mpg: ggplot(data = mpg) + -> geom_point(? 1 = ? 2(x = displ, y = hwy))
? 1 mapping | ? 2 aes
94
In R's plot(), argument for no tick marks and no #?
xaxt = "n", yaxt = "n"
95
In R, return items exclusive to A as compared to B?
setdiff(A, B)
96
In R, create a sample of 1000 families with 3 children and probability of 0, 1, 2, 3 boys as equal to 1/8, 3/8, 3/8, 1/8?
sample(0:3, size = 1000, prob = c(1/8, 3/8, 3/8, 1/8), replace = TRUE)
97
In R, transpose matrix m?
t(m)
98
In R, show all possible scatterplot dot types?
plot(0:25, pch = 0:25)
99
In R, 4 useful functions for the standard normal distribution?
pnorm(): cumulative probability | dnorm(): probability density | qnorm(): quantile function | rnorm(): random #s from the distribution
100
In R, return Test's attributes?
attributes(Test)
101
In R, how do you time a function?
system.time(functionName())
102
In R, create a QQ plot with a diagonal line for dat?
qqnorm(dat) | qqline(dat)
103
In R, check if file "fname.txt" exists in working directory?
file.exists("fname.txt")
104
A
lapply(list.object, length)
105
In R, with data.frame(DF), typeof(DF)?
"list"
106
In R's data.frame(), suppress factor creation?
stringsAsFactors = F
107
In R, mean of each row of matrix X?
rowMeans(X)
108
In R, what are attributes?
metadata about objects
109
In R's data frame df, what does length(df) return?
the same as ncol(df)
110
In R, set x's "cust_attr" attribute?
attr(x, "cust_attr") <- value
111
In R, what function is useful for running random #s through a formula?
replicate() remake this card...
112
In R, generate a plot on a 3-d plane using vectors x, y, z?
persp(x, y, z)
113
What 3 attributes stay with modified objects?
Names - names() Dimensions - dim() Class - class()
114
In R, describe bootstrap test for testing a mean?
1) Create vector of means based on samples from true data -> x.bar 2) p.val testVal)]) / length(x.bar)
115
In R, given vector A, B, and function F that takes 2 arguments, create an array of dimensions (A, B) that is the result of function(A,B) for each cell?
outer(A, B, F)
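A small sketch of outer() with a custom two-argument function. The vectors and the helper name combine are made up for illustration; the function must be vectorized, because outer() calls it once on expanded vectors rather than cell by cell:

```r
# Hypothetical inputs
A <- 1:3
B <- 1:2

# Illustrative two-argument function (vectorized)
combine <- function(a, b) a * 10 + b

# Cell [i, j] holds combine(A[i], B[j])
result <- outer(A, B, combine)
print(dim(result))   # 3 2
print(result[2, 1])  # 21
```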
116
In R, how do I print vector Y without printing missing values?
Y[!is.na(Y)]
117
not_cancelled %>% | -> count(dest)
A table showing count of not cancelled by dest
118
What 2 R packages are useful for larger, interactive heatmaps?
1) d3heatmap | 2) heatmaply
119
In an R plot, set x & y labels color and font?
col.lab = | font.lab =
120
In R, make a scatterplot matrix of the data in obj?
pairs(obj)
121
dplyr's measures of position: 1) x[1] 2) x[2] 3) x[length(x)]
1) first(x) 2) nth(x, 2) 3) last(x)
122
Geom for a tile plot?
geom_tile()
123
In R, calculate the mean of X without NAs?
mean(x, na.rm=T)
124
Rather than filtering out messy data, another--perhaps better--route?
Make the values missing
125
In R, find F value for alpha = 0.05 in the lower tail, where x follows f-dist and df1 = 5, df2 = 15?
qf(0.05, 5, 15)
126
In base R, x is a vector of age data. Create a histogram with an x-label, a title, and bins of size 20. Then add lines to the histogram?
hist(x, xlab = "Age", main = "title", breaks = 20, freq = FALSE) | lines(density(x))
127
In RStudio, start a new script?
ctrl - shift - n
128
ggplot histogram?
geom_histogram()
129
Convert data frame to tibble?
as_tibble()
130
ggplot frequency polygon?
geom_freqpoly()
131
not_cancelled %>% | -> count(tailnum, wt=distance)
A table showing miles flown by each tailnum among not_cancelled
132
With dplyr, return dat's columns var1, var2, var3?
select(dat, num_range("var", 1:3))
133
A function useful for 5<=x<=10?
between(x, 5, 10)
134
In base R, change the color of the axes?
par(fg = )
135
In R, return items in both A and B?
intersect(A, B)
136
ggplot's bar graph?
geom_bar()
137
readr's parsing functions when the data are already read into R?
parse_*()
138
delays 1? | -> filter(n > 25) 2? | -> ggplot(aes(x = n, y = delay)) 3? | -> geom_point(alpha = 1/10)
1) %>% 2) %>% 3) +
139
Create tibble from individual vectors?
tibble()
140
In readr, read in txt.csv and identify the comment lines as starting with #?
read_csv("txt.csv", comment = "#")
141
in readr functions, don't read first 5 lines?
skip = 5
142
In R, return 50th percentile of x?
median(x)
143
What ggplot function is critical for horizontal bar chart?
coord_flip()
144
In dplyr, how do you remove grouping?
ungroup()
145
In RStudio, send previously sent chunk from editor?
ctrl - shift - p
146
What is geom_bin2d() and geom_hex()?
Divides coordinate plane into 2d bins and uses fill color to show density
147
ggplot(smaller, aes(carat, price)) + | -> geom_boxplot(aes(group = ???(carat, 0.1)))
cut_width
148
tidyverse package for querying databases?
DBI
149
tidyverse package to read in SPSS, Stata, or SAS files?
haven
150
Using geom_point(), add transparency?
alpha = ..
151
readr's identify encoding?
guess_encoding()
152
What are the main differences between data.frame and tibble?
1) printing | 2) subsetting
153
print(flights, 1? = 10, 2? = 3?) 1=argument for number of rows 2=argument for number of columns 3=argument for all columns
1) n 2) width 3) Inf
154
For a density plot (frequency polygon) in ggplot: ggplot(data = diamonds, aes(x = price, y = 1?)) + -> 2?(aes(color = cut), binwidth = 500)
1) ..density.. | 2) geom_freqpoly
155
In R, return the mean of each column in matrix x?
colMeans(x)
156
In R, create a boxplot of age data in x?
boxplot(x)
157
In R, return the sum of columns in matrix m without colSums()?
apply(m, 2, sum)
158
In R, test if x is TRUE?
isTRUE(x)
159
In dplyr, return flights where month equals the last 6 months of the year?
filter(flights, month %in% 7:12)
160
dplyr command to pick variables by name?
select()
161
dplyr command to operate on data group-by-group?
group_by()
162
RStudio shortcut for the assignment operator (<-)?
alt + -
163
dplyr command to pick observations by value?
filter()
164
In R, view list of functions and data in package "spatial"?
library(help = spatial)
165
With dplyr, return flights where month equals 1 and day equals 1?
filter(flights, month == 1, day ==1)
166
dplyr command to create new variables with functions of existing variables?
mutate()
167
What happens to NA values using dplyr's filter()?
filter() keeps only rows where the condition is TRUE; rows that evaluate to FALSE or NA are dropped
168
With dplyr, return year, month, and day from flights?
select(flights, year, month, day)
169
With dplyr, return the flights column time_hour and then all other columns?
select(flights, time_hour, everything())
170
In R, what argument is used for removing borders?
bty = 'n' | border type...
171
In R, what are 2 ways to show overlapping dots in a scatterplot?
1) jitter(x) or jitter(y) or both | 2) sunflowerplot(x, y)
172
In R, what is the default graphics window size?
7 inches by 7 inches
173
In R, what does which(requests %in% stock) return?
The index of items in requests that match an item from stock
174
In R, set the line thickness?
lwd = # | Line width...
175
In R, what does this do? peas[1:length(peas) %% 2 ==0]
Returns the elements of peas at even positions
176
In R, pmin(x, y, z)?
Returns the minimum of x, y, or z across each item in x, y, and z
177
In R, what is the difference between unique() and duplicated()
unique returns just the unique items while duplicated returns a boolean vector identifying duplicates
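A short sketch contrasting the two functions on a made-up character vector (the values are illustrative only):

```r
# Hypothetical vector with repeats
x <- c("a", "b", "a", "c", "b")

# unique() keeps the first occurrence of each value
u <- unique(x)
print(u)  # "a" "b" "c"

# duplicated() flags every repeat after the first occurrence
d <- duplicated(x)
print(d)  # FALSE FALSE TRUE FALSE TRUE

# Dropping the flagged positions gives the same result as unique()
same <- identical(x[!d], u)
```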
178
In R, what does this do? peas[-length(peas)]
Returns peas without its last item
179
In R, what functions are useful in naming rows or columns?
rownames() | colnames() or names()
180
In R, how do I generate 1, 1.5, 2, 2.5, 3?
seq(1, 3, 0.5)
181
In R, what function is helpful in making flat contingency tables?
ftable()
182
In R, DF is data for 2 factor variables with 2 levels each, count up the combinations for dat1 & dat2?
table(dat1, dat2)
183
In R, create a plot of x and y that looks like a line plot with no right border?
plot(x, y, type = "l", bty = "c")
184
In R, how you view complete list of available packages?
library()
185
With dplyr, by_day=group_by(flights, year, month, day): return the average daily dep_delay?
summarize(by_day, mean(dep_delay, na.rm = TRUE))
186
In R, turn off x-, y-labels, and title?
ann = F | annotations...
187
In R, remove all user-defined variables?
rm(list=ls())
188
In R, how can you fine tune top, left, right and bottom axes?
axis(side, ...), where side is 1 = bottom, 2 = left, 3 = top, 4 = right
189
In R, counts
table(counts)
190
In R, add a grid to a plot?
tck = 1 (default is NA, which defers to tcl) | tick marks...
191
In R, sub() vs. gsub()?
sub replaces 1st occurrence of a pattern; gsub replaces all occurrences of a pattern
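A minimal sketch of the difference, using an arbitrary example string:

```r
# Hypothetical input string
s <- "banana"

# sub() replaces only the first match
first_only <- sub("a", "-", s)
print(first_only)  # "b-nana"

# gsub() replaces every match
all_matches <- gsub("a", "-", s)
print(all_matches)  # "b-n-n-"
```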
192
In R, set background color for graphics?
par(bg="grey") background...
193
In R, if is.ts(dat)=true, then what is returned for plot(dat)?
a timeseries graph, which is actually plot.ts(dat)
194
In R, what function is superior to attach() due to environment issues?
with(data, function(...))
195
In R, na._(x) will return x w/o NAs?
na.omit
196
In R, sort DF by Var1, Var2, and then Var3?
DF[order(DF$Var1, DF$Var2, DF$Var3), ]
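A small sketch of multi-column sorting with order(), using a made-up data frame. The comma after the order() call matters: the result indexes rows, and the empty column index keeps every column:

```r
# Hypothetical data frame
DF <- data.frame(
  Var1 = c(2, 1, 2, 1),
  Var2 = c("b", "b", "a", "a"),
  Var3 = c(4, 3, 2, 1)
)

# Sort by Var1, break ties with Var2, then Var3
sorted <- DF[order(DF$Var1, DF$Var2, DF$Var3), ]
print(sorted$Var3)  # 1 3 2 4
```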
197
In R's lattice, create a box and whisker plot of Growth vs. Water and Daphnia given detergent?
bwplot(Growth ~ Water + Daphnia | Detergent)
198
In R, standardize dat's columns 2:3?
scale(dat[, 2:3])
199
In dplyr, create new variables and get rid of all others?
transmute()
200
read_csv("challenge.csv", 1? = 2?(x = col_double(), y = col_date()))
1) col_types | 2) cols
201
Geom for a boxplot?
geom_boxplot()
202
In R, transfer data from Excel to R?
Copy from Excel and readClipboard()
203
With dplyr, return flights columns that end with "es"?
select(flights, ends_with("es"))
204
In R, return p-value when observed chi-square is 14.56 and df = 7?
1 - pchisq(14.56, 7) or | pchisq(14.56, 7, lower.tail = F)
205
In R, how do you prepare to make 4 plots on the same output?
par(mfrow = c(2, 2)) (fills by row) | par(mfcol = c(2, 2)) (fills by column)
206
In R, justify text w/i the text() function?
adj = c(x, y)
207
In R, x
DOTplot(x)
208
In R, when is the argument used to change the plotting symbol color?
When pch = 21:25
209
In R, rows in DF where Var1 is greater than its median and Var2 is TRUE?
DF[DF$Var1 > median(DF$Var1) & DF$Var2 == TRUE, ]
210
In R, x
x[-which(is.na(x))]
211
In R, N
x.bar = c() for (i in 1:N){ -> x x.bar[i] = mean(x) }
212
Whenever you group_by(), what should you include?
counts using n()
213
In R, if data are not normal and a t test is not possible, what is the appropriate test function?
wilcox.test()
214
Apply readr parsing heuristics to the character columns in data frame?
type_convert()
215
tidyverse package to read Excel?
readxl
216
read_csv("file.txt", ? = "#N/A")
na
217
In R, return the names of columns in a data frame?
names(table)
218
Call df$x using pipe?
df %>% .$x
219
In readr, parse_*() vs. col_*()
parse_*() when dealing with character vector
220
In R, x
x[x %% 4 == 0]
221
In R, name matrix m's rows A, B, C, D?
rownames(m) <- c("A", "B", "C", "D")
222
dplyr command to collapse many values down to single summary?
summarize()
223
With dplyr, number of unique items?
n_distinct()
224
read_csv("file.csv", ? = F)
col_names
225
In R, rotate text 45 degrees in a plot?
arg srt = #
226
What are R's 6 types of atomic vectors?
logical: TRUE, FALSE | integer: 1L, 2L, 3L | double (numeric): 2.5, 4.5 | character: "a", "1" | complex and raw, which are both rare
227
In R, how do you view loaded libraries and environments?
search()
228
ggplot(data = mpg, aes(x = displ, y = hwy)) + -> geom_point(data = ?) Only include subcompacts from class variable?
filter(mpg, class == "subcompact")
229
In R, sum x when x is less than 5?
sum(x[x<5])
230
In R, how you save your existing history of commands to "fname"?
savehistory(file = "fname")
231
In R, cut(x, c(0, 2, 4, 6))?
Returns a factor of length(x) with levels (0,2], (2,4], (4,6]; intervals are closed on the right, so (2,4] means 2 < x <= 4
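A quick sketch of how cut() assigns boundary values with its default right-closed intervals. The input vector is made up for illustration:

```r
# Hypothetical values, including points that sit on the break boundaries
x <- c(1, 2, 3, 4, 5.5)

bins <- cut(x, c(0, 2, 4, 6))
print(levels(bins))        # "(0,2]" "(2,4]" "(4,6]"

# 2 falls in (0,2] and 4 falls in (2,4], because the right edge is included
print(as.character(bins))  # "(0,2]" "(0,2]" "(2,4]" "(2,4]" "(4,6]"
```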
232
In R, add an arrow from (1,1) to (3,8)?
arrows(1, 1, 3, 8)
233
In R, return months from dates? POSIXlt
dates$mon
234
In R's plot() or lines() function, what arguments sets line type?
lty
235
In R, return a current date/time?
Sys.time() or date()
236
In R's plot, what argument for setting the scale for y?
ylim = c(0, 100) (example...)
237
In R, return the day of the month for POSIXlt formatted dates?
dates$mday
238
In R, output DF as "table.txt" that includes the names of rows and columns?
write.table(DF, "table.txt", col.names = T, row.names = T)
239
In R, how do you find all objects that match "lm"?
apropos("lm")
240
In R, what function is useful for generating a palette in grey scale?
grey()
241
In R, capitalize all characters in a string?
toupper()
242
In an R plot, how do you add dots from additional data?
points()
243
In R's hist(), set bin edges for count data with range 0:9 and width of 1?
breaks = (-0.5:9.5)
244
In R, what is the current value of the parameter 'family'?
par('family')
245
In R, remove quotes around a string for printing?
noquote()
246
In R, take a bunch of DVs across columns and make it 1 long vector?
stack()
247
Reorder class based on hwy's median?: ggplot(mpg, aes(class, hwy)) + -> geom_boxplot()?
ggplot(mpg) + | -> geom_boxplot(aes(reorder(class, hwy, FUN = median), hwy))
248
In R, given dates of class POSIXlt, return seconds?
dates$sec
249
In R, how you create a plot's key?
legend()
250
In R, return dates day of the year?
dates$yday
251
In R, xv[which(abs(xv - 108) == min(abs(xv - 108)))]?
Returns the element(s) of xv nearest to 108
252
In R, iris[,5] is flower names. Return the index of rows whose names include an "a"?
grep("a", iris[,5])
253
In R, what is 'not', 'and', and 'or' inside and outside an if operation?
not: ! in both cases; and: & element-wise, && for a single value inside if; or: | element-wise, || for a single value inside if
254
In R, set axis notation color and font?
col.axis, font.axis
255
In R, return a vector of the position of a matched pattern in the text where it exists and a -1 otherwise?
regexpr()
256
In R, view all available datasets included in installed packages?
data(package = .packages(all.available = TRUE))
257
In R, calculate the proportion of each item in a table based on the grand total?
prop.table(table) ---- (no margin....)
258
In R, what is a for loop that prints 1 to 5 one at a time?
for (i in 1:5){ | ->print(i)}
259
In R, quickly return a set of common statistics for obj?
summary(obj) or fivenum(obj)
260
In R, return probability that x is <=4 based on a normal distribution where mean = 5 and sd = 0.125?
pnorm(4, mean=5, sd = 0.125)
261
In R, find probability that -1
pt(1.5, 29) - pt(-1, 29)
262
In R, return the sums of rows of M without using rowSums()?
apply(M, 1, sum)
263
In R, what function can add words to a graph based on x and y coordinates?
text()
264
Using R's RColorBrewer package and the Set2 palette, create an 8-color palette?
brewer.pal(8, "Set2")
265
In R, turn a vector of positive and negative numbers into -1s, 0s, and 1?
sign()
266
In R, how do you restore a previously save R file called "Fname"
load(file = "Fname")
267
In R, what function is useful for printing a sentence as output?
paste()
268
In base R, read in "fname.txt", which is a file that has columns separated by whitespace & a header line?
dat
269
In R, if t
t2
270
In base R, return the positions of matched patterns in each string for all strings in S?
gregexpr(pattern, text)
271
In base R, what is a 1st step when doing date calculations?
Convert objects to POSIXlt
272
In R, tapply(temp, month, function(x) sqrt(var(x) / length(x)))?
Returns temp by month after function operation, which is the standard error.
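A small sketch of the same tapply() call on made-up data, to show that the result is one standard error per month. The temperatures and month labels are illustrative only:

```r
# Hypothetical temperatures and their month labels
temp <- c(10, 12, 14, 20, 22, 24)
month <- c("Jan", "Jan", "Jan", "Feb", "Feb", "Feb")

# Standard error of the mean within each month: sqrt(variance / n)
se_by_month <- tapply(temp, month, function(x) sqrt(var(x) / length(x)))
print(se_by_month)
```

Each group here has variance 4 and 3 observations, so both entries equal sqrt(4 / 3).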
273
In R, what happens to a vector of words in a data frame? How do you go back?
- coerced to factor (the default before R 4.0.0) | - as.character(factor)
274
In R, how do you get the modulo of 119/3 and how do you get the integer quotient?
1) 119 %% 3 = modulo (remainder) | 2) 119%/% 3 = integer
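Worked through for the numbers on this card, as a quick check that the two operators fit together (quotient times divisor plus remainder gives back the original number):

```r
remainder <- 119 %% 3   # modulo
quotient <- 119 %/% 3   # integer division

print(remainder)  # 2
print(quotient)   # 39

# Reassemble the original number from the two parts
rebuilt <- quotient * 3 + remainder
```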
275
In R, how do you generate n random numbers from a uniform distribution between 0 and 1?
runif(n)
276
In R, closest integer to x between x and 0?
trunc(x) | floor(x) gives the same result only when x >= 0
277
In R, how do you see an example for the "lm" function?
example(lm)
278
In R, return the length of vector x?
length(x)
279
In R, anti log of x?
exp(x)
280
In R, see help pages for sum() function?
?sum
281
In R, return vector of ranks of values in x?
rank(x)
282
In R, sample variance of vector x?
var(x)
283
In R, how do you combine vector x with vector y?
c(x, y)
284
In R, vector of the product of all values of x up to that point?
cumprod(x)
285
In R, dat
tapply(dat$height, list(dat$gender, dat$race), mean)
286
Describe match(x, y)?
Returns y's index numbers for each item of x that is in y
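A minimal sketch of match() next to the related %in% operator, with made-up vectors. Items of x that are missing from y come back as NA:

```r
# Hypothetical vectors
x <- c("b", "z", "a")
y <- c("a", "b", "c")

# Position in y of each item of x
pos <- match(x, y)
print(pos)  # 2 NA 1

# %in% answers only whether each item of x appears in y
found <- x %in% y
print(found)         # TRUE FALSE TRUE
print(which(found))  # 1 3
```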
287
With dplyr, return flights columns that have a title w/ a repeated character back to back?
select(flights, matches("(.)\\1"))
288
Five ways to subset a tibble?
1) .$name (vector) 2) .[["name"]] (vector) 3) .[[position]] (vector) 4) .["name"] (tibble) 5) .[position] (tibble)
289
In R, set the size of the margin around the plot based on lines of text?
par(mar = c(bottom, left, top, right))
290
In R, sum each column of matrix x?
colSums(x)
291
In R, how do you remove the variable x?
rm(x)
292
In R, x
x >= 5
293
In R, how can you enter values one at a time from input?
scan()
294
In R, how do you view existing variables?
ls() or objects()
295
In R, how do you see a list of built in datasets?
data()
296
In R, smallest integer > x?
ceiling(x)
297
In R, how do I return all of dat's columns from row 4 or all of dat's rows from column 10?
dat[4,] | dat[,10]
298
In R, return vector of the cumulative sum of x?
cumsum(x)
299
In R, round x to nearest integer?
round(x, digits = 0)
300
In R, assign dat to a file I choose, which is a csv with headers?
dat
301
In R, return the name of the day of the week for dates?
weekdays(dates)
302
In R, x
stem(x)
303
In R, how do you create a function that returns multiple variables?
Use return() with a list containing the variables to be returned
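A short sketch of the list-return pattern. The function name summarize_vec and its inputs are hypothetical, chosen only to illustrate the idea:

```r
# Illustrative function that returns two results bundled in a named list
summarize_vec <- function(v) {
  return(list(mean = mean(v), range = range(v)))
}

res <- summarize_vec(c(2, 4, 9))

# Pull the individual results back out by name
print(res$mean)   # 5
print(res$range)  # 2 9
```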
304
In R, prepare to plot 16 graphs, 2 in each row?
par(mfrow = c(8, 2))
305
In R, return a sequence of dates between 10/1/1997 and 10/1/2007, with a date every 3 months?
seq(as.POSIXlt("1997-10-01"), as.POSIXlt("2007-10-01"), | ->"3 months")
306
In R, make a bar graph of the categorical data day with a label "A" on x-axis, a title "Title", and "B" on y-axis?
barplot(day, xlab = "A", ylab = "B", main = "Title")
307
In R, right justify text in a graph and then left justify it?
par(adj = 1) | par(adj = 0)
308
In R, correlation of vector x and y?
cor(x, y)
309
In R, return min of vector X up to each point in vector?
cummin(x)
310
In R, x
which(x<3)
311
In R, return vector from 5 to 25 that increases by 0.25?
seq(5, 25, 0.25)
312
In R, x
x[x<=50]
313
In R, how do you make it wait for Enter before drawing each subsequent graph?
par(ask = TRUE)
314
In R, return sorted version of x?
sort(x)
315
In R, what function provides info about how to cite R software?
citation()
316
In an R plot, how do you add stepped lines that connect points?
lines(x, y, type = "s") | "S" goes up then over, "s" goes over then up
317
In R, return any item in A or B?
union(A, B)
318
In R, return the positions in a vector of a matched pattern?
grep()
319
In R, what are four useful functions for rounding?
round() ceiling() floor() trunc()
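A quick sketch comparing the four functions on one negative and one positive value, since they only disagree in interesting ways when the sign changes. The inputs are arbitrary:

```r
# Hypothetical inputs
x <- c(-2.7, 2.7)

r <- round(x)    # nearest integer:        -3  3
ce <- ceiling(x) # next integer up:        -2  3
fl <- floor(x)   # next integer down:      -3  2
tr <- trunc(x)   # drop the decimal part:  -2  2
```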
320
In R, which.max(x)?
Returns index of the maximum value of x.
321
In R, vector's 3 common properties?
Type - typeof() Length - length() Attributes - attributes()
322
In R, how do you install and load a package?
install.packages("package") | library(package)
323
In R, is any value greater than 0 in X? Are all values greater than 0 in X?
any(X>0) all(X>0)
324
With R's RColorBrewer package, create an 11-color palette (the maximum) with the "Spectral" colors?
brewer.pal(11, "Spectral")
325
In R's plot(), do not include any axis?
axes = FALSE
326
In R, read in a csv file saved as "fname.csv"?
read.csv("fname.csv")
327
In R, suppress the creation of the y-axis?
yaxt = "n"
328
In R, given dates of class POSIXct, return the minutes object?
as.POSIXlt(dates)$min
329
In R, view the components of a list?
unlist(list)
330
In R, sort dataframe DF by the variables CARS?
DF[order(DF$CARS), ]
331
In R's plot(), argument for the label on the x-axis?
xlab = "label"
332
In R: | union(A, B) vs intersect(A, B) vs setdiff(A,B)
union returns all items from A and B; intersect returns items that are in both A and B; setdiff returns items in A that are not in B
333
In R, prepare to overlay existing plot with another plot?
par(new = TRUE)
334
In R, test whether 2 items/sets are equal?
setequal(a, b)
335
In R, set the font to serif for plotted text?
par(family = 'serif')
336
In base R, return items that match a pattern?
grep(pattern, vector, value = T)
337
In ggplot, add labels?
labs()
338
In R, create a vector of A, B, C that each repeat 4 times?
gl(3, 4, labels = LETTERS[1:3])
339
In R, what is coplot()?
coplot(y~x|z) returns multiple scatter plots y vs x at various ranges of z.
340
In R plot, set title color and font?
col.main = | font.main =
341
In R's plot what function is useful for drawing the area under the curve?
polygon()
342
In R's plot, plot labels L using X and Y, centered on X, placed half a character below original points?
text(X, Y, labels = L, pos = 1, offset = 0.5) | pos = 1 places the labels below the points
343
In base R, how do you join strings?
paste()
344
In R's lattice, draw a histogram for minTemp given month?
histogram(~minTemp | month) month must be a factor...
345
In R, how can you point and click on the location you want a legend?
locator(1) as position argument
346
In R's graphs change the box line color?
fg =
347
In R, what is the probability of A given B within sample space S?
(length(intersect(A, B)) / length(S)) / | (length(B) / length(S))
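A worked sketch of the conditional probability with a made-up sample space (one roll of a die). The events A and B are illustrative, and the denominator is wrapped in its own parentheses so the division happens in the intended order:

```r
# Hypothetical sample space and events
S <- 1:6          # one roll of a die
A <- c(2, 4, 6)   # even number
B <- c(4, 5, 6)   # greater than 3

p_A_and_B <- length(intersect(A, B)) / length(S)  # 2/6
p_B <- length(B) / length(S)                      # 3/6

# P(A | B) = P(A and B) / P(B)
p_A_given_B <- p_A_and_B / p_B
print(p_A_given_B)
```

With these events the result is 2/3: of the three outcomes in B, two are even.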
348
In R, what function is useful for creating file paths?
file.path()
349
In ggplot, describe the 7 parameters for making any plot using a generic example?
ggplot(data = DATA) + | GEOM_FUNCTION(aes(MAP), stat = STAT, position = POSITION) + | COORDINATE_FUNCTION + | FACET_FUNCTION | Parameters: data, geom, mapping, stat, position, coordinate, facet
350
In R, return the hour object for right now?
as.POSIXlt(Sys.time())$hour
351
In R, sort dataframe DF by Var1 in reverse order?
DF[rev(order(DF$Var1)),]
352
In base R, what function is useful for counting letters in a string?
nchar()
353
In R plot, set subtitle color and font?
col.sub = | font.sub =
354
In R, return dates day of week?
dates$wday
355
In R: what function is for applying functions to rows/columns of matrices or data frames?
apply()
356
In R: what function is for applying functions to vectors?
sapply()
357
In R: what function is for applying functions to lists?
lapply()
358
In R: what function is for applying functions to a DF?
tapply()
359
In R's plot(), set plot char size?
argument cex =
360
In base R, extract from STRING the characters from M to N?
substr(STRING, M, N)
361
In R, dates
strptime(dates, "%d/%m/%Y")
362
In R, reverse sort DF by factor Var1 and normal sort it by Var2?
DF[order(-rank(DF$Var1), DF$Var2), ]
363
With tidyr, merge table5's century and year columns to make new_year column?
unite(table5, new_year, century, year)
364
In base R, join Dat1's Var1 and Var2 with Dat2's Name1 and Name2, including incomplete cases?
merge(Dat1, Dat2, by.x = c("Var1", "Var2"), by.y = c("Name1", "Name2"), all = TRUE)
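By default merge() keeps only rows whose keys match in both tables; all = TRUE is what keeps the incomplete cases. A sketch with two made-up tables (column contents are assumptions for illustration):

```r
# Hypothetical tables for illustration
Dat1 <- data.frame(Var1 = c("a", "b"), Var2 = c(1, 2), x = c(10, 20))
Dat2 <- data.frame(Name1 = c("a", "c"), Name2 = c(1, 3), y = c(100, 300))

# Default: only rows whose keys match in both tables
inner <- merge(Dat1, Dat2,
               by.x = c("Var1", "Var2"), by.y = c("Name1", "Name2"))

# all = TRUE: also keep unmatched rows from both tables, padded with NA
outer <- merge(Dat1, Dat2,
               by.x = c("Var1", "Var2"), by.y = c("Name1", "Name2"),
               all = TRUE)

print(inner)
print(outer)
```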
365
tidyr::unite's default sep?
_
366
tidyr function to replace missing values with last observation?
fill()
367
Make tidy with tidyr:
```
TABLEA
country -> type  -> count
x       -> cases -> #
y       -> cases -> #
z       -> cases -> #
x       -> pop   -> #
y       -> pop   -> #
z       -> pop   -> #
```
TABLEA %>% spread(key = type, value = count)
368
With tidyr, 2 ways to set separate()'s sep parameter?
1: regular expression | 2: position (positive # counts from the far left, negative # counts from the far right)
369
With tidyr, combine multiple columns into a single column?
unite()
370
Default sep in tidyr's separate function?
any non-alphanumeric character
371
tidyr verb to deal with observations scattered across rows
spread()
372
In R, how do you adjust the plotting region?
par(plt = c(X1, X2, Y1, Y2)) | coordinates of the plot region as fractions of the figure region
373
In R, how do you set the fill color for boxplots, histograms, etc?
col =
374
What tidyr verb to turn a variable spread across columns into a single column?
gather()
375
With tidyr, split a 'rate' column (from dat), x/y, into 2 columns?
separate(dat, rate, into = c("x", "y"))
376
tidyr's function for making implicit missing values explicit?
complete()
377
Stocks has year, quarter, and return, use tidyr to check for missing values?
stocks %>% complete(year, quarter)
378
With tidyr's separate, gather, and spread functions, how do you re-evaluate column types?
convert = TRUE
379
In R's plot(), set orientation of #s on tick marks?
the argument las =
380
x
str_sub(x, 1, 3)
381
3 ways to use 'by' in join operations from dplyr?
1) default, by = NULL, uses all variables that appear in both tables 2) character vector, by = 'varname', uses the named variable from both tables 3) named character vector, by = c('a' = 'b'), matches "a" from x to "b" from y
382
astr
"ater"
383
In R, return the proportion of items in each group organized and computed by column, using the matrix dat?
prop.table(dat, 2)
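A quick sketch on a made-up 2 x 2 matrix (the values are assumptions for illustration). With margin 2, each column is divided by its own total, so every column sums to 1:

```r
# Hypothetical count matrix for illustration
dat <- matrix(c(1, 3, 2, 2), nrow = 2,
              dimnames = list(c("r1", "r2"), c("c1", "c2")))

col_props <- prop.table(dat, 2)   # margin 2 = proportions within each column

print(col_props)
print(colSums(col_props))
```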
384
In R, how do you load the environment history saved in "fname"?
loadhistory(file = "fname")
385
dplyr version of: merge(x, y, all.x = TRUE)
left_join(x, y)
386
In R, create Y that is a sorted version of x?
Y <- sort(x)
387
In R, how do you save existing environment objects to "fname"?
save.image(file='fname')
388
In R, view history?
history(Inf)
389
With stringr, return the start and end of first match in x?
str_locate()
390
What is the explicit way to str_view(fruit, 'nana')?
str_view(fruit, regex('nana'))
391
With stringr, return boolean vector for string matches?
str_detect()
392
Two stringr functions to test regexp?
str_view() | str_view_all()
393
With str_extract_all(), return a matrix result?
simplify = TRUE
394
Using dplyr and stringr: return the rows of df where words matches the regex "x$" (ends in x)?
df %>% filter(str_detect(words, "x$"))
395
In R, given FACTOR and Y, create a plot that shows the value of Y for each case with FACTOR?
stripchart(Y~FACTOR)
396
Merge with dplyr: flights %>% -> _____(airlines, by = 'carrier')
left_join
397
str_sub("Apple", -3, -1)
"ple"
398
In R, how do you save the object X to "fname"?
save(X, file = "fname")
399
To speed up stringr functions for simple searches, what do you replace regex() with?
fixed()
400
What are dplyr's filtering joins?
semi_join(x, y): keeps all in x that have match in y | anti_join(x, y): drops all in x that have match in y
401
str_c("p", c('a', 'b', 'c'), 's')?
'pas' 'pbs' 'pcs'
402
x %>% full_join(y)?
Keeps all observations of x and y
403
In R v
matrix(R, nrow=3, byrow=TRUE)
404
stringr's version of nchar()?
str_length()
405
with stringr, combine strings with no space?
str_c() | The default sep = ""
406
x %>% right_join(y)?
Keep all of y's observations
407
With stringr, what types are possible with boundary()?
"character" | "line_break" | "sentence" | "word"
408
With stringr, identify the number of string matches?
str_count()
409
dplyr version of: merge(x, y, all.y=TRUE)
right_join(x, y)
410
x %>% left_join(y)?
return all observations of x
411
ggplot(dat, aes(x)) + geom_bar() forcats: Add a line to prevent dropping levels of x that have no values.
... + scale_x_discrete(drop = FALSE)
412
ggplot(relig, aes(tvhours, relig)) + geom_point() forcats: Rewrite this to put relig in order of tvhours?
ggplot(relig, aes(tvhours, fct_reorder(relig, tvhours))) + | -> geom_point()
413
Using forcats, reorder FACTOR so that "Not Applicable" is the first category?
fct_relevel(FACTOR, "Not Applicable")
414
forcats function to make legend colors match order of plotted objects?
fct_reorder2()
415
gss_cat %>%
  mutate(marital = marital ...???) %>%
  ggplot(aes(marital)) +
  geom_bar()
Add forcats functions to get marital in order of increasing frequency on the plot.
gss_cat %>%
  mutate(marital = marital %>% fct_infreq() %>% fct_rev()) %>%
  ggplot(aes(marital)) +
  geom_bar()
416
forcats function to adjust a factor's levels?
fct_recode()
417
forcats function to adjust a factor's levels, while reducing the number of levels as well because you can pass a vector of levels for each new level?
fct_collapse()
418
forcats function to aggregate smaller factor levels into an "Other" category?
fct_lump()
419
lubridate function to get current date?
today()
420
lubridate function to get current date-time?
now()
421
lubridate function to create date from "2011-01-15"
ymd()
422
lubridate function to create date from "Jun 15 2011"
mdy()
423
lubridate function to create date from "15 April 2009"
dmy()
424
lubridate function to create date-time from "2011-01-15 20:11:19"
ymd_hms()
425
lubridate function to create date from month, day, and year spread across columns?
make_date(year, month, day)
426
lubridate function to create date-time from month, day, year, hour, min, second spread across columns?
make_datetime(year, month, day, hour, minute, second)
427
lubridate function to convert date to datetime?
as_datetime()
428
lubridate function to convert datetime to date?
as_date()
429
lubridate function to extract year from dt
year(dt)
430
lubridate function to extract the full month name from dt
month(dt, label = T, abbr = F)
431
lubridate function to extract the day of the month from dt
mday(dt)
432
lubridate function to extract day of the year from dt
yday(dt)
433
lubridate function to extract full day of the week name from dt
wday(dt, label = T, abbr = F)
434
lubridate function to extract hour from dt
hour(dt)
435
lubridate function to extract minute from dt
minute(dt)
436
lubridate function to extract second from dt
second(dt)
437
lubridate function to round dt down to the start of its week?
floor_date(dt, "week")
438
lubridate function to round dt up to the next month boundary?
ceiling_date(dt, "month")
439
dt
year(dt)
440
dt
update(dt, year = 2010, mday = 19)
441
my_age
as.duration(my_age)
442
my_age
my_age + dyears(2) + dweeks(7) + ddays(3)
443
my_age
my_age + years(2) + weeks(7) + days(3)
444
What is the difference between lubridate's durations and periods?
Durations use seconds and are exact, but can do unexpected things around daylight saving time. Periods work with "human" times and aren't exact, but do what you would expect around daylight saving time (for example).
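A sketch of the difference, assuming lubridate is installed and using a date-time chosen for illustration: 1 pm in New York on the day before the 2016 spring clock change.

```r
library(lubridate)

# The day before US clocks moved forward in 2016
one_pm <- ymd_hms("2016-03-12 13:00:00", tz = "America/New_York")

# Duration: exactly 86400 seconds later, so the clock time shifts to 14:00
plus_duration <- one_pm + ddays(1)

# Period: one calendar day later, so the clock time stays at 13:00
plus_period <- one_pm + days(1)

print(plus_duration)
print(plus_period)
```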