R Flashcards

1
Q

mode()

A

Identifies the type (storage mode) of the object in the brackets,
i.e. whether it is character or numeric.

2
Q

What is a vector?

A

An ordered sequence of values that all share one type, e.g. all numbers or all character strings.

3
Q

c()

A

Combines the values given as arguments into a vector.

4
Q

vectorname[n]

A

gives the nth value in the named vector

5
Q

vectorname[n1 : n2]

A

Gives the n1th to n2th values in the vector, inclusive.

6
Q

vectorname[-n]

A

Gives the whole of the named vector, without the nth value.
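
The indexing cards above can be tried on a small vector. This is a minimal sketch; the vector v and its values are made up for illustration.

```r
# A made-up numeric vector built with c()
v <- c(10, 20, 30, 40, 50)

mode(v)             # "numeric"
mode(c("a", "b"))   # "character"

v[2]     # 20            (the 2nd value)
v[2:4]   # 20 30 40      (2nd to 4th values, inclusive)
v[-2]    # 10 30 40 50   (everything except the 2nd value)
```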

7
Q

sum()

A

Gives sum of vector

8
Q

mean()

A

Gives mean of vector

9
Q

max()

A

Gives largest value of vector

10
Q

median()

A

Gives median of vector

11
Q

var()

A

Gives variance of vector

12
Q

sd()

A

Gives standard deviation of the vector
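
A short sketch of the summary functions on one made-up vector x. Note that var() and sd() in R use the sample formula (dividing by n - 1), which is why sd(x) here is not exactly 2.

```r
# Made-up data for illustration
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

sum(x)      # 40
mean(x)     # 5
max(x)      # 9
median(x)   # 4.5
var(x)      # 32/7, about 4.57 (sample variance, divides by n - 1)
sd(x)       # square root of var(x), about 2.14
```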

13
Q

name = function(x) actualfunction(x)

A

Defines name as a function of x that applies another function to its argument.
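
As an illustration, a one-line function can wrap existing functions. The name std_error and the data below are hypothetical, chosen only to show the pattern.

```r
# Hypothetical helper: standard error of the mean of a vector x
std_error = function(x) sd(x) / sqrt(length(x))

x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # made-up data
std_error(x)                     # same as sd(x) / sqrt(8)
```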

14
Q

sqrt()

A

Gives the square root of a value.

15
Q

mad()

A

Gives the median absolute deviation (comparable to the standard deviation)

16
Q

shapiro.test()

A

Applies the Shapiro-Wilk test to the data, comparing it to a normal distribution.
A p-value lower than 0.05 lets us reject the null hypothesis that the data are normally distributed.

17
Q

IQR()

A

Gives the interquartile range of the data

18
Q

summary()

A

Gives the minimum, maximum, median and mean as well as the 1st and 3rd quartiles.

19
Q

barplot()

A

Gives a bar graph of a vector.

20
Q

table()

A

Gives a frequency table of a vector.

21
Q

length()

A

Gives the length of a vector.

22
Q

labels = as.vector(c(list of the names for the bars))
barplot((data in graph), names.arg = labels, xlab = name of x axis, ylab = name of y axis)

A

Labels the bars in the graph after the items in the labels vector.
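
A runnable version of the labelled bar graph might look like the sketch below. The counts, species names and axis titles are all made up for illustration.

```r
# Made-up heights for three bars and a matching vector of bar names
counts <- c(3, 5, 2)
labels <- as.vector(c("oak", "ash", "elm"))

# names.arg puts one label under each bar, in order
barplot(counts, names.arg = labels, xlab = "Species", ylab = "Count")

# table() and length() on a made-up character vector
species <- c("oak", "ash", "oak", "elm", "oak")
table(species)    # frequency of each distinct value
length(species)   # 5
```

The labels vector needs one entry per bar, so length(labels) should equal length(counts).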

23
Q

hist()

A

Gives histogram.

24
Q

hist(GC, breaks = 50)

A

Allows us to choose roughly how many bars are in our graph; R treats a single number given for breaks as a suggestion.

25
hist(GC, breaks = 50, col='green', xlab="GC content", ylab = "absolute Frequency", main = "main title", cex.main=2)
Gives a histogram with roughly the given number of bars, a fill colour, axis labels and a title. cex.main is the size of the title text.
26
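dataset = read.table("filename.txt", header = TRUE)
Reads the file and saves the data in dataset. header = TRUE treats the first line as column names; use header = FALSE (h = F also works as an abbreviation) when the file has no header line.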
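
The effect of the header argument can be checked without an existing data file. This sketch writes a tiny made-up table to a temporary file first; the column names and values are assumptions for illustration only.

```r
# Write a small made-up table to a temporary file
path <- tempfile(fileext = ".txt")
writeLines(c("GC reptime", "0.41 12", "0.55 9"), path)

# header = TRUE: the first line becomes the column names
dataset <- read.table(path, header = TRUE)
names(dataset)   # "GC" "reptime"
nrow(dataset)    # 2

# header = FALSE: the first line is read as an ordinary row of data
no_header <- read.table(path, header = FALSE)
nrow(no_header)  # 3
```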
27
attach(dataset)
Attaches the dataset so that its variables (columns) can be referred to directly by name.
28
stem()
Gives a stem and leaf plot of data
29
plot(a, b)
Gives a scatter graph comparing the two data sets.
30
plot(GC, reptime, main="Title", xlab="GC content", ylab="Replication time", pch=20, col="red")
Gives a scatter graph with a title and axis labels, where pch controls the shape of the points (20 is a small filled circle) and col sets their colour.
31
data1 <- read.csv("plant_data.csv", header = TRUE)
Reads data from csv files
32
boxplot(data1$height~data1$temp)
Gives a box and whisker plot.
33
boxplot(data1$height~data1$temp, xlab=expression("Temperature, "^o* "C"), ylab= "Height, cm", col = "lightseagreen", notch = T, las = 1)
Adds a superscript o (degree sign) to the x axis label. notch = T draws notches that show roughly a 95% confidence interval around each median. las = 1 makes the axis tick labels horizontal.
34
binom.test(no. successes, no.attempts)
Runs an exact binomial test. The output gives the estimated probability of success, its confidence bounds and the hypotheses tested.
35
binom.test(no. successes, no.attempts, p of Ho)
Refines the binomial test by adding the p expected under the null hypothesis (the default is 0.5).
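
A minimal sketch of both binomial calls, assuming a made-up result of 60 successes in 100 attempts. The components used below (estimate, null.value, conf.int, p.value) are parts of the object that binom.test() returns.

```r
# Made-up example: 60 successes out of 100 attempts
default_test <- binom.test(60, 100)            # null hypothesis p = 0.5
custom_test  <- binom.test(60, 100, p = 0.7)   # null hypothesis p = 0.7

default_test$estimate     # observed proportion of successes, 0.6
default_test$null.value   # 0.5
custom_test$null.value    # 0.7
default_test$conf.int     # confidence bounds for the true proportion
default_test$p.value      # p-value against the null hypothesis
```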
36
#
Marks the rest of the line as a comment, which R ignores.
37
sample(c("heads", "tails"), 1)
Gives one of the values in the vector, chosen at random.
38
sample(c("heads", "tails"), 10, replace = TRUE)
Allows the “coin” to be flipped multiple times (here 10) without using up the values in the vector.
39
sum(flips=="heads")
Counts the values within flips that are the same as the provided string (each match counts as 1).
40
head_count = function(k){
  flips = sample(c("heads", "tails"), k, replace = TRUE)
  sum(flips=="heads")
}
Defines a function which takes a value k, flips the coin k times and returns the number of heads.
41
{}
Groups several expressions into one block, which allows for code across multiple lines (e.g. a function body).
42
heads = replicate(100, head_count(10))
Runs the given function call that number of times (here 100) and collects the results in a vector.
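
Putting the coin-flip cards together gives a small simulation. This is a sketch; the call to set.seed() is an addition that is not on the cards, used here only so that repeated runs give the same flips.

```r
set.seed(1)   # assumption: fixed seed so the example is reproducible

# Flip a coin k times and return the number of heads
head_count = function(k){
  flips = sample(c("heads", "tails"), k, replace = TRUE)
  sum(flips == "heads")
}

# Repeat the 10-flip experiment 100 times and collect the head counts
heads = replicate(100, head_count(10))

length(heads)   # 100, one count per experiment
table(heads)    # how often each number of heads occurred
```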
43
chisq.test(c(55, 45))
Runs chi squared test when one result is achieved 55 times and the other is achieved 45 times
44
chisq.test(c(120, 480), p=c(1/6, 5/6))
Chi squared test where we provide probabilities for each outcome. As many outcomes can be listed as you like.
45
chi = chisq.test(cdata)
Saves the chi squared test of data as its own variable.
46
chi$expected
Gives the expected counts under the null hypothesis.
47
chi$observed
Gives the actual (observed) distribution of data.
48
sum(((chi$observed - chi$expected)^2)/chi$expected)
Equation for chi squared.
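
The hand calculation can be checked against the statistic that chisq.test() reports. A minimal sketch using the 55/45 counts from the earlier card:

```r
# Goodness-of-fit test on the 55/45 counts (equal probabilities by default)
chi <- chisq.test(c(55, 45))

chi$observed   # 55 45
chi$expected   # 50 50

# Chi-squared by hand: (5^2)/50 + (5^2)/50 = 1
by_hand <- sum(((chi$observed - chi$expected)^2) / chi$expected)
by_hand         # 1
chi$statistic   # X-squared reported by the test, also 1
```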
49
variable = scan()
Fills a variable from the keyboard: type each value followed by a return, then press return on an empty line to end it.
50
t.test(iq, mu=100, alternative="g")
Does a one-sample t-test where mu is the mean under the null hypothesis. "g" (greater) is the alternative hypothesis that the true mean is greater than mu; "l" (less) would be that it is lower.
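
A sketch of the one-sided test on made-up scores. The iq values below are hypothetical and only illustrate the call.

```r
# Hypothetical IQ scores for a small sample
iq <- c(102, 110, 98, 115, 107, 121, 95, 109)

# One-sided test: is the true mean greater than 100?
res <- t.test(iq, mu = 100, alternative = "greater")

res$estimate      # sample mean
res$alternative   # "greater"
res$p.value       # one-sided p-value

# "g" is accepted as an abbreviation of "greater"
t.test(iq, mu = 100, alternative = "g")$p.value
```

Because the sample mean here is above 100, the one-sided p-value is half of the two-sided one from t.test(iq, mu = 100).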
51
var.test(height$female, height$male)
Runs an F test comparing the variances of two groups of data, to see if a t-test that assumes equal variances can be used.
52
datafile = "http://personality-project.org/r/datasets/R.appendix1.text"
data.ex1 = read.table(datafile, header = TRUE)
Reads in data from online source, saves it to data.ex1
53
aov(Alertness~Dosage)
Runs ANOVA on the data called. The variable before the ~ is the dependent variable, and the one after is the independent variable.
54
anova1 = aov(Alertness~Dosage)
summary(anova1)
Creates a summary of the ANOVA. The p-value is given in Pr(>F); if it is below 0.05 we can reject the null hypothesis that there is no difference between the groups.
55
TukeyHSD(anova1)
Runs Tukey's HSD test on the fitted ANOVA. Shows adjusted p-values for each pairwise comparison of groups.
56
plot(TukeyHSD(anova1))
Plots graph of differences of means in data for the different groups compared.
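
The ANOVA cards can be run end to end on a small made-up data set. The Alertness values and the three Dosage levels below are hypothetical; Dosage is stored as a factor, which TukeyHSD() requires.

```r
# Hypothetical data: 4 alertness scores at each of 3 dosage levels
Dosage    <- factor(rep(c("a", "b", "c"), each = 4))
Alertness <- c(30, 38, 35, 41, 27, 24, 32, 26, 17, 21, 20, 19)

anova1 = aov(Alertness ~ Dosage)
summary(anova1)          # overall p-value is in the Pr(>F) column

tukey <- TukeyHSD(anova1)
tukey$Dosage             # one row per pair of dosage levels
```

With three groups there are three pairwise comparisons, each with a difference in means, a confidence interval and an adjusted p-value.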
57
cor.test(Relaxed, Hyperventilated)
Runs Pearson's correlation, gives correlation coefficient from -1 to +1
58
cor.test(xaxis, yaxis, method="spearm")
Runs the correlation test using Spearman's rho ("spearm" is accepted as an abbreviation of "spearman").
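
A sketch contrasting the two methods on made-up data where y always rises with x but not in a straight line. Spearman's rho works on ranks, so it is exactly 1 here, while Pearson's coefficient is below 1.

```r
# Made-up data: y increases with x, but not linearly
x <- c(1, 2, 3, 4, 5, 6)
y <- x^3

pearson  <- cor.test(x, y)                        # Pearson is the default
spearman <- cor.test(x, y, method = "spearman")   # rank-based

pearson$estimate    # high, but less than 1
spearman$estimate   # 1, because the ranks agree perfectly
```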
59
cricketmodel = lm(freq~temp)
Fits a linear regression of freq on temp (freq as a function of temp) and saves the model.
60
(cor(freq, temp))^2
Gives the multiple r-squared value - squared correlation coefficient.
61
abline(cricketmodel)
Adds line of best fit to plot.
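
The regression cards fit together as in the sketch below. The temp and freq values are made up for illustration; only the function calls come from the cards.

```r
# Hypothetical data: chirp frequency at different temperatures
temp <- c(15, 18, 20, 22, 25, 28)
freq <- c(12.1, 14.0, 15.2, 16.1, 17.9, 19.8)

cricketmodel = lm(freq ~ temp)
coef(cricketmodel)                 # intercept and slope

# Multiple R-squared from the model equals the squared correlation
summary(cricketmodel)$r.squared
(cor(freq, temp))^2

plot(temp, freq)       # scatter graph of the data
abline(cricketmodel)   # adds the fitted line to that plot
```

abline() draws on the most recent plot, so plot() has to be called first.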
62
count2 = na.omit(count)
Removes the rows (or values) containing NA from count and saves the result as count2.
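
A minimal sketch of what na.omit() does, using a made-up data frame: it drops every row that contains an NA. Counting the NAs themselves is a different operation, shown with sum(is.na()).

```r
# Made-up data frame with two missing values
count <- data.frame(Area = c(10, NA, 30), Population = c(100, 200, NA))

count2 = na.omit(count)   # keeps only the complete rows
nrow(count)               # 3
nrow(count2)              # 1

sum(is.na(count))         # 2, the number of NAs in the original data
```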
63
plot(log(count2$Area), log(count2$Population))
Plots the logs of data
64
plot(log(count2$Area), log(count2$Population), xlab = expression ('ln (Area, km^2)'), ylab = "ln(Population)", col = "red", las = 1)
Plots a scatter graph of the logged data with labelled axes and red points.
65
kruskal.test(list(leach, stimpson))
Runs the Kruskal-Wallis test on the data. A p-value of less than 0.05 means we can reject the null hypothesis that all samples are drawn from the same population. More samples can be added to the list.
66
kruskal.test(allpay, bank)
Runs the Kruskal-Wallis test with allpay as the data and bank as the grouping variable. Can use a comma or ~ (kruskal.test(allpay ~ bank)).
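
The three ways of calling kruskal.test() give the same result, which can be checked on made-up data. The allpay values and bank labels below are hypothetical.

```r
# Hypothetical data: 4 payments from each of 3 banks
allpay <- c(12, 15, 11, 18, 22, 25, 21, 30, 31, 29, 35, 33)
bank   <- factor(rep(c("A", "B", "C"), each = 4))

k_comma   <- kruskal.test(allpay, bank)           # data, then groups
k_formula <- kruskal.test(allpay ~ bank)          # formula form
k_list    <- kruskal.test(split(allpay, bank))    # list of samples

k_comma$statistic     # Kruskal-Wallis chi-squared
k_formula$statistic   # same value
k_list$statistic      # same value
```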
67
library(dunn.test)
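Loads the installed dunn.test package so that its dunn.test() function can be used. The package has to be installed first, e.g. with install.packages("dunn.test").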
68
dunn.test(allpay, bank, kw = TRUE, method = "bonferroni")
Runs Dunn's test comparing allpay between the groups in bank, using the Bonferroni method to adjust the p-values. kw = TRUE means that the Kruskal-Wallis result is reported as well.
69
help(p.adjust)
Calls up the R help page associated with that function.
70
unstacked.reptime = unstack(dataset[,c(4, 1)])
Unstacks data from the chosen columns and saves it to a new variable.
71
wilcox.test(first, second, paired = TRUE, exact = FALSE)
Runs the paired Wilcoxon test. exact = FALSE is used when an exact p-value cannot be calculated, e.g. when there is a lot of data or there are ties.
72
all = c(first, second)
Combines two vectors into one longer vector.
73
friedman.test(all.leaks, allsuits, pilots)
runs friedman rank sum test. Here finds whether there is a difference between the leakage of at least one suit compared to another.
74
friedman.test(all.leaks ~ allsuits | pilots)
Runs the Friedman test in formula form, where allsuits is the group and pilots is the block.
75
pairwise.wilcox.test(all.leaks, allsuits, p.adjust.method = "bonferroni", paired = TRUE, exact = TRUE)
Post hoc test to do after a significant Friedman result: paired Wilcoxon tests between each pair of suits, with Bonferroni-adjusted p-values.
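
A sketch of the two equivalent Friedman calls on made-up data, with one leak measurement for every pilot and suit combination. The values are hypothetical; in both forms the suits are the groups and the pilots are the blocks.

```r
# Hypothetical leak measurements: 4 pilots, each wearing suits A, B and C
all.leaks <- c(1.2, 2.5, 3.1,
               1.0, 2.2, 3.5,
               1.5, 2.0, 2.9,
               0.9, 2.7, 3.3)
allsuits  <- factor(rep(c("A", "B", "C"), times = 4))
pilots    <- factor(rep(c("p1", "p2", "p3", "p4"), each = 3))

f_comma   <- friedman.test(all.leaks, allsuits, pilots)     # y, groups, blocks
f_formula <- friedman.test(all.leaks ~ allsuits | pilots)   # y ~ groups | blocks

f_comma$statistic     # Friedman chi-squared, 8 for this data
f_formula$statistic   # same value
```

Every pilot ranks the suits in the same order here (A lowest, C highest), which gives the largest statistic possible for 4 blocks and 3 groups.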