R coding Flashcards

(97 cards)

1
Q

What is mtcars?

A

It is a data set built into R, with fuel consumption and other measurements for 32 cars.

2
Q

ls(mtcars)

A

Lists the column names in mtcars, in alphabetical order.

3
Q

What does $ mean in mtcars$mpg?

A

$ extracts a named element (such as a column) from a data frame or list. Here it extracts the mpg column from mtcars.

4
Q

mean()

A

finds the mean

5
Q

median()

A

finds the median

6
Q

What does sort(mtcars$mpg, decreasing = TRUE) do?

A

Sorts the mpg values in the mtcars dataset from highest to lowest.

7
Q

What does sort(mtcars$mpg, decreasing = FALSE) do?

A

It sorts the mpg values from lowest to highest. Since decreasing = FALSE is the default, you can write sort(mtcars$mpg) instead.
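
As a small sketch of the two sort directions, the snippet below sorts mtcars$mpg both ways and checks that reversing the ascending result gives the descending one. The variable names mpg_ascending and mpg_descending are made up for illustration.

```r
# mtcars ships with base R (datasets package), so nothing needs to be loaded.
mpg_ascending  <- sort(mtcars$mpg)                     # decreasing = FALSE is the default
mpg_descending <- sort(mtcars$mpg, decreasing = TRUE)  # highest value first

head(mpg_ascending, 3)    # the three lowest mpg values
head(mpg_descending, 3)   # the three highest mpg values

# Reversing the ascending sort gives the descending sort.
identical(rev(mpg_ascending), mpg_descending)
```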

8
Q

range()

A

Returns the lowest and highest values in the data

9
Q

How do I create a dataset (a vector) and display it?

A

goals <- c(0, 0, 2, 3, 0, 1)
goals

output:
[1] 0 0 2 3 0 1

10
Q

sort()

A

Returns the values sorted in ascending order (the default)

11
Q

rev()

A

displays in reversed order

12
Q

table(x)

A

Counts how often each distinct value occurs in x (a frequency table)

13
Q

table(x,y)

A

Cross-tabulates x and y: counts how often each combination of x and y values occurs
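
A minimal sketch of both forms of table(), using the cyl and gear columns of mtcars as example inputs (the choice of columns is an assumption made for illustration):

```r
# One-way frequency table: how many cars have 4, 6 or 8 cylinders.
cyl_counts <- table(mtcars$cyl)
cyl_counts

# Two-way table: rows are cylinder counts, columns are gear counts.
cyl_by_gear <- table(mtcars$cyl, mtcars$gear)
cyl_by_gear

# Every car is counted exactly once, so both tables total the number of rows.
sum(cyl_counts) == nrow(mtcars)
sum(cyl_by_gear) == nrow(mtcars)
```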

14
Q

max()

A

shows max value in dataset

15
Q

min()

A

shows min value in dataset

16
Q

sum()

A

Returns the sum (total) of all the values in the dataset

17
Q

Give a manual way of calculating the mean

A

Add the values and divide by how many there are: (2 + 3 + 4 + 5 + 6) / 5 = 4

18
Q

diff(range())

A

Returns the size of the range: the maximum minus the minimum

19
Q

how to create a function?

A

name <- function(x) {
  # code that uses x goes here
}

20
Q

how can i run the name function?

A

name(10)
name(dataset)
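
As a concrete sketch of defining and then calling a function, the example below wraps diff(range()) in a function. The name spread is hypothetical and chosen only for illustration.

```r
# spread() is a made-up name; it returns max(x) - min(x).
spread <- function(x) {
  diff(range(x))
}

goals <- c(0, 0, 2, 3, 0, 1)

spread(goals)        # called with a small vector: 3 - 0 = 3
spread(mtcars$mpg)   # called with a column of a built-in dataset
```

The last expression in the function body is what the function returns, so no explicit return() is needed here.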

21
Q

hist(dataset)

A

displays the dataset by using a histogram

22
Q

inflation_clean$period

A

shows the entire period column from the inflation_clean dataset

23
Q

inflation_clean$period[5]

A

Returns the 5th element (row 5) of the period column

24
Q

inflation_clean$mpg[1:33]

A

Returns the first 33 elements (rows 1 to 33) of the mpg column

25
plot()
command to make a graph
26
plot(inflation_clean$period[1:33], inflation_clean$inflation[1:33])
The first argument, inflation_clean$period[1:33], gives the values for the x-axis, e.g. the years 1989-2021. The second argument, inflation_clean$inflation[1:33], gives the values for the y-axis, e.g. the inflation figure for each year. The graph shows how inflation changed every year from 1989 to 2021.
27
How to calculate compound interest manually? With the example: you start with £1000 and earn 3% yearly
After 1 year: 1000 * 1.03 = 1030
After 2 years: 1000 * 1.03^2 = 1060.9
28
How to calculate compound interest using variables?
interest_rate <- 0.03
deposit <- 1000
years <- 10
savings <- deposit * (1 + interest_rate)^years
savings
output:
[1] 1343.916
29
how to calculate compound interest by making a function?
savings_calculator <- function(interest_rate, deposit, years) {
  deposit * (1 + interest_rate)^years
}
savings_calculator(0.02, 500, 365)
output:
[1] 688704.1
30
Using the savings_calculator function, how can you solve the following: with 1.5% interest and £750 deposited, how many years until savings exceed £2000?
You can use trial and error: savings_calculator(0.015, 750, 65), savings_calculator(0.015, 750, 66), and so on. Keep trying until the result first goes above £2000. Here 65 years gives about 1974.03 and 66 years gives about 2003.64, so the answer is 66 years.
31
What code can I use to calculate savings for every year from 1 to 100?
1) bunch_of_years <- c(1:100)
bunch_of_years
output: 1, 2, 3, ... 100
2) savings_calculator(0.015, 750, bunch_of_years)
output: ... 1974.03, 2003.64, 2033.70 ...
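Because the function accepts a whole vector of years, the first year in which savings pass £2000 can be looked up directly instead of by trial and error. The sketch below assumes the savings_calculator definition from the earlier card; the names savings_by_year and first_year are made up for illustration.

```r
# Same function as on the earlier card, repeated so the snippet runs on its own.
savings_calculator <- function(interest_rate, deposit, years) {
  deposit * (1 + interest_rate)^years
}

bunch_of_years <- 1:100
savings_by_year <- savings_calculator(0.015, 750, bunch_of_years)

# which() returns the positions where the condition is TRUE; [1] keeps the first.
first_year <- bunch_of_years[which(savings_by_year > 2000)[1]]
first_year   # 66

# Cross-check with logarithms: solve 750 * 1.015^n > 2000 for n.
ceiling(log(2000 / 750) / log(1.015))
```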
32
What is the use of the ? command?
Opens the documentation for a function in the RStudio Help pane, e.g. ?mean
33
View(mtcars)
Opens a spreadsheet-like window for mtcars in RStudio (note the capital V)
34
How to calculate the quartiles Q1 and Q3
quantile(mtcars$mpg, probs = c(0.25, 0.75))
output:
   25%    75%
15.425 22.800
35
how to calculate the IQR (Q3-Q1)
IQR(mtcars$mpg)
output:
[1] 7.375
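A short sketch showing that IQR() is just Q3 minus Q1 from quantile(). The variable names quartiles and manual_iqr are made up for illustration.

```r
quartiles <- quantile(mtcars$mpg, probs = c(0.25, 0.75))

# Subtract Q1 from Q3 by hand; [[ ]] drops the "25%" / "75%" names.
manual_iqr <- quartiles[["75%"]] - quartiles[["25%"]]
manual_iqr

# The built-in function gives the same number.
IQR(mtcars$mpg)
```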
36
var()
calculates the variance of the dataset
37
sd()
calculates standard deviation
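To make the link between the two functions concrete, the sketch below computes the sample variance by hand (dividing by n - 1, which is what var() does) and checks that sd() is its square root. The name manual_var is made up for illustration.

```r
x <- mtcars$mpg

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
manual_var <- sum((x - mean(x))^2) / (length(x) - 1)

all.equal(manual_var, var(x))      # TRUE: matches the built-in variance
all.equal(sqrt(var(x)), sd(x))     # TRUE: sd is the square root of the variance
```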
38
what are the measures of spread?
variance and standard deviation (the range and IQR also measure spread)
39
what are the measures of central tendency?
mean and median
40
what do measures of spread show?
how consistent or variable the data is
41
what do the measures of central tendency show?
where the centre of the data lies, i.e. a typical value
42
What does summary() display?
Min, Q1, median, mean, Q3, max. It shows a quick statistical summary of an object, typically a data frame, vector or model
43
names()
shows the columns of the dataset
44
head()
shows the first few rows of the dataset (6 by default)
45
nrow()
shows number of rows in the dataset
46
ncol()
shows the number of columns in the dataset
47
dim()
shows the number of rows and columns in the dataset
48
A coin can land heads (1) or tails (0). What does sample(c(0,1), 10, replace = TRUE) mean?
Randomly pick 0 or 1, repeated 10 times. Since it is with replacement, you can pick the same number again. Example output (varies from run to run): [1] 1 0 0 0 0 0 0 0 1 0
49
What does sum(sample(c(0,1), 10, replace = TRUE)) do?
Returns the total number of heads in the 10 flips. Example output (varies from run to run): [1] 5
50
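What's the code for sampling without replacement?
sample(x, size, replace = FALSE), which is also the default. The sample size cannot be larger than the number of values in x, so sample(c(0,1), 10, replace = FALSE) gives an error, while sample(c(0,1), 2, replace = FALSE) works.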
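A small sketch of what happens without replacement: asking for 10 values from a pool of only 2 raises an error, while asking for at most 2 works. tryCatch() is used here only so the snippet keeps running after the error; the names attempt and shuffled are made up for illustration.

```r
# Drawing 10 values from 2 without replacement is impossible, so R stops with an error.
attempt <- tryCatch(
  sample(c(0, 1), 10, replace = FALSE),
  error = function(e) conditionMessage(e)   # keep the error text instead of stopping
)
attempt

# Drawing 2 values from 2 without replacement just shuffles them.
shuffled <- sample(c(0, 1), 2, replace = FALSE)
shuffled
```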
51
If you want to repeat the 10 flips many times with simpler code, what would it be?
thousand_samples_of_10 <- replicate(1000, sum(sample(c(0, 1), 10, replace = TRUE)) / 10)
52
What does barplot(table(thousand_samples_of_10)) show?
Generates a bar plot of the fraction of heads in 10 coin flips, repeated 1,000 times. It shows a familiar bell shape that approximates a normal distribution.
53
How can you set the x and y labels and the title of the barplot?
barplot(table(thousand_samples_of_10), main = "Distribution of the fraction of 'Heads' in 1,000 sets of 10 coin flips", xlab = "Fraction of 'Heads'", ylab = "Number of outcomes")
54
How can I compare bar plots to show the law of large numbers as well?
1) Collect the samples with a larger sample size: thousand_samples_of_500 <- replicate(1000, sum(sample(c(0, 1), 500, replace = TRUE)) / 500)
2) Make a bar plot: barplot(table(thousand_samples_of_500)..)
Each bar plot displays the fraction of heads from each set of samples. As the sample size increases, the distribution becomes tighter around 0.5, illustrating the Law of Large Numbers.
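The tightening around 0.5 can also be checked numerically rather than by eye. The sketch below is one way to do it, assuming a helper called fraction_of_heads (a made-up name) and a fixed seed so the run is repeatable.

```r
set.seed(1)  # fixed seed (an arbitrary choice) so the results are repeatable

# Hypothetical helper: fraction of heads in n_flips fair coin flips.
fraction_of_heads <- function(n_flips) {
  sum(sample(c(0, 1), n_flips, replace = TRUE)) / n_flips
}

thousand_samples_of_10  <- replicate(1000, fraction_of_heads(10))
thousand_samples_of_500 <- replicate(1000, fraction_of_heads(500))

# A smaller standard deviation means the fractions sit closer to 0.5.
sd(thousand_samples_of_10)
sd(thousand_samples_of_500)
```

The spread for 500 flips should come out much smaller than for 10 flips, which is the same pattern the two bar plots show.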
55
What's the purpose of a loop?
- You need to do the same action multiple times
- You want to avoid repeating code manually
- You are working with lists, datasets or sequences
56
give an example of a loop
for (i in c(1, 2, 3)) {
  print(i^2)
  print("Long Live the King")
}
Output:
[1] 1
[1] "Long Live the King"
[1] 4
[1] "Long Live the King"
[1] 9
[1] "Long Live the King"
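Loops are also useful when each step builds on the previous one. As a sketch, the loop below adds 3% interest to a £1000 balance once per year for 10 years and compares the result with the compound interest formula; the variable name balance is made up for illustration.

```r
balance <- 1000

# Each pass through the loop adds one year of 3% interest.
for (year in 1:10) {
  balance <- balance * 1.03
}

balance                               # balance after 10 years
all.equal(balance, 1000 * 1.03^10)    # TRUE: same as the formula
```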
57
What are the four functions for the normal distribution?
rnorm() dnorm() pnorm() qnorm()
58
What does rnorm(n, mean = 0, sd = 1) mean?
Generates random numbers from a normal distribution. n = number of values to generate; mean = average of the distribution (default 0); sd = standard deviation (default 1)
59
what is the output of rnorm(5,10,2)?
Returns 5 random values from a normal distribution with mean 10 and SD 2. Example output (varies from run to run): [1] 9.56979 8.268291 10.578227 11.402636 9.266921
60
What does dnorm(x, mean = 0, sd = 1) mean?
Gives the density (height of the curve) of the normal distribution at a specific value x. With the defaults (mean = 0, sd = 1) this is the height of the standard normal curve, e.g. dnorm(0) is its height at x = 0
61
What's the output of dnorm(4, -3, 7)?
The height (density) of the normal curve at x = 4, with mean = -3 and standard dev = 7. output: [1] 0.03456725
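To see where the density comes from, the sketch below evaluates the normal density formula by hand for x = 4, mean = -3, sd = 7 and compares it with dnorm(). The names mu, sigma and manual_density are made up for illustration.

```r
x <- 4
mu <- -3
sigma <- 7

# Normal density: 1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * z^2), where z = (x - mu) / sigma.
manual_density <- 1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * ((x - mu) / sigma)^2)

manual_density
dnorm(x, mean = mu, sd = sigma)   # same value from the built-in function
```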
62
what does pnorm(q, mean = 0, sd = 1) mean?
gives the cumulative probability up to certain value of q
63
What's the output of pnorm(7, 0.7, 9)?
P(X ≤ 7) for a normal distribution with mean = 0.7 and standard dev = 9. output: [1] 0.7580363. There is approximately a 75.8% chance that a randomly selected value from this distribution is less than or equal to 7
64
what does qnorm(p, mean = 0, sd = 1) mean?
Returns the quantile for a given cumulative probability p: the value below which a proportion p of the distribution falls. With the defaults (mean = 0, sd = 1) this is the z-score
65
what is the output for qnorm(0.75,-5,2) ?
Gives the value below which 75% of the data fall in a normal distribution with mean = -5 and standard dev = 2. output: [1] -3.65102. This means: in a normal distribution with mean -5 and sd 2, about 75% of values are less than -3.65
66
What's the purpose of pnorm(1.2,0,1) - pnorm(-0.5,0,1) and what is the output?
Purpose: the probability that a value from a standard normal distribution (mean = 0, sd = 1) falls between -0.5 and 1.2. output: 0.8849 - 0.3085 = 0.5764, printed by R as [1] 0.5763928. About 57.64% of the values in a standard normal distribution lie between -0.5 and 1.2.
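As a sketch of how the normal functions fit together, the snippet below computes the interval probability with pnorm(), checks it against a simulation with rnorm(), and shows that qnorm() undoes pnorm(). The seed and the names exact_prob, draws and simulated_prob are arbitrary choices for illustration.

```r
# Exact probability of landing between -0.5 and 1.2 on the standard normal curve.
exact_prob <- pnorm(1.2) - pnorm(-0.5)
exact_prob

# Simulation check: the share of random draws that fall in the same interval.
set.seed(42)  # arbitrary seed so the run is repeatable
draws <- rnorm(100000)
simulated_prob <- mean(draws > -0.5 & draws <= 1.2)
simulated_prob   # close to exact_prob, but not identical

# qnorm() is the inverse of pnorm(): it maps the probability back to the value.
qnorm(pnorm(1.2))
```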
67
What are the four functions that deal with the binomial distribution?
rbinom() pbinom() dbinom() qbinom(). The binomial distribution models the number of successes in a fixed number of independent yes/no (Bernoulli) trials, like flipping a coin
68
what does rbinom(n, size, prob) mean?
Generates random numbers from a binomial distribution. n: number of values to generate; size: number of trials per sample; prob: probability of success in each trial
69
What does rbinom(3,5,0.2) output?
Generates 3 random values, each from 5 trials with a 20% chance of success per trial. Example output (varies from run to run): [1] 0 1 2. Each number is how many "successes" occurred in a set of 5 trials, so values will be between 0 and 5
70
what does dbinom(x, size, prob) mean?
Gives the probability of exactly x successes in size trials
71
What's the output of dbinom(3,20,0.5)?
Calculates the probability of exactly 3 successes in 20 trials with a 50% chance of success. output: [1] 0.001087189. So there's about a 0.11% chance of getting exactly 3 successes out of 20 with a fair coin
72
what does pbinom(q, size, prob) mean?
Gives the cumulative probability of getting q or fewer successes
73
What's the output of pbinom(7,8,0.9)?
Cumulative probability of getting 7 or fewer successes in 8 trials with a 90% chance of success. This equals 1 - 0.9^8. output: [1] 0.5695328. So there's about a 56.95% chance of getting 7 or fewer successes
74
what does qbinom(p, size, prob) mean?
Gives the number of successes corresponding to a cumulative probability p
75
What's the output of qbinom(0.6,5,0.1)?
It answers: what is the smallest number of successes such that the cumulative probability is at least 60%? output: [1] 1. So 1 success or fewer happens in at least 60% of cases
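The three non-random binomial functions are closely related, which the sketch below tries to show: pbinom() is a running total of dbinom(), and qbinom() finds the first count whose cumulative probability reaches p. The names cumulative and smallest_count are made up for illustration.

```r
# pbinom(7, 8, 0.9) adds up the probabilities of 0, 1, ..., 7 successes.
sum(dbinom(0:7, size = 8, prob = 0.9))
pbinom(7, size = 8, prob = 0.9)
1 - 0.9^8                                # same thing: everything except 8 out of 8

# qbinom(0.6, 5, 0.1): first count whose cumulative probability is at least 0.6.
cumulative <- pbinom(0:5, size = 5, prob = 0.1)
smallest_count <- (0:5)[which(cumulative >= 0.6)[1]]
smallest_count
qbinom(0.6, size = 5, prob = 0.1)
```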
76
How to make a histogram out of a binomial distribution?
binomial_100 <- rbinom(1000, 20, 0.8)
hist(binomial_100)
77
what are the four functions used in Chi-Squared distribution?
dchisq(x ,df) pchisq(q, df) qchisq(p, df) rchisq(n , df)
78
what does dchisq(x ,df) mean?
Gives the density (height of the curve) of the chi-squared distribution at value x. x: value to evaluate df: degrees of freedom
79
what does pchisq(q, df) mean?
Gives the cumulative probability that a chi-squared variable is less than or equal to q. q: quantile df: degrees of freedom
80
what does qchisq(p, df) mean?
Returns the value of χ² for which the cumulative probability is p. p: probability df: degrees of freedom
81
what does rchisq(n, df) mean?
Generates random values from a chi-squared distribution. n: number of values df: degrees of freedom
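A brief sketch of how the chi-squared functions relate to each other. The 0.95 probability, the degrees of freedom and the seed are arbitrary choices for illustration, as are the names critical_value and draws.

```r
# qchisq() and pchisq() are inverses of each other.
critical_value <- qchisq(0.95, df = 1)
critical_value
pchisq(critical_value, df = 1)    # back to 0.95

# With 1 degree of freedom, chi-squared is a squared standard normal value.
qnorm(0.975)^2                    # matches critical_value

# Random draws: the mean of a chi-squared distribution equals its df.
set.seed(123)
draws <- rchisq(10000, df = 4)
mean(draws)                       # close to 4
```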
82
what is a data frame?
2D structure Rows = observations Columns = Variables commonly used to organise data in R
83
what does class(object) mean?
Returns the class/type of an object (e.g., "data.frame", "numeric", "character")
84
what does data.frame() mean?
Creates a data frame, which is like a table or spreadsheet (rows and columns)
85
what does cbind() mean?
Combines vectors (or columns) side by side into a matrix or data frame
86
what does colnames(df) mean?
Returns or sets the column names of a data frame
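The data frame functions above can be tried together on a tiny made-up example. All names and values below (scores, player, goals, assists) are invented for illustration.

```r
# Build a data frame from two vectors; each vector becomes a column.
scores <- data.frame(
  player = c("A", "B", "C"),
  goals  = c(0, 2, 3)
)

class(scores)      # "data.frame"
dim(scores)        # 3 rows, 2 columns

# cbind() adds a new column side by side with the existing ones.
scores <- cbind(scores, assists = c(1, 0, 2))
colnames(scores)   # "player" "goals" "assists"
```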
87
what does getwd() mean?
Returns the current working directory, which is the folder R is reading/writing files from.
88
what does read.csv() mean?
Reads a CSV file into a data frame
89
what does View(df) mean?
Opens a spreadsheet-like viewer for the data frame in RStudio
90
what does na.omit(data_frame) mean?
Removes any rows with missing (NA) values from a data frame
91
what does glm(survived ~ predictors, family = binomial) mean?
Fits a logistic regression model using the Generalized Linear Model (GLM) function. survived ~ predictors: formula (dependent ~ independent variables) family = binomial: specifies logistic regression
92
what does summary(model) mean?
Gives a detailed summary of the model: Coefficients Significance (p-values) Residuals AIC
93
what does predict(model, type = "response") mean?
Gives predicted probabilities from the logistic model. type = "response": ensures output is in probability form (0–1)
94
what does cbind(data_frame, predictions) mean?
Combines the predictions with your original data frame, column by column
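The modelling steps above can be sketched end to end on the built-in mtcars data. This is only an illustration under assumptions: am (transmission type, coded 0/1) stands in for survived and wt (weight) stands in for the predictors; the names model, predictions and results are made up.

```r
# Fit a logistic regression: probability of a manual transmission given car weight.
model <- glm(am ~ wt, data = mtcars, family = binomial)
summary(model)                      # coefficients, p-values, residuals, AIC

# Predicted probabilities (between 0 and 1) for every row used in the fit.
predictions <- predict(model, type = "response")

# Attach the predictions to the original data as an extra column.
results <- cbind(mtcars, predictions)
head(results[, c("am", "wt", "predictions")])
```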
95
what does library(ggplot2) mean?
Loads the ggplot2 package for data visualization.
96
what does ggplot(data, aes(x, y)) + geom_point() mean?
Creates a scatter plot with: x: variable on x-axis y: variable on y-axis
97
what does + geom_smooth() mean?
Adds a trend line (usually a smoothed line using LOESS or linear model) to the plot