test 1 Flashcards

1
Q

== means

A

“are the 2 things equal to each other?” or “is equal to”

2
Q

!= means

A

is not equal to

3
Q

>= means

A

is greater than or equal to

4
Q

<= means

A

is less than or equal to

5
Q

> means

A

is greater than

6
Q

< means

A

is less than

7
Q

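logical operators in R: what number stands for true and what number stands for false?

A

TRUE = 1
FALSE = 0

(a comparison returns TRUE or FALSE; R treats these as 1 and 0 when you do arithmetic on them, which is why sum() can count them)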
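
A small sketch of that coercion, assuming a made-up vector `x` (not from the course data):

```r
# Hypothetical example vector, for illustration only
x <- c(35, 42, 58, 61)

is_big <- x >= 40     # logical vector: FALSE TRUE TRUE TRUE
as.numeric(is_big)    # 0 1 1 1
sum(is_big)           # 3, because each TRUE counts as 1
mean(is_big)          # 0.75, the fraction of TRUE values
```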

8
Q

what are the ways I can ask this question and what type of responses would I get?

are any of the x greater than or equal to 40 AND less than 60?

A

x >= 40 & x < 60 - returns a logical vector with TRUE or FALSE for each element of x

sum(x >= 40 & x < 60) - wrapping it in sum() counts how many elements are TRUE

which(x >= 40 & x < 60) - returns the positions of the elements that are TRUE; not used as often but still useful
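
To see the different kinds of answers side by side, here is a sketch on a made-up vector (the values of `x` are an assumption). any() is included because it answers the yes/no form of the question directly.

```r
x <- c(12, 40, 45, 59, 60, 73)   # hypothetical data

in_range <- x >= 40 & x < 60     # FALSE TRUE TRUE TRUE FALSE FALSE
any(in_range)                    # TRUE: at least one element qualifies
sum(in_range)                    # 3: how many qualify
which(in_range)                  # 2 3 4: their positions
x[in_range]                      # 40 45 59: the values themselves
```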

9
Q

what does $ do?

A
  • extracts a column by name from a data frame you have defined (df.name$column)
  • can also add a new column, by assigning to a column name that does not exist yet
10
Q

how to add another column to a data frame?

A

df.name$column <- c(…)
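
A minimal sketch of both uses of $, assuming a toy data frame `df` whose column names are invented for illustration:

```r
# Hypothetical data frame, for illustration only
df <- data.frame(plot = c("A", "B", "C"), seedlings = c(0, 3, 5))

df$seedlings                       # extract a column as a vector: 0 3 5
df$height_cm <- c(1.2, 4.5, 7.1)   # add a new column, one value per row
names(df)                          # "plot" "seedlings" "height_cm"
```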

11
Q

if you have already defined “seedlings,” how can you extract the number of times the count “0” was observed?

A

sum(seedlings == 0)

sum() gives the count; without it, seedlings == 0 just returns TRUE or FALSE for each element

12
Q

if you defined seedlings, and now you want to see how many times each count occurred, what would you do?

A

df.seedlings <- data.frame(
  seedlings = c(0, 1, 2, 3, 4, 5),
  freq = c(sum(seedlings == 0), sum(seedlings == 1), sum(seedlings == 2),
           sum(seedlings == 3), sum(seedlings == 4), sum(seedlings == 5))
)

create a data frame with 2 columns: one for the seedling counts and one for the frequency. Each frequency uses sum() to count how many observations were equal to 0, 1, 2, 3, 4, and 5
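
The same tally can be sketched without typing out every sum(). The `seedlings` values below are made up, and sapply() and table() are alternatives not used in the original answer.

```r
seedlings <- c(0, 2, 1, 0, 3, 5, 2, 0, 1, 4)   # hypothetical counts

counts <- 0:5
# Same idea as writing sum(seedlings == 0), sum(seedlings == 1), ... by hand
freq <- sapply(counts, function(k) sum(seedlings == k))
df.seedlings <- data.frame(seedlings = counts, freq = freq)
df.seedlings$freq        # 3 2 2 1 1 1

# table() gives the same tally, but leaves out counts that never occur
table(seedlings)
```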

13
Q

you’ve defined x, now you want to get the 3rd and 4th elements of x, how?

all but the 3rd and 4th elements?

A

x[c(3,4)]

c(3,4) builds the vector of positions and the square brackets do the indexing, so you need both

x[-c(3,4)]

put the minus sign outside the c() so it applies to both positions
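
A quick check of positive and negative indexing on a made-up vector (the values are an assumption):

```r
x <- c(10, 20, 30, 40, 50, 60)   # hypothetical vector

x[c(3, 4)]      # keeps the 3rd and 4th elements: 30 40
x[-c(3, 4)]     # drops the 3rd and 4th elements: 10 20 50 60
x[c(-3, -4)]    # same as the line above
```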

14
Q

defined x, how to get only the elements at even positions (2nd, 4th, and so on) out?

to get the elements in reverse?

A

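x[seq(2, length(x), by=2)]

this picks the elements at even positions; to keep the elements whose values are even, use x[x %% 2 == 0]

x[10:1]

works when x has 10 elements; rev(x) or x[length(x):1] reverses a vector of any length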
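
A sketch contrasting even positions with even values, and two ways to reverse, using a made-up 10-element vector:

```r
x <- c(7, 2, 9, 4, 6, 1, 8, 3, 5, 10)   # hypothetical 10-element vector

x[seq(2, length(x), by = 2)]   # elements at positions 2, 4, 6, 8, 10: 2 4 1 3 10
x[x %% 2 == 0]                 # elements whose value is even: 2 4 6 8 10
rev(x)                         # reversed, works for any length
x[length(x):1]                 # same result by indexing
```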

15
Q

if x is defined, how to get the sum of the first and last element if there are 10 elements? the product of second and ninth?

A

x[1] + x[10]

x[2] * x[9]

16
Q

how to get square of each element in x?

A

x ^ 2
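
Arithmetic on a vector works element by element, which a tiny made-up example shows (x is assumed to be 1 through 10 here):

```r
x <- 1:10                  # hypothetical vector

x[1] + x[length(x)]        # first plus last: 11
x[2] * x[length(x) - 1]    # second times second-to-last: 18
x^2                        # every element squared: 1 4 9 16 25 36 49 64 81 100
```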

17
Q

what does the sample command do and how to use it?

A

draws a random sample from a vector, used like this:

sample(1:100, 25, replace=FALSE)

1:100 is the set of values to draw from

25 is how many values you want

replace=FALSE (the default) means no value can be drawn twice; replace=TRUE samples with replacement

18
Q

what does set.seed() do?

A

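sets the seed of the random number generator so random results are reproducible; calling set.seed() with the same number before each sample() call makes sample1 and sample2 identical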
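
A sketch of sample() and set.seed() together. The seed value 42 is arbitrary, and the names sample1, sample2 and sample3 are only for illustration.

```r
set.seed(42)                                    # 42 is an arbitrary seed
sample1 <- sample(1:100, 25, replace = FALSE)

set.seed(42)                                    # reset to the same seed
sample2 <- sample(1:100, 25, replace = FALSE)

identical(sample1, sample2)                     # TRUE: same seed, same draw
anyDuplicated(sample1)                          # 0: no repeats without replacement

# With replace = TRUE the same value can be drawn more than once,
# so you can draw more values than the vector contains
sample3 <- sample(1:6, 10, replace = TRUE)
```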

19
Q

with the cdc file we used, how would you find how many of the subjects are male?

A

sum(gender == "m")

20
Q

list the 2 ways to find the median of the ages of nonsmokers vs smokers

A

many ways to find the median using the median command, here are 2 (shown for smokers; use smoke100 == 0 for nonsmokers):

  • median(cdc[smoke100==1,]$age)
  • smokers <- subset(cdc, smoke100 == 1)
    smokers_median <- median(smokers$age)
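
Both approaches can be tried on a toy stand-in for the cdc data. `cdc_toy` and its values are made up, and tapply() is an extra option that is not in the original answer.

```r
# Toy stand-in for the cdc data frame (values are invented)
cdc_toy <- data.frame(
  age      = c(25, 40, 33, 61, 52, 19),
  smoke100 = c(1, 0, 1, 0, 1, 0)
)

median(cdc_toy[cdc_toy$smoke100 == 1, ]$age)    # smokers: 33
median(subset(cdc_toy, smoke100 == 0)$age)      # nonsmokers: 40
tapply(cdc_toy$age, cdc_toy$smoke100, median)   # both groups in one call
```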
21
Q

what does the ~ (tilde) mean in box plots?

A

builds a formula, which specifies the relationship between variables in many functions (read y ~ x as "y grouped by x")

like:
boxplot(age ~ smoke100, data = cdc)

tells R to show the distribution of ages for smokers and nonsmokers separately, allowing you to compare the age distributions between the two groups visually
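
A runnable sketch of the formula interface on toy data. `cdc_toy` and the group labels are assumptions, not the real cdc file.

```r
# Toy stand-in for the cdc data frame (values are invented)
cdc_toy <- data.frame(
  age      = c(25, 40, 33, 61, 52, 19),
  smoke100 = c(1, 0, 1, 0, 1, 0)
)

# Left of ~ is the variable being summarized, right of ~ is the grouping variable
bp <- boxplot(age ~ smoke100, data = cdc_toy,
              names = c("nonsmoker", "smoker"), ylab = "age")
bp$n   # number of observations in each box: 3 3
```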

22
Q

what is the comma used for in [smoke100==1,]?

A

used to separate row indices from column indices. In this case, we’re leaving the column part empty, meaning we’re selecting all columns
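
A sketch of what goes on each side of the comma, using a made-up data frame `df`:

```r
# Hypothetical data frame, for illustration only
df <- data.frame(age = c(25, 40, 33),
                 smoke100 = c(1, 0, 1),
                 gender = c("m", "f", "f"))

df[df$smoke100 == 1, ]         # rows where smoke100 is 1, all columns
df[, "age"]                    # all rows, only the age column: 25 40 33
df[df$smoke100 == 1, "age"]    # row filter and column together: 25 33
```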

23
Q

sapply()

A

"simplified apply": applies the same function to each element of a vector or list and simplifies the result to a vector or matrix when possible
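
Two small sapply() calls as a sketch; the inputs are invented for illustration:

```r
# Apply an anonymous function to each element of a vector
squares <- sapply(1:5, function(n) n^2)            # 1 4 9 16 25

# Apply an existing function to each element of a list
means <- sapply(list(a = 1:3, b = 4:10), mean)     # a = 2, b = 7
```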

24
Q

how to solve this using sapply function:

Make a cumulative distribution of the ages of the males or females in the dataset by finding the number of subjects not older than each age in the sequence seq(from = 0,to = 100,by = 5). Use barplot() to make and attach a plot of your results.

A

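df.males <- data.frame(
  Age = seq(0, 100, by = 5),
  Cumulative_Frequency = sapply(seq(0, 100, by = 5), function(X) sum(age[gender == "m"] <= X))
) # males age cumulative distribution

barplot(df.males$Cumulative_Frequency, names.arg = df.males$Age) # plot the cumulative counts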
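
The same pattern on a tiny made-up sample, as a sketch. The `age` and `gender` vectors below are assumptions standing in for the attached cdc columns.

```r
# Hypothetical stand-ins for the cdc columns
age    <- c(23, 37, 45, 51, 68, 29, 74)
gender <- c("m", "f", "m", "m", "f", "m", "m")

breaks <- seq(from = 0, to = 100, by = 5)
# For each cutoff X, count the males whose age is not older than X
cum_freq <- sapply(breaks, function(X) sum(age[gender == "m"] <= X))

barplot(cum_freq, names.arg = breaks,
        xlab = "Age", ylab = "Males not older than this age")
```

The counts never decrease, and the last bar equals the total number of males.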

25
Q

what does adding a type =1 inside of quantile function do?

A

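uses the inverse of the empirical distribution function instead of the default interpolation (type 7), so the result is always one of the observed values (kinda like rounding to an actual data point)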
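
A sketch of the difference on a made-up vector `v`:

```r
v <- c(1, 2, 3, 4, 10)   # hypothetical data

q_default <- quantile(v, probs = 0.9)             # type 7 interpolates between 4 and 10: 7.6
q_type1   <- quantile(v, probs = 0.9, type = 1)   # type 1 returns an observed value: 10
```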

26
Q

how to solve this problem : What fraction of men have heights greater than the 90th percentile of height among women? What fraction of women have heights less than the 10th percentile of height among men?

A

quantile(height[gender == "f"], probs = 0.9, type = 1) # 90th percentile of height among females

sum(height[gender == "m"] > quantile(height[gender == "f"], probs = 0.9, type = 1)) / sum(gender == "m") # fraction of males taller than the 90th percentile of female heights

the other one mirrors this: swap "m" and "f", use probs = 0.1, and use < instead of >
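
Both fractions worked through on made-up data, as a sketch. The `height` and `gender` vectors are assumptions, and mean() of a logical vector is used as a shortcut for sum(...) divided by the group size.

```r
# Hypothetical stand-ins for the cdc columns
height <- c(70, 72, 68, 75, 66, 64, 62, 65, 67, 60)
gender <- c("m", "m", "m", "m", "m", "f", "f", "f", "f", "f")

q90_f <- quantile(height[gender == "f"], probs = 0.9, type = 1)   # 67
q10_m <- quantile(height[gender == "m"], probs = 0.1, type = 1)   # 66

frac_men_tall    <- mean(height[gender == "m"] > q90_f)   # 0.8
frac_women_short <- mean(height[gender == "f"] < q10_m)   # 0.8
```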
