W2: Intro to R and RStudio Flashcards

1
Q

< OR %l%

A

Less than

2
Q

<= OR %le%

A

Less than or equal to

3
Q

%gl%

A

greater than AND less than

4
Q

%gel%

A

greater than or equal to AND less than

5
Q

%gle%

A

greater than AND less than or equal to

6
Q

%gele%

A

greater than or equal to AND less than or equal to

7
Q

%!in% OR %nin%

A

not in
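
The range and "not in" operators on cards 1 to 7 are not part of base R; they presumably come from an add-on package loaded in the course (an assumption, since the cards do not name it). A minimal sketch of base R equivalents for each definition, using a made-up vector x and made-up bounds 3 and 7:

```r
x <- c(1, 3, 5, 7, 9)   # illustrative data, not from the cards

gl   <- x > 3  & x < 7     # greater than AND less than
gel  <- x >= 3 & x < 7     # greater than or equal to AND less than
gle  <- x > 3  & x <= 7    # greater than AND less than or equal to
gele <- x >= 3 & x <= 7    # greater than or equal to AND less than or equal to
nin  <- !(x %in% c(3, 9))  # not in

x[gele]  # values inside the closed range: 3 5 7
```

Each line returns a logical vector of the same length as x, so it can be used directly for subsetting.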

8
Q

.N

A

Number of rows (the sample size); when used with by, the number of rows in each group

9
Q

d[UserID != 56 & NA <= 4]

A

Exclude UserID 56 and keep only observations where NA is at or below 4

10
Q

What are the 4 data types using class()?

A

Logical, integer, numeric, character

11
Q

What is logical data?

A

TRUE (1) or FALSE (0)

12
Q

What is integer type data?

A

Whole numbers (positive or negative), e.g. -1, 0, 1, 2

13
Q

What is numeric type data?

A

Real numbers (whole, decimals, fractions)

14
Q

What is character data?

A

Text data, including numbers stored as strings
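
A quick way to check the four types from cards 10 to 14 is to call class() on one value of each kind. The values below are made up for illustration:

```r
types <- c(
  class(TRUE),    # "logical"
  class(2L),      # "integer" (the L suffix makes a whole number an integer)
  class(2.5),     # "numeric"
  class("2.5")    # "character" (a number stored as text)
)

# Logical values behave as 1 and 0 in arithmetic
n_true <- sum(c(TRUE, FALSE, TRUE))  # 2
```

Note that a whole number typed without the L suffix, such as 2, is stored as numeric rather than integer.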

15
Q

What does D[i, j, by] represent?

A

i = rows, j = columns, by = grouping variable
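
The three slots, together with .N from card 8 and the row filter from card 9, can be sketched on a small data.table. The table d and its columns UserID, Group, and Score are hypothetical, and the sketch assumes the data.table package is installed:

```r
library(data.table)

# Hypothetical data for illustration only
d <- data.table(
  UserID = c(1, 1, 2, 2, 56),
  Group  = c("a", "a", "b", "b", "b"),
  Score  = c(3, 5, 2, 4, 9)
)

# i: keep rows where UserID is not 56 and Score is at or below 4
sub <- d[UserID != 56 & Score <= 4]

# j: compute on columns; .N is the number of rows
n_total <- d[, .N]

# by: repeat the j computation within each group
by_group <- d[, .(n = .N, mean_score = mean(Score)), by = Group]
```

Here sub keeps three rows, n_total is 5, and by_group has one row per value of Group.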

16
Q

%Y - %m - %d

A

2019 - 03 - 12 (4-digit year)

17
Q

%d / %m / %y

A

12 / 03 / 19 (2-digit year)

18
Q

%Y - %b - %d

A

2019 - Mar - 12
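
The format codes on cards 16 to 18 can be tried with as.Date() and format(). This is a minimal sketch using the card's example date; the separators in the format string must match the text exactly, so the sketch writes them without spaces:

```r
# Parse text into a Date using a 4-digit year
d <- as.Date("2019-03-12", format = "%Y-%m-%d")

# Print the same date with a 2-digit year
short <- format(d, "%d/%m/%y")   # "12/03/19"

# Abbreviated month name; the exact text depends on the system locale
abbrev <- format(d, "%Y-%b-%d")

# Parse a 2-digit-year string back into a Date
d2 <- as.Date("12/03/19", format = "%d/%m/%y")
```

With %y, R reads two-digit years 00 to 68 as 20xx and 69 to 99 as 19xx, so "19" is parsed as 2019.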

19
Q

What does using factor() need?

A

levels = c(1, 0, 2) and
labels = c("dog", "cat", "rabbit")
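
A short sketch of how levels and labels pair up, using the card's values and a made-up data vector x:

```r
x <- c(1, 0, 2, 1, 0)   # illustrative codes, not from the card

# Each level is matched to the label in the same position:
# 1 -> "dog", 0 -> "cat", 2 -> "rabbit"
pet <- factor(x, levels = c(1, 0, 2), labels = c("dog", "cat", "rabbit"))

pet_levels <- levels(pet)        # "dog" "cat" "rabbit"
pet_values <- as.character(pet)  # "dog" "cat" "rabbit" "dog" "cat"
```

The order given in levels also sets the order of the factor levels, which affects how tables and plots are arranged.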

20
Q

Name the join and argument used for:
Data with only rows present in both x and y

A

Natural Join, all = FALSE

21
Q

Name the join and argument used for:
Data with all rows in x and y

A

Full Outer Join, all = TRUE

22
Q

Name the join and argument used for:
Data with all rows in x

A

Left Outer Join, all.x = TRUE

23
Q

Name the join and argument used for:
Data with all rows in y

A

Right Outer Join, all.y = TRUE

24
Q

Which join / merge will have most rows?

A

Full Outer Join
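
The four joins on cards 20 to 24 can be compared by row count. This is a sketch on two made-up base R data frames; the same all, all.x, and all.y arguments apply when merging data.table objects:

```r
# Hypothetical data: IDs 2 and 3 appear in both tables
x <- data.frame(ID = c(1, 2, 3), Age = c(19, 18, 20))
y <- data.frame(ID = c(2, 3, 4), Sleep = c(8, 6, 7))

inner <- merge(x, y, by = "ID", all = FALSE)   # natural join
full  <- merge(x, y, by = "ID", all = TRUE)    # full outer join
left  <- merge(x, y, by = "ID", all.x = TRUE)  # left outer join
right <- merge(x, y, by = "ID", all.y = TRUE)  # right outer join

rows <- c(inner = nrow(inner), full = nrow(full),
          left = nrow(left), right = nrow(right))
```

In this example the natural join has 2 rows, the left and right joins have 3 each, and the full outer join has 4, with NA filling the unmatched columns.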

25
What do you need to check for before merging if grouping by ID?
If there are duplicates
26
What does anyDuplicated( ) do?
Returns the index of the first duplicated element, or 0 if there are no duplicates
27
What does unique(x) %in% unique(y) do?
Checks which unique IDs from dataset x also appear in dataset y (returns TRUE / FALSE for each; sum the result to count them)
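
Both checks from cards 26 and 27 can be sketched on two made-up ID vectors, x_id and y_id:

```r
x_id <- c(1, 2, 2, 3)   # illustrative IDs with one repeat
y_id <- c(2, 2, 3, 4)

first_dup <- anyDuplicated(x_id)          # 3: the third element repeats an earlier one
no_dup    <- anyDuplicated(unique(x_id))  # 0: no duplicates remain

# One TRUE / FALSE per unique ID in x: is it also present in y?
shared   <- unique(x_id) %in% unique(y_id)  # FALSE TRUE TRUE
n_shared <- sum(shared)                     # 2
```

A non-zero result from anyDuplicated() on an ID column is a signal to inspect the data before merging.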
28
When is it necessary to reshape data to long format?
For RM / longitudinal / panel data
29
What arguments are required when using reshape() to long format?
IDs will have multiple rows.
varying = list(stress = c("stress1", "stress2"))
v.names = "Stress"
timevar = "weeks"
times = c(0, 6, 24) (these values fill the timevar column; supply one per column listed in varying)
idvar = "ID"
direction = "long"
30
What arguments are required when using reshape() to wide format?
v.names = c("stress", "happy")
timevar = "weeks"
idvar = "ID"
direction = "wide"
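
A minimal round trip with base R reshape(), assuming a made-up wide data frame with three stress columns so that varying and times have the same length:

```r
# Hypothetical wide data: one row per ID, one column per wave
wide <- data.frame(
  ID      = c(1, 2),
  stress1 = c(3, 5),
  stress2 = c(4, 6),
  stress3 = c(2, 7)
)

# Wide to long: each ID gets one row per time point
long <- reshape(
  wide,
  varying   = list(stress = c("stress1", "stress2", "stress3")),
  v.names   = "stress",
  timevar   = "weeks",
  times     = c(0, 6, 24),
  idvar     = "ID",
  direction = "long"
)

# Long to wide: one column per time point again
back <- reshape(
  long,
  v.names   = "stress",
  timevar   = "weeks",
  idvar     = "ID",
  direction = "wide"
)
```

The long result has 6 rows (2 IDs by 3 time points) and the wide result returns to 2 rows, with the new columns named from v.names and the time values.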
31
How do you merge on multiple ID variables?
by = c("ID", "Time")
32
If data has more extreme large values (upper tail) than extreme small values (lower tail), what kind of skewness is this?
Positively skewed
33
If data has more extreme small values (lower tail) than extreme large values (upper tail), what kind of skewness is this?
Negatively skewed
34
If there is no skewness (normal distribution), what value will the skewness be?
Skewness near 0
35
Skewness of -.93 is positive/negative?
Negative
36
Skewness of .76 is positive/negative?
Positive
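
The sign rules on cards 32 to 36 can be checked with a simple moment-based skewness. The helper skew() below is a hypothetical illustration; packages may use bias-corrected variants that give slightly different values:

```r
# Moment-based skewness: third central moment over the variance to the power 1.5
skew <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

right_tail <- c(1, 2, 2, 3, 3, 3, 20)  # one extreme large value
left_tail  <- -right_tail              # mirror image: one extreme small value
symmetric  <- c(1, 2, 3, 4, 5)

s_right <- skew(right_tail)  # positive
s_left  <- skew(left_tail)   # negative
s_sym   <- skew(symmetric)   # 0
```

Mirroring the data flips the sign of the skewness without changing its size.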
37
z-score is also known as ____ score
standard score
38
What is the z-score formula?
z = (raw score - mean) / SD
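
A short sketch of the z-score calculation on a made-up vector x. The parentheses matter: the mean is subtracted first, then the result is divided by the SD:

```r
x <- c(2, 4, 6, 8, 10)   # illustrative raw scores

z <- (x - mean(x)) / sd(x)

# scale() performs the same standardisation in base R
z_scale <- as.numeric(scale(x))
```

Standardised scores have a mean of 0 and an SD of 1.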
39
What are 3 measures of variability?
Range, interquartile range (IQR), standard deviation (SD)
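
All three measures are available in base R. A minimal sketch on a made-up vector x:

```r
x <- c(2, 4, 6, 8, 10)   # illustrative data

x_range <- diff(range(x))  # 8: maximum minus minimum
x_iqr   <- IQR(x)          # 4: 75th percentile minus 25th percentile
x_sd    <- sd(x)           # square root of 10, about 3.16
```

Note that range() on its own returns the minimum and maximum, so diff() is used to get a single number.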
40
What is the default origin date and time in R?
1970-01-01 00:00:00
41
What is happening here: as.numeric(d1[1] - d1[2])/365.25
The difference between two dates (in days) is converted to a number, then divided by 365.25 to express it in years
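
Cards 40 and 41 can be sketched together. The vector d1 below holds two made-up dates one year apart:

```r
d1 <- as.Date(c("2019-03-12", "2018-03-12"))  # illustrative dates

gap   <- d1[1] - d1[2]            # time difference of 365 days
days  <- as.numeric(gap)          # 365
years <- as.numeric(gap) / 365.25 # just under 1

# Dates are stored as days since the origin, so the origin itself is 0
origin_days <- as.numeric(as.Date("1970-01-01"))
```

Dividing by 365.25 rather than 365 averages over leap years, which is why one calendar year comes out slightly below 1 here.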
42
surveys2 <- data.table(
  ID = c(1, 2, 2, 3),
  Age = c(19, 18, 18, 20))
acti2 <- data.table(
  ID = c(2, 2, 3, 4),
  Sleep = c(8, 7, 6, 7))
How many rows would a full outer join have?
7
##    ID Age Sleep
## 1:  1  19    NA
## 2:  2  18     8
## 3:  2  18     7
## 4:  2  18     8
## 5:  2  18     7
## 6:  3  20     6
## 7:  4  NA     7