W2: Intro to R and RStudio Flashcards

1
Q

< OR %l%

A

Less than

2
Q

<= OR %le%

A

Less than or equal to

3
Q

%gl%

A

greater than AND less than

4
Q

%gel%

A

greater than or equal AND less than

5
Q

%gle%

A

greater than AND less than or equal

6
Q

%gele%

A

greater than or equal AND less than or equal
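
The range operators on cards 3 to 6 are not base R, and the cards do not name the package they come from. As a rough sketch, the same four checks can be written with plain base R comparisons; the vector and bounds below are made up for illustration.

```r
# Base R equivalents of the four range checks (illustrative data only).
x <- c(1, 3, 5, 7, 9)
lower <- 3
upper <- 7

gl   <- x >  lower & x <  upper   # like %gl%:   greater than AND less than
gel  <- x >= lower & x <  upper   # like %gel%:  greater or equal AND less than
gle  <- x >  lower & x <= upper   # like %gle%:  greater than AND less or equal
gele <- x >= lower & x <= upper   # like %gele%: greater or equal AND less or equal

x[gl]    # 5
x[gele]  # 3 5 7
```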

7
Q

%!in% OR %nin%

A

not in
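
A "not in" check can also be sketched in base R by negating %in%; the IDs here are invented for the example.

```r
# Negating %in% gives "not in" without any add-on package.
ids <- c(1, 2, 3, 4, 5)
exclude <- c(2, 4)

keep <- !(ids %in% exclude)  # TRUE where the ID is NOT in `exclude`
ids[keep]                    # 1 3 5
```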

8
Q

.N

A

Number of rows (the sample size); with by, the number of rows in each group
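
A small sketch of .N in data.table, assuming a toy table with made-up ID and Score columns.

```r
library(data.table)

# Toy data: two rows for ID 1, three rows for ID 2.
d <- data.table(ID = c(1, 1, 2, 2, 2), Score = c(3, 4, 5, 6, 7))

total <- d[, .N]             # total number of rows: 5
per_id <- d[, .N, by = ID]   # rows per ID: N = 2 and 3
```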

9
Q

d[UserID != 56 & NA <= 4]

A

Exclude UserID 56 and select observations where NA is at or below 4
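
A runnable sketch of this kind of row filter. Because the bare word NA is reserved in R for missing values, the example uses a hypothetical column name, NegAff, in its place; the data are invented.

```r
library(data.table)

# NegAff is a hypothetical stand-in for the card's "NA" variable.
d <- data.table(UserID = c(55, 56, 57, 58), NegAff = c(2, 1, 5, 4))

# Keep rows where UserID is not 56 AND NegAff is at or below 4.
res <- d[UserID != 56 & NegAff <= 4]
res$UserID  # 55 58
```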

10
Q

What are the 4 data types using class()?

A

Logical, integer, numeric, character
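
As a quick check, class() can be called on one literal of each type:

```r
# One literal per data type.
class(TRUE)    # "logical"
class(2L)      # "integer" (the L suffix marks an integer)
class(2.5)     # "numeric"
class("2.5")   # "character" (a number stored as text)

# Logical values behave as 1 and 0 in arithmetic.
n_true <- sum(c(TRUE, FALSE, TRUE))  # 2
```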

11
Q

What is logical data?

A

TRUE (1) or FALSE (0)

12
Q

What is integer type data?

A

Whole numbers (positive or negative), e.g. -1, 0, 1, 2

13
Q

What is numeric type data?

A

Real numbers (whole, decimals, fractions)

14
Q

What is character data?

A

Text data, including numbers stored as strings

15
Q

What does D[i, j, by] represent?

A

i = rows, j = columns, by = grouping variable
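
A minimal sketch of all three slots used together, on an invented table: i filters rows, j computes a summary, and by groups the result.

```r
library(data.table)

D <- data.table(ID = c(1, 1, 2, 2), Stress = c(4, 6, 2, 3))

# i: keep rows with Stress > 2
# j: compute the mean of Stress
# by: do so separately for each ID
res <- D[Stress > 2, .(MeanStress = mean(Stress)), by = ID]
res$MeanStress  # 5 3
```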

16
Q

%Y-%m-%d

A

2019-03-12 (4-digit year)

17
Q

%d/%m/%y

A

12/03/19 (2-digit year)

18
Q

%Y-%b-%d

A

2019-Mar-12
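
The three format strings can be tried with as.Date() and format(). This is a sketch; the month abbreviation from %b depends on the locale, so "Mar" assumes an English locale.

```r
# Parse text into a Date using a format string.
d <- as.Date("2019-03-12", format = "%Y-%m-%d")
d2 <- as.Date("12/03/19", format = "%d/%m/%y")  # 2-digit year 19 is read as 2019

# Print the same Date in other formats.
short <- format(d, "%d/%m/%y")  # "12/03/19"
format(d, "%Y-%b-%d")           # "2019-Mar-12" in an English locale
```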

19
Q

What does using factor() need?

A

levels = c(1, 0, 2) and
labels = c("dog", "cat", "rabbit")
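
A short sketch of how levels and labels pair up by position, using an invented vector of codes:

```r
# Codes 1, 0, 2 are mapped to "dog", "cat", "rabbit" in that order.
x <- c(1, 0, 2, 1)
f <- factor(x, levels = c(1, 0, 2), labels = c("dog", "cat", "rabbit"))

as.character(f)  # "dog" "cat" "rabbit" "dog"
levels(f)        # "dog" "cat" "rabbit"
```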

20
Q

Name the join and argument used for:
Data with only rows present in both x and y

A

Natural (inner) join, all = FALSE

21
Q

Name the join and argument used for:
Data with all rows in x and y

A

Full Outer Join, all = TRUE

22
Q

Name the join and argument used for:
Data with all rows in x

A

Left Outer Join, all.x = TRUE

23
Q

Name the join and argument used for:
Data with all rows in y

A

Right Outer Join, all.y = TRUE

24
Q

Which join / merge will have most rows?

A

Full Outer Join
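
The four joins on cards 20 to 23 can be compared side by side. This sketch uses base data frames with made-up values; merge() takes the same all, all.x and all.y arguments for data.table objects.

```r
x <- data.frame(ID = c(1, 2, 3), Age = c(19, 18, 20))
y <- data.frame(ID = c(2, 3, 4), Sleep = c(8, 6, 7))

inner <- merge(x, y, by = "ID", all = FALSE)   # IDs 2, 3       (2 rows)
full  <- merge(x, y, by = "ID", all = TRUE)    # IDs 1, 2, 3, 4 (4 rows)
left  <- merge(x, y, by = "ID", all.x = TRUE)  # IDs 1, 2, 3    (3 rows)
right <- merge(x, y, by = "ID", all.y = TRUE)  # IDs 2, 3, 4    (3 rows)
```

Rows without a match in the other table get NA in the columns that table would have supplied.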

25
Q

What do you need to check for before merging by ID?

A

Whether there are duplicate IDs

26
Q

What does anyDuplicated( ) do?

A

Returns the index of the first duplicated element, or 0 if there are no duplicates
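
For example, with an invented vector of IDs:

```r
ids <- c(1, 2, 2, 3)

first_dup <- anyDuplicated(ids)       # 3: the second "2" sits at position 3
no_dup <- anyDuplicated(c(1, 2, 3))   # 0: no duplicates
```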

27
Q

What does unique(x) %in% unique(y) do?

A

Checks, for each unique ID in dataset x, whether it is also in dataset y (TRUE / FALSE); use table() or sum() on the result to count them
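
A sketch with invented ID vectors, showing the logical result and one way to count the matches:

```r
x_ids <- c(1, 2, 2, 3)
y_ids <- c(2, 2, 3, 4)

# One TRUE/FALSE per unique ID in x: is it also in y?
in_y <- unique(x_ids) %in% unique(y_ids)  # FALSE TRUE TRUE

n_matched <- sum(in_y)  # 2 of the 3 unique IDs in x appear in y
```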

28
Q

When is it necessary to reshape data to long format?

A

For repeated measures (RM), longitudinal, or panel data

29
Q

What arguments are required when using reshape() to long format?

A

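In long format, each ID has multiple rows (one per time point).
varying = list(stress = c("stress1", "stress2", "stress3")) (one wide column per time point)
v.names = "Stress"
timevar = "weeks"
times = c(0, 6, 24) (these become the values of the timevar variable)
idvar = "ID"
direction = "long"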
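
A runnable sketch of a wide-to-long reshape with base reshape(). The data are invented, and the column names stress1 to stress3 are illustrative; there must be one wide column for each value in times.

```r
# Wide data: one row per ID, one stress column per time point.
wide <- data.frame(
  ID = c(1, 2),
  stress1 = c(4, 2),
  stress2 = c(5, 3),
  stress3 = c(6, 1)
)

long <- reshape(
  wide,
  varying = list(stress = c("stress1", "stress2", "stress3")),
  v.names = "Stress",   # name of the stacked variable in long format
  timevar = "weeks",    # name of the new time variable
  times = c(0, 6, 24),  # values the time variable takes
  idvar = "ID",
  direction = "long"
)

nrow(long)  # 6: two IDs x three time points
```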

30
Q

What arguments are required when using reshape() to wide format?

A

v.names = c("stress", "happy")
timevar = "weeks"
idvar = "ID"
direction = "wide"
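
A sketch of the reverse, long-to-wide reshape on invented data. By default reshape() names the new columns by joining the variable name and the time value with a dot.

```r
# Long data: one row per ID per time point.
long <- data.frame(
  ID = c(1, 1, 2, 2),
  weeks = c(0, 6, 0, 6),
  stress = c(4, 5, 2, 3),
  happy = c(7, 6, 8, 9)
)

wide <- reshape(
  long,
  v.names = c("stress", "happy"),
  timevar = "weeks",
  idvar = "ID",
  direction = "wide"
)

names(wide)  # ID, stress.0, happy.0, stress.6, happy.6
```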

31
Q

How do you merge on multiple ID variables?

A

by = c("ID", "Time")
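
A sketch with two invented long-format tables, showing why both keys matter: merging on ID alone pairs every row of an ID with every other row of that ID.

```r
a <- data.frame(ID = c(1, 1, 2, 2), Time = c(0, 1, 0, 1), Stress = c(4, 5, 2, 3))
b <- data.frame(ID = c(1, 1, 2, 2), Time = c(0, 1, 0, 1), Sleep = c(8, 7, 6, 7))

both_keys <- merge(a, b, by = c("ID", "Time"))  # 4 rows, one per ID-Time pair
id_only <- merge(a, b, by = "ID")               # 8 rows: 2 x 2 per ID
```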

32
Q

If data has more extreme large values (upper tail) than extreme small values (lower tail), what kind of skewness is this?

A

Positively skewed

33
Q

If data has more extreme small values (lower tail) than extreme large values (upper tail), what kind of skewness is this?

A

Negatively skewed

34
Q

If there is no skewness (normal distribution), what value will the skewness be?

A

Skewness near 0

35
Q

Is a skewness of -.93 positive or negative?

A

Negative

36
Q

Is a skewness of .76 positive or negative?

A

Positive
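
The sign of skewness can be checked on toy data. The cards do not say which skewness function the course uses, so this sketch defines a hypothetical helper, skew(), based on the third central moment.

```r
# Hypothetical helper (not from any package): moment-based skewness.
skew <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^(3 / 2)
}

upper_tail <- c(1, 2, 2, 3, 3, 3, 20)  # one extreme large value
lower_tail <- -upper_tail              # mirror image: one extreme small value
symmetric <- c(1, 2, 3, 4, 5)

skew(upper_tail)  # positive
skew(lower_tail)  # negative
skew(symmetric)   # 0
```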

37
Q

z-score is also known as ____ score

A

standard score

38
Q

What is the z-score formula?

A

z = (raw score - mean) / SD
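
A sketch of the calculation on invented scores. The parentheses matter: the mean is subtracted first, then the result is divided by the SD.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

z <- (x - mean(x)) / sd(x)

# z-scores have mean 0 and SD 1, and match base R's scale().
mean(z)
sd(z)
same_as_scale <- isTRUE(all.equal(z, as.numeric(scale(x))))
```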

39
Q

What are 3 measures of variability?

A

Range, interquartile range (IQR), standard deviation (SD)
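
All three measures are available in base R; the scores below are invented.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

rng <- diff(range(x))  # max minus min: 7
iqr <- IQR(x)          # Q3 minus Q1: 1.5 with R's default quantile method
s <- sd(x)             # sample standard deviation
```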

40
Q

What is the default origin date and time in R?

A

1970-01-01 00:00:00 (UTC)
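
The origin can be seen by converting dates and date-times to numbers:

```r
# Dates are stored as days since the origin.
day_zero <- as.numeric(as.Date("1970-01-01"))  # 0

# Date-times are stored as seconds since the origin (in UTC).
sec_zero <- as.numeric(as.POSIXct("1970-01-01 00:00:00", tz = "UTC"))  # 0

# One day after the origin.
next_day <- as.Date(1, origin = "1970-01-01")  # "1970-01-02"
```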

41
Q

What is happening here:
as.numeric(d1[1] - d1[2])/365.25

A

The difference between two dates (in days) is converted to a number and divided by 365.25 to express it in years
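
A worked sketch with two invented dates, 19 years apart:

```r
d1 <- as.Date(c("2019-03-12", "2000-03-12"))

days <- as.numeric(d1[1] - d1[2])  # 6939 days (includes 4 leap days)
years <- days / 365.25             # just under 19 years
```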

42
Q

library(data.table)
surveys2 <- data.table(
  ID = c(1, 2, 2, 3),
  Age = c(19, 18, 18, 20))
acti2 <- data.table(
  ID = c(2, 2, 3, 4),
  Sleep = c(8, 7, 6, 7))
How many rows would a full outer join have?

A

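7 rows. ID 2 appears twice in each table, so it contributes 2 x 2 = 4 rows; IDs 1, 3, and 4 contribute one row each.
##    ID Age Sleep
## 1:  1  19    NA
## 2:  2  18     8
## 3:  2  18     7
## 4:  2  18     8
## 5:  2  18     7
## 6:  3  20     6
## 7:  4  NA     7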
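
The count can be reproduced by running the merge. This sketch assumes the join is done with merge() on ID with all = TRUE, as on card 21.

```r
library(data.table)

surveys2 <- data.table(ID = c(1, 2, 2, 3), Age = c(19, 18, 18, 20))
acti2 <- data.table(ID = c(2, 2, 3, 4), Sleep = c(8, 7, 6, 7))

# Full outer join: keep every row from both tables.
full <- merge(surveys2, acti2, by = "ID", all = TRUE)

nrow(full)         # 7
anyDuplicated(surveys2$ID)  # 3: the duplicate ID that inflates the row count
```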