r_sorting Flashcards

Question 1

Q

How do you do the following:

create a vector composed of (31, 4, 15, 92, 65)
put the vector in ascending order

Answer

A

create vector: x <- c(31, 4, 15, 92, 65)
sort: sort(x)
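
A minimal sketch of this answer in base R. The `decreasing = TRUE` variant is an addition that is not on the card. Note that sort() returns a new vector and leaves x unchanged unless you reassign it.

```r
x <- c(31, 4, 15, 92, 65)

sorted   <- sort(x)                     # ascending:  4 15 31 65 92
reversed <- sort(x, decreasing = TRUE)  # descending: 92 65 31 15 4

x  # unchanged: 31 4 15 92 65
```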

Question 2

Q

How do you get the order of indexes (from smallest value to largest value) of a vector x?

Answer

A

order(x)
Example:
- > x
  [1] 31 4 15 92 65
  > order(x)
  [1] 2 3 1 5 4
- The “order” function returns the indexes that would put the vector in order from smallest to largest value
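
One way to check what order() returns is to use its result as an index: indexing x by order(x) should give the same vector as sort(x). A small sketch, where `idx` is a name chosen only for this illustration:

```r
x <- c(31, 4, 15, 92, 65)

idx <- order(x)  # 2 3 1 5 4
x[idx]           # 4 15 31 65 92, the same values as sort(x)
```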

Question 3

Q

How do you do the following:

Store the order of the variable “total” from the “murders” data frame within the variable “index”
Display the “abb” variable from the “murders” data frame by “index”

Answer

A

index <- order(murders$total)
murders$abb[index]
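
The same pattern can be tried without dslabs installed. The sketch below uses a hypothetical stand-in data frame called `toy` with made-up values; only the column names `abb` and `total` mirror the murders data frame.

```r
# Hypothetical stand-in for the murders data frame; the values are made up
toy <- data.frame(
  abb   = c("AA", "BB", "CC", "DD"),
  total = c(30, 5, 120, 12),
  stringsAsFactors = FALSE
)

index    <- order(toy$total)  # 2 4 1 3
by_total <- toy$abb[index]    # "BB" "DD" "AA" "CC"
```

Because index comes from one column and is applied to another, the abbreviations come out ordered by total rather than alphabetically.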

Question 4

Q

How do you do the following:

Display the largest value within the “total” variable/column from the “murders” data set
Display the index (and store in the variable “i_max”) of the largest value within the “total” variable/column from the “murders” data set
Display the “state” name of the value at the index stored in “i_max”

Answer

A

max(murders$total)
i_max <- which.max(murders$total)
murders$state[i_max]
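
A small sketch of the same three steps on a hypothetical data frame `toy` (made-up values, not the real murders data). It also shows that the value at i_max is the maximum. If several entries tie for the maximum, which.max() returns the first one.

```r
# Hypothetical stand-in for the murders data frame; the values are made up
toy <- data.frame(
  state = c("Aland", "Bland", "Cland"),
  total = c(30, 120, 12),
  stringsAsFactors = FALSE
)

largest     <- max(toy$total)        # 120
i_max       <- which.max(toy$total)  # 2
state_at_max <- toy$state[i_max]     # "Bland"
```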

Question 5

Q

Given the vector (31, 4, 15, 92, 65), answer the following:

What is the “sort” of the vector?
What is the “order” of the vector?
What is the “rank” of the vector?

Answer

A

original: 31, 4, 15, 92, 65
sort: 4, 15, 31, 65, 92
order: 2, 3, 1, 5, 4
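- Each entry is the index, in the original (unsorted) vector, of the next-smallest value
rank: 3, 1, 2, 5, 4
- Each entry is the position that the corresponding original value takes in the sorted vector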
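
The three results can be reproduced and compared in base R. The sketch below also shows one relationship that holds when the vector has no ties: applying order() twice gives the ranks. That observation is an addition, not part of the card.

```r
x <- c(31, 4, 15, 92, 65)

s <- sort(x)   # 4 15 31 65 92
o <- order(x)  # 2 3 1 5 4
r <- rank(x)   # 3 1 2 5 4

x[o]           # same as sort(x)
order(o)       # 3 1 2 5 4, same as rank(x) because x has no ties
```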

Question 6

Q

Using the “murders” dataset, do the following:

Use the $ operator to access the population size data and store it in the object pop.
Then use the sort function to redefine pop so that it is sorted.
Finally use the [ operator to report the smallest population size.

Answer

A

pop <- murders$population
pop <- sort(pop)
pop[1]

Question 7

Q

Using the “murders” data set, do the following:

Now instead of the smallest population size, let’s find out the row number, in the data frame murders, of the state with the smallest population size.
This time we need to use order() instead of sort().
Remember that the entries in the vector murders$population follow the order of the rows of murders.

Answer

A

# Access population from the dataset and store it in pop
- pop <- murders$population
# Use the command order, to order pop and store in object o
- o <- order(pop)
# Find the index number of the entry with the smallest population size
- o[1]

Question 8

Q

Using the “murders” data set, do the following:

Write one line of code that gives the index of the lowest population entry. Use the which.min command.

Answer

A

which.min(murders$population)
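
which.min() gives the same index as the first element of order(). A minimal sketch on a hypothetical vector `pop_toy` (made-up values standing in for murders$population):

```r
# Hypothetical population values; not the real murders data
pop_toy <- c(500, 120, 900, 340)

i_order <- order(pop_toy)[1]  # 2, first index in ascending order
i_min   <- which.min(pop_toy) # 2, index of the smallest value

pop_toy[i_min]                # 120, the same value as min(pop_toy)
```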

Question 9

Q

Using the “murders” data set, do the following:

Find the index of the smallest state using which.min(murders$population).
Define a variable states to hold the state names from the murders data frame.
Combine these to find the state name for the smallest state

Answer

A

# Define the variable i to be the index of the smallest state
- i <- which.min(murders$population)
# Define variable states to hold the states
- states <- murders$state
# Use the index you just defined to find the state with the smallest population
- states[i]

Question 10

Q

Using the “murders” data set, do the following:

Define a variable states to be the state names from murders
Use rank(murders$population) to determine the population size rank (from smallest to biggest) of each state.
- Save these ranks in an object called ranks.
Create a data frame with state names and their respective ranks. Call the data frame my_df.

Answer

A

# Define a variable states to be the state names
- states <- murders$state
# Define a variable ranks to determine the population size ranks
- ranks <- rank(murders$population)
# Create a data frame my_df with the state name and its rank
- my_df <- data.frame(name = states, ranks = ranks)
- my_df

Question 11

Q

Using the “murders” data set, do the following:

Create variables states and ranks to store the state names and ranks by population size respectively.
Create an object ind that stores the indexes needed to order the population values, using the order command. For example, we could define ind <- order(murders$population).
Create a data frame with both variables following the correct order. Use the bracket operator [ to re-order each column in the data frame. For example, states[ind] orders the state names by population size.
The columns of the data frame must be in the specific order: state, rank.

Answer

A

# Define a variable states to be the state names from the murders data frame
- states <- murders$state
# Define a variable ranks to determine the population size ranks
- ranks <- rank(murders$population)
# Define a variable ind to store the indexes needed to order the population values
- ind <- order(murders$population)
# Create a data frame my_df with the state name and its rank and ordered from least populous to most
- my_df <- data.frame(state=states[ind], rank=ranks[ind])
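
The re-ordering step can be checked on a small example. This sketch uses hypothetical vectors `states_toy` and `pop_toy` with made-up values in place of the murders columns. After re-ordering by ind, the rank column should read 1, 2, 3.

```r
# Hypothetical stand-ins for murders$state and murders$population
states_toy <- c("AA", "BB", "CC")
pop_toy    <- c(300, 100, 200)

ranks <- rank(pop_toy)   # 3 1 2
ind   <- order(pop_toy)  # 2 3 1

# Index both columns by ind so the rows run from least to most populous
my_df <- data.frame(state = states_toy[ind], rank = ranks[ind],
                    stringsAsFactors = FALSE)
my_df
```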

Question 12

Q

Do the following:

Load the “dslabs” library
- Load the “na_example” dataset
Check the structure of the “na_example” dataset
Find the mean of the na_example dataset
The is.na function returns a logical vector that tells us which entries are NA. Assign the logical vector that is returned by is.na(na_example) to an object called ind.
Determine how many NAs na_example has, using the sum command.
Write one line of code to compute the average, but only for the entries that are not NA, making use of the ! operator before ind.

Answer

A

# Load the dslabs package and the na_example dataset
- library(dslabs)
- data(na_example)
# Checking the structure
- str(na_example)
# Find the mean of the entire dataset (returns NA because some entries are NA)
- mean(na_example)
# Use is.na to create a logical index ind that tells which entries are NA
- ind <- is.na(na_example)
# Determine how many NAs na_example has by summing the logical vector ind
- sum(ind)
# Compute the average, for entries of na_example that are not NA
- mean(na_example[!ind])
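
The same steps work on any numeric vector with missing values. The sketch below uses a hypothetical vector `v` instead of na_example, and also shows the `na.rm = TRUE` argument of mean(), which is an alternative not on the card.

```r
# Hypothetical vector with missing values; not the na_example data
v <- c(2, NA, 4, NA, 6)

ind <- is.na(v)               # FALSE TRUE FALSE TRUE FALSE
n_missing <- sum(ind)         # 2, since TRUE counts as 1

mean(v)                       # NA, because v contains NAs
avg <- mean(v[!ind])          # 4, average of the non-NA entries
avg2 <- mean(v, na.rm = TRUE) # 4, same result without building ind
```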

(12 cards)