r_sorting Flashcards

1
Q

How do you do the following:

  • create a vector composed of (31, 4, 15, 92, 65)
  • put the vector in ascending order
A
  • create vector: x <- c(31, 4, 15, 92, 65)
  • sort: sort(x)
2
Q

How do you get the index order (from smallest value to largest value) of a vector?

A
  • order(x)
  • Example:
    • > x
      [1] 31 4 15 92 65
      > order(x)
      [1] 2 3 1 5 4
    • The “order” function returns the indexes of the elements in the order that sorts the vector from smallest to largest
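One way to see what order returns is to use its result as an index. The sketch below reuses the vector from the example above and shows that indexing x by order(x) gives the same result as sort(x).

```r
# Same vector as in the example above
x <- c(31, 4, 15, 92, 65)

# order() returns the positions that would sort x
o <- order(x)
o                          # 2 3 1 5 4

# Indexing x by those positions gives the sorted vector
x[o]                       # 4 15 31 65 92
identical(x[o], sort(x))   # TRUE
```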
3
Q

How do you do the following:

  • Store the order of the variable “total” from the “murders” data frame within the variable “index”
  • Display the “abb” variable from the “murders” data frame, ordered by “index”
A
  • index <- order(murders$total)
  • murders$abb[index]
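The same pattern can be tried without the dslabs package. The sketch below uses a small data frame named toy, which is a hypothetical stand-in for murders with made-up values (not real data), to show how the index from one column reorders another.

```r
# Hypothetical stand-in for the murders data frame (made-up values)
toy <- data.frame(
  abb   = c("AA", "BB", "CC", "DD"),
  total = c(120, 5, 47, 300),
  stringsAsFactors = FALSE
)

# Positions that sort total from smallest to largest
index <- order(toy$total)
index             # 2 3 1 4

# Abbreviations listed from the smallest total to the largest
toy$abb[index]    # "BB" "CC" "AA" "DD"
```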
4
Q

How do you do the following:

  • Display the largest value within the “total” variable/column from the “murders” data set
  • Store the index of the largest value within the “total” variable/column from the “murders” data set in the variable “i_max”
  • Display the “state” name of the value at the index stored in “i_max”
A
  • max(murders$total)
  • i_max <- which.max(murders$total)
  • murders$state[i_max]
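A point worth knowing about which.max is how it treats ties: it returns only the first position of the maximum. The sketch below illustrates this on a hypothetical data frame toy with made-up state names and totals (not the real murders data).

```r
# Hypothetical stand-in for murders, with a tie for the largest total
toy <- data.frame(
  state = c("Aland", "Borea", "Cesta"),
  total = c(12, 40, 40),
  stringsAsFactors = FALSE
)

max(toy$total)                       # 40

# which.max() returns the first position of the maximum
i_max <- which.max(toy$total)
i_max                                # 2
toy$state[i_max]                     # "Borea"

# which() with a comparison returns every tied position
which(toy$total == max(toy$total))   # 2 3
```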
5
Q

Given the vector (31, 4, 15, 92, 65), answer the following:

  • What is the “sort” of the vector?
  • What is the “order” of the vector?
  • What is the “rank” of the vector?
A
  • original: 31, 4, 15, 92, 65
  • sort: 4, 15, 31, 65, 92
  • order: 2, 3, 1, 5, 4
    • Each entry is the index, in the original (unsorted) vector, of the element that belongs at that position after sorting
  • rank: 3, 1, 2, 5, 4
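The three results can be checked against each other in R. The sketch below uses the vector from this card; the two identities at the end hold here because the vector has no tied values.

```r
x <- c(31, 4, 15, 92, 65)

sort(x)    # 4 15 31 65 92  (the values, smallest to largest)
order(x)   # 2 3 1 5 4      (where each sorted value sits in x)
rank(x)    # 3 1 2 5 4      (the position each value of x takes once sorted)

# With no ties, rank is the order of the order
all(rank(x) == order(order(x)))   # TRUE

# Indexing the sorted vector by rank recovers the original
all(sort(x)[rank(x)] == x)        # TRUE
```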
6
Q

Using the “murders” dataset, do the following:

  1. Use the $ operator to access the population size data and store it in the object pop.
  2. Then use the sort function to redefine pop so that it is sorted.
  3. Finally use the [ operator to report the smallest population size.
A
  1. pop <- murders$population
  2. pop <- sort(pop)
  3. pop[1]
7
Q

Using the “murders” data set, do the following:

  1. Now instead of the smallest population size, let’s find out the row number, in the data frame murders, of the state with the smallest population size.
  2. This time we need to use order() instead of sort().
  3. Remember that the entries in the vector murders$population follow the order of the rows of murders.
A
  1. # Access population from the dataset and store it in pop
    • pop <- murders$population
  2. # Use the order command to get the sorting index of pop and store it in object o
    • o <- order(pop)
  3. # Find the index number of the entry with the smallest population size
    • o[1]
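As a quick check on the steps above, the first entry of the order result should point at the minimum. The sketch below uses a hypothetical pop vector with made-up values rather than murders$population.

```r
# Hypothetical population values (made up for illustration)
pop <- c(5200, 730, 18400, 2950)

# Positions that sort pop from smallest to largest
o <- order(pop)
o[1]                    # 2

# The value at that position is the minimum
pop[o[1]] == min(pop)   # TRUE

# which.min() gives the same position in a single call
which.min(pop)          # 2
```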
8
Q

Using the “murders” data set, do the following:

  1. Write one line of code that gives the index of the lowest population entry. Use the which.min command.
A
  1. which.min(murders$population)
9
Q

Using the “murders” data set, do the following:

  1. Find the index of the smallest state using which.min(murders$population).
  2. Define a variable states to hold the state names from the murders data frame.
  3. Combine these to find the state name for the smallest state
A
  1. # Define the variable i to be the index of the smallest state
    • i <- which.min(murders$population)
  2. # Define variable states to hold the states
    • states <- murders$state
  3. # Use the index you just defined to find the state with the smallest population
    • states[i]
10
Q

Using the “murders” data set, do the following:

  1. Define a variable states to be the state names from murders
  2. Use rank(murders$population) to determine the population size rank (from smallest to biggest) of each state.
    • Save these ranks in an object called ranks.
  3. Create a data frame with state names and their respective ranks. Call the data frame my_df.
A
  1. # Define a variable states to be the state names
    • states <- murders$state
  2. # Define a variable ranks to determine the population size ranks
    • ranks <- rank(murders$population)
  3. # Create a data frame my_df with the state name and its rank
    • my_df <- data.frame(name = states, ranks = ranks)
    • my_df
11
Q

Using the “murders” data set, do the following:

  1. Create variables states and ranks to store the state names and ranks by population size respectively.
  2. Create an object ind that stores the indexes needed to order the population values, using the order command. For example, we could define ind <- order(murders$population).
  3. Create a data frame with both variables following the correct order. Use the bracket operator [ to re-order each column in the data frame. For example, states[ind] orders the state names by population size.
  4. The columns of the data frame must be in the specific order: state, rank.
A
  1. # Define a variable states to be the state names from the murders data frame
    • states <- murders$state
  2. # Define a variable ranks to determine the population size ranks
    • ranks <- rank(murders$population)
  3. # Define a variable ind to store the indexes needed to order the population values
    • ind <- order(murders$population)
  4. # Create a data frame my_df with the state name and its rank and ordered from least populous to most
    • my_df <- data.frame(state=states[ind], rank=ranks[ind])
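The steps above can be tried on a small example. The sketch below assumes a hypothetical data frame toy with made-up state names and populations in place of murders. A useful sanity check is that, once the rows are ordered by population, the rank column runs from 1 up to the number of rows.

```r
# Hypothetical stand-in for murders (made-up values)
toy <- data.frame(
  state      = c("Aland", "Borea", "Cesta", "Dorn"),
  population = c(5200, 730, 18400, 2950),
  stringsAsFactors = FALSE
)

states <- toy$state
ranks  <- rank(toy$population)    # 3 1 4 2
ind    <- order(toy$population)   # 2 4 1 3

# Re-order both columns with the same index
my_df <- data.frame(state = states[ind], rank = ranks[ind],
                    stringsAsFactors = FALSE)
my_df
#   state rank
# 1 Borea    1
# 2  Dorn    2
# 3 Aland    3
# 4 Cesta    4
```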
12
Q

Do the following:

  1. Load the “dslabs” package
    • Load the “na_example” dataset
  2. Check the structure of the “na_example” dataset
  3. Find the mean of the na_example dataset
  4. The is.na function returns a logical vector that tells us which entries are NA. Assign the logical vector that is returned by is.na(na_example) to an object called ind.
  5. Determine how many NAs na_example has, using the sum command.
  6. Write one line of code to compute the average, but only for the entries that are not NA, making use of the ! operator before ind.
A
  1. # Using new dataset
    • library(dslabs)
    • data(na_example)
  2. # Checking the structure
    • str(na_example)
  3. # Find the mean of the entire dataset (this returns NA because the vector contains NAs)
    • mean(na_example)
  4. # Use is.na to create a logical index ind that tells which entries are NA
    • ind <- is.na(na_example)
  5. # Determine how many NAs na_example has by summing ind
    • sum(ind)
  6. # Compute the average, for entries of na_example that are not NA
    • mean(na_example[!ind])
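The same steps can be followed on a short vector where the result is easy to verify by hand. The sketch below uses a hypothetical vector x in place of na_example, and also shows the na.rm argument of mean, which is an alternative way to skip NAs.

```r
# Hypothetical vector with missing values (stands in for na_example)
x <- c(2, NA, 4, NA, 6)

mean(x)                 # NA, because any NA makes the mean NA

# Logical index: TRUE where an entry is NA
ind <- is.na(x)
sum(ind)                # 2 (TRUE counts as 1)

# Average of the non-missing entries only
mean(x[!ind])           # 4

# Same result using the na.rm argument
mean(x, na.rm = TRUE)   # 4
```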