Top K elements problems Flashcards

1
Q

Top ‘K’ Numbers

Given an unsorted array of numbers, find the ‘K’ largest numbers in it.

Note: For a detailed discussion about different approaches to solve this problem, take a look at Kth Smallest Number.

Example 1
Input: [3, 1, 5, 12, 2, 11], K = 3
Output: [5, 12, 11]

Example 2
Input: [5, 12, 11, -1, 12], K = 3
Output: [12, 11, 12]

A

A brute force solution could be to sort the array and return the largest K numbers. The time complexity of such an algorithm will be O(N*logN) as we need to use a sorting algorithm like Timsort if we use Python’s Collection.sort(). Can we do better than that?

The best data structure that comes to mind to keep track of top ‘K’ elements is Heap. Let’s see if we can use a heap to find a better algorithm.

If we iterate through the array one element at a time and keep ‘K’ largest numbers in a heap such that each time we find a larger number than the smallest number in the heap, we do two things:

  1. Take out the smallest number from the heap, and
  2. Insert the larger number into the heap.

This will ensure that we always have ‘K’ largest numbers in the heap. The most efficient way to repeatedly find the smallest number among a set of numbers will be to use a min-heap. As we know, we can find the smallest number in a min-heap in constant time O(1), since the smallest number is always at the root of the heap. Extracting the smallest number from a min-heap will take O(logN) (if the heap has ‘N’ elements) as the heap needs to readjust after the removal of an element.

As discussed above, the time complexity of this algorithm is O(K*logK+(N-K)*logK), which is asymptotically equal to O(N*logK)

The space complexity will be O(K) since we need to store the top ‘K’ numbers in the heap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Kth Smallest Number

Given an unsorted array of numbers, find Kth smallest number in it.

Please note that it is the Kth smallest number in the sorted order, not the Kth distinct element.

Note: For a detailed discussion about different approaches to solve this problem, take a look at Kth Smallest Number.

Example 1
Input: [1, 5, 12, 2, 11, 5], K = 3
Output: 5 Explanation: The 3rd smallest number is ‘5’, as the first two smaller numbers are [1, 2].

Example 2
Input: [1, 5, 12, 2, 11, 5], K = 4
Output: 5
Explanation: The 4th smallest number is ‘5’, as the first three small numbers are [1, 2, 5].

Example 3
Input: [5, 12, 11, -1, 12], K = 3
Output: 11
Explanation: The 3rd smallest number is ‘11’, as the first two small numbers are [5, -1].

A

This problem follows the Top ‘K’ Numbers pattern but has two differences:

  1. Here we need to find the Kth smallest number, whereas in Top ‘K’ Numbers we were dealing with ‘K’ largest numbers.
  2. In this problem, we need to find only one number (Kth smallest) compared to finding all ‘K’ largest numbers.

We can follow the same approach as discussed in the ‘Top K Elements’ problem. To handle the first difference mentioned above, we can use a max-heap instead of a min-heap. As we know, the root is the biggest element in the max heap. So, since we want to keep track of the ‘K’ smallest numbers, we can compare every number with the root while iterating through all numbers, and if it is smaller than the root, we’ll take the root out and insert the smaller number.

The time complexity of this algorithm is O(K*logK+(N-K)*logK), which is asymptotically equal to O(N*logK)

The space complexity will be O(K) because we need to store ‘K’ smallest numbers in the heap.

An Alternate Approach

Alternatively, we can use a Min Heap to find the Kth smallest number. We can insert all the numbers in the min-heap and then extract the top ‘K’ numbers from the heap to find the Kth smallest number. Initializing the min-heap with all numbers will take O(N) and extracting ‘K’ numbers will take O(KlogN). Overall, the time complexity of this algorithm will be O(N+KlogN) and the space complexity will be O(N).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

‘K’ Closest Points to the Origin

Given an array of points in the a 2D plane, find ‘K’ closest points to the origin.

Example 1
Input: points = [[1,2],[1,3]], K = 1
Output: [[1,2]]
Explanation: The Euclidean distance between (1, 2) and the origin is sqrt(5). The Euclidean distance between (1, 3) and the origin is sqrt(10). Since sqrt(5) < sqrt(10), therefore (1, 2) is closer to the origin.

Example 2
Input: point = [[1, 3], [3, 4], [2, -1]], K = 2
Output: [[1, 3], [2, -1]]

A

The Euclidean distance of a point P(x,y) from the origin can be calculated through the following formula:

√​x​2​​+y​2​​​​​

This problem follows the Top ‘K’ Numbers pattern. The only difference in this problem is that we need to find the closest point (to the origin) as compared to finding the largest numbers.

Following a similar approach, we can use a Max Heap to find ‘K’ points closest to the origin. While iterating through all points, if a point (say ‘P’) is closer to the origin than the top point of the max-heap, we will remove that top point from the heap and add ‘P’ to always keep the closest points in the heap.

The time complexity of this algorithm is (N*logK) as we iterating all points and pushing them into the heap.

The space complexity will be O(K) because we need to store ‘K’ point in the heap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Connect Ropes

Given ‘N’ ropes with different lengths, we need to connect these ropes into one big rope with minimum cost. The cost of connecting two ropes is equal to the sum of their lengths.

Example 1
Input: [1, 3, 11, 5]
Output: 33
Explanation: First connect 1+3(=4), then 4+5(=9), and then 9+11(=20). So the total cost is 33 (4+9+20)

Example 2
Input: [3, 4, 5, 6]
Output: 36
Explanation: First connect 3+4(=7), then 5+6(=11), 7+11(=18). Total cost is 36 (7+11+18)

Example 3
Input: [1, 3, 11, 5, 2]
Output: 42
Explanation: First connect 1+2(=3), then 3+3(=6), 6+5(=11), 11+11(=22). Total cost is 42 (3+6+11+22)

A

In this problem, following a greedy approach to connect the smallest ropes first will ensure the lowest cost. We can use a Min Heap to find the smallest ropes following a similar approach as discussed in Kth Smallest Number. Once we connect two ropes, we need to insert the resultant rope back in the Min Heap so that we can connect it with the remaining ropes.

Given ‘N’ ropes, we need O(N*logN) to insert all the ropes in the heap. In each step, while processing the heap, we take out two elements from the heap and insert one. This means we will have a total of ‘N’ steps, having a total time complexity of O(N*logN).

The space complexity will be O(N) because we need to store all the ropes in the heap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Top ‘K’ Frequent Numbers

Given an unsorted array of numbers, find the top ‘K’ frequently occurring numbers in it.

Example 1
Input: [1, 3, 5, 12, 11, 12, 11], K = 2
Output: [12, 11]
Explanation: Both ‘11’ and ‘12’ apeared twice.

Example 2
Input: [5, 12, 11, 3, 11], K = 2
Output: [11, 5] or [11, 12] or [11, 3]
Explanation: Only ‘11’ appeared twice, all other numbers appeared once.

A

This problem follows Top ‘K’ Numbers. The only difference is that in this problem, we need to find the most frequently occurring number compared to finding the largest numbers.

We can follow the same approach as discussed in the Top K Elements problem. However, in this problem, we first need to know the frequency of each number, for which we can use a HashMap. Once we have the frequency map, we can use a Min Heap to find the ‘K’ most frequently occurring number. In the Min Heap, instead of comparing numbers we will compare their frequencies in order to get frequently occurring numbers

The time complexity of the above algorithm is O(N+N*logK).

The space complexity will be O(N). Even though we are storing only ‘K’ numbers in the heap. For the frequency map, however, we need to store all the ‘N’ numbers.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Frequency Sort

Given a string, sort it based on the decreasing frequency of its characters.

Example 1
Input: “Programming”
Output: “rrggmmPiano”
Explanation: ‘r’, ‘g’, and ‘m’ appeared twice, so they need to appear before any other character.

Example 2
Input: “abcbab”
Output: “bbbaac”
Explanation: ‘b’ appeared three times, ‘a’ appeared twice, and ‘c’ appeared only once.

A

This problem follows the Top ‘K’ Elements pattern, and shares similarities with Top ‘K’ Frequent Numbers.

We can follow the same approach as discussed in the Top ‘K’ Frequent Numbers problem. First, we will find the frequencies of all characters, then use a max-heap to find the most occurring characters.

The time complexity of the above algorithm is O(D*logD) where ‘D’ is the number of distinct characters in the input string. This means, in the worst case, when all characters are unique the time complexity of the algorithm will be O(N*logN) where ‘N’ is the total number of characters in the string.

The space complexity will be O(N), as in the worst case, we need to store all the ‘N’ characters in the HashMap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Kth Largest Number in a Stream

Design a class to efficiently find the Kth largest element in a stream of numbers. The class should have the following two things:

  1. The constructor of the class should accept an integer array containing initial numbers from the stream and an integer ‘K’.
  2. The class should expose a function add(int num) which will store the given number and return the Kth largest number.

Example
Input: [3, 1, 5, 12, 2, 11], K = 4
1. Calling add(6) should return ‘5’.
2. Calling add(13) should return ‘6’.
2. Calling add(4) should still return ‘6’.

A

This problem follows the Top ‘K’ Elements pattern and shares similarities with Kth Smallest number.

We can follow the same approach as discussed in the ‘Kth Smallest number’ problem. However, we will use a Min Heap (instead of a Max Heap) as we need to find the Kth largest number.

The time complexity of the add() function will be O(logK) since we are inserting the new number in the heap.

The space complexity will be O(K) for storing numbers in the heap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

‘K’ Closest Numbers

Given a sorted number array and two integers ‘K’ and ‘X’, find ‘K’ closest numbers to ‘X’ in the array. Return the numbers in the sorted order.

Example 1
Input: [5, 6, 7, 8, 9], K = 3, X = 7
Output: [6, 7, 8]

Example 2
Input: [2, 4, 5, 6, 9], K = 3, X = 6
Output: [4, 5, 6]

Example 3
Input: [2, 4, 5, 6, 9], K = 3, X = 10
Output: [5, 6, 9]

A

This problem follows the Top ‘K’ Numbers pattern. The biggest difference in this problem is that we need to find the closest (to ‘X’) numbers compared to finding the overall largest numbers. Another difference is that the given array is sorted.

Utilizing a similar approach, we can find the numbers closest to ‘X’ through the following algorithm:

  1. Since the array is sorted, we can first find the number closest to ‘X’ through Binary Search. Let’s say that number is ‘Y’.
  2. The ‘K’ closest numbers to ‘Y’ will be adjacent to ‘Y’ in the array. We can search in both directions of ‘Y’ to find the closest numbers.
  3. We can use a heap to efficiently search for the closest numbers. We will take ‘K’ numbers in both directions of ‘Y’ and push them in a Min Heap sorted by their absolute difference from ‘X’. This will ensure that the numbers with the smallest difference from ‘X’ (i.e., closest to ‘X’) can be extracted easily from the Min Heap.
  4. Finally, we will extract the top ‘K’ numbers from the Min Heap to find the required numbers.

The time complexity of the above algorithm is O(logN + K*logK). We need O(logN) for Binary Search and O(K*logK) to insert the numbers in the Min Heap, as well as to sort the output array.

The space complexity will be O(K), as we need to put a maximum of 2K numbers in the heap.

The time complexity of the above algorithm is O(logN + K). We need O(logN) for Binary Search and O(K) for finding the ‘K’ closest numbers using the two pointers.

If we ignoring the space required for the output list, the algorithm runs in constant space O(1).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Maximum Distinct Elements

Given an array of numbers and a number ‘K’, we need to remove ‘K’ numbers from the array such that we are left with maximum distinct numbers.

Example 1
Input: [7, 3, 5, 8, 5, 3, 3], and K=2
Output: 3
Explanation: We can remove two occurrences of 3 to be left with 3 distinct numbers [7, 3, 8], we have to skip 5 because it is not distinct and occurred twice. Another solution could be to remove one instance of ‘5’ and ‘3’ each to be left with three distinct numbers [7, 5, 8], in this case, we have to skip 3 because it occurred twice.

Example 2
Input: [3, 5, 12, 11, 12], and K=3
Output: 2
Explanation: We can remove one occurrence of 12, after which all numbers will become distinct. Then we can delete any two numbers which will leave us 2 distinct numbers in the result.

Example 3
Input: [1, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5], and K=2
Output: 3
Explanation: We can remove one occurrence of ‘4’ to get three distinct numbers.

A

This problem follows the Top ‘K’ Numbers pattern, and shares similarities with Top ‘K’ Frequent Numbers.

We can following a similar approach as discussed in Top ‘K’ Frequent Numbers problem:

  1. First, we will find the frequencies of all the numbers.
  2. Then, push all numbers that are not distinct (i.e., have a frequency higher than one) in a Min Heap based on their frequencies. At the same time, we will keep a running count of all the distinct numbers.
  3. Following a greedy approach, in a stepwise fashion, we will remove the least frequent number from the heap (i.e., the top element of the min-heap), and try to make it distinct. We will see if we can remove all occurrences of a number except one. If we can, we will increment our running count of distinct numbers. We have to also keep a count of how many removals we have done.
  4. If after removing elements from the heap, we are still left with some deletions, we have to remove some distinct elements.

Since we will insert all numbers in a HashMap and a Min Heap, this will take O(N*logN) where ‘N’ is the total input numbers. While extracting numbers from the heap, in the worst case, we will need to take out ‘K’ numbers. This will happen when we have at least ‘K’ numbers with a frequency of two. Since the heap can have a maximum of ‘N/2’ numbers, therefore, extracting an element from the heap will take O(logN) and extracting ‘K’ numbers will take O(KlogN). So overall, the time complexity of our algorithm will be O(N*logN + KlogN).

We can optimize the above algorithm and only push ‘K’ elements in the heap, as in the worst case we will be extracting ‘K’ elements from the heap. This optimization will reduce the overall time complexity to O(N*logK + KlogK).

The space complexity will be O(N) as, in the worst case, we need to store all the ‘N’ characters in the HashMap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Sum of Elements

Given an array, find the sum of all numbers between the K1’th and K2’th smallest elements of that array.

Example 1
Input: [1, 3, 12, 5, 15, 11], and K1=3, K2=6
Output: 23
Explanation: The 3rd smallest number is 5 and 6th smallest number 15. The sum of numbers coming between 5 and 15 is 23 (11+12).

Example 2
Input: [3, 5, 8, 7], and K1=1, K2=4
Output: 12
Explanation: The sum of the numbers between the 1st smallest number (3) and the 4th smallest number (8) is 12 (5+7).

A

This problem follows the Top ‘K’ Numbers pattern, and shares similarities with Kth Smallest Number.

We can find the sum of all numbers coming between the K1’th and K2’th smallest numbers in the following steps:

  1. First, insert all numbers in a min-heap.
  2. Remove the first K1 smallest numbers from the min-heap.
  3. Now take the next K2-K1-1 numbers out of the heap and add them. This sum will be our required output.

Since we need to put all the numbers in a min-heap, the time complexity of the above algorithm will be O(N*logN) where ‘N’ is the total input numbers.

The space complexity will be O(N), as we need to store all the ‘N’ numbers in the heap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Rearrange String

Given a string, find if its letters can be rearranged in such a way that no two same characters come next to each other.

Example 1
Input: “aappp”
Output: “papap”
Explanation: In “papap”, none of the repeating characters come next to each other.

Example 2
Input: “Programming”
Output: “rgmrgmPiano” or “gmringmrPoa” or “gmrPagimnor”, etc.
Explanation: None of the repeating characters come next to each other.

Example 3
Input: “aapa”
Output: “”
Explanation: In all arrangements of “aapa”, atleast two ‘a’ will come together e.g., “apaa”, “paaa”.

A

This problem follows the Top ‘K’ Numbers pattern. We can follow a greedy approach to find an arrangement of the given string where no two same characters come next to each other.

We can work in a stepwise fashion to create a string with all characters from the input string. Following a greedy approach, we should first append the most frequent characters to the output strings, for which we can use a Max Heap. By appending the most frequent character first, we have the best chance to find a string where no two same characters come next to each other.

So in each step, we should append one occurrence of the highest frequency character to the output string. We will not put this character back in the heap to ensure that no two same characters are adjacent to each other. In the next step, we should process the next most frequent character from the heap in the same way and then, at the end of this step, insert the character from the previous step back to the heap after decrementing its frequency.

Following this algorithm, if we can append all the characters from the input string to the output string, we would have successfully found an arrangement of the given string where no two same characters appeared adjacent to each other.

The time complexity of the above algorithm is O(N*logN) where ‘N’ is the number of characters in the input string.

The space complexity will be O(N), as in the worst case, we need to store all the ‘N’ characters in the HashMap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Rearrange String K Distance Apart

Given a string and a number ‘K’, find if the string can be rearranged such that the same characters are at least ‘K’ distance apart from each other.

Example 1
Input: “mmpp”, K=2
Output: “mpmp” or “pmpm”
Explanation: All same characters are 2 distance apart.

Example 2
Input: “Programming”, K=3
Output: “rgmPrgmiano” or “gmringmrPoa”
Explanation: All same characters are 3 distance apart.

Example 3
Input: “aab”, K=2
Output: “aba”
Explanation: All same characters are 2 distance apart.

Example 4
Input: “aappa”, K=3
Output: “”
Explanation: We cannot find an arrangement of the string where any two ‘a’ are 3 distance apart.

A

This problem follows the Top ‘K’ Numbers pattern and is quite similar to Rearrange String. The only difference is that in the ‘Rearrange String’ the same characters need not be adjacent i.e., they should be at least ‘2’ distance apart (in other words, there should be at least one character between two same characters), while in the current problem, the same characters should be ‘K’ distance apart.

Following a similar approach, since we were inserting a character back in the heap in the next iteration, in this problem, we will re-insert the character after ‘K’ iterations. We can keep track of previous characters in a queue to insert them back in the heap after ‘K’ iterations.

The time complexity of the above algorithm is O(N*logN) where ‘N’ is the number of characters in the input string.

The space complexity will be O(N), as in the worst case, we need to store all the ‘N’ characters in the HashMap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Scheduling Tasks

You are given a list of tasks that need to be run, in any order, on a server. Each task will take one CPU interval to execute but once a task has finished, it has a cooling period during which it can’t be run again. If the cooling period for all tasks is ‘K’ intervals, find the minimum number of CPU intervals that the server needs to finish all tasks.

If at any time the server can’t execute any task then it must stay idle.

Example 1
Input: [a, a, a, b, c, c], K=2
Output: 7
Explanation: a -> c -> b -> a -> c -> idle -> a

Example 2
Input: [a, b, a], K=3
Output: 5
Explanation: a -> b -> idle -> idle -> a

A

This problem follows the Top ‘K’ Elements pattern and is quite similar to Rearrange String K Distance Apart. We need to rearrange tasks such that same tasks are ‘K’ distance apart.

Following a similar approach, we will use a Max Heap to execute the highest frequency task first. After executing a task we decrease its frequency and put it in a waiting list. In each iteration, we will try to execute as many as k+1 tasks. For the next iteration, we will put all the waiting tasks back in the Max Heap. If, for any iteration, we are not able to execute k+1 tasks, the CPU has to remain idle for the remaining time in the next iteration.

The time complexity of the above algorithm is O(N*logN) where ‘N’ is the number of tasks. Our while loop will iterate once for each occurrence of the task in the input (i.e. ‘N’) and in each iteration we will remove a task from the heap which will take O(logN) time. Hence the overall time complexity of our algorithm is O(N*logN).

The space complexity will be O(N), as in the worst case, we need to store all the ‘N’ tasks in the HashMap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Frequency Stack

Design a class that simulates a Stack data structure, implementing the following two operations:

  • push(int num): Pushes the number ‘num’ on the stack.
  • pop(): Returns the most frequent number in the stack. If there is a tie, return the number which was pushed later.

Example
After following push operations: push(1), push(2), push(3), push(2), push(1), push(2), push(5)

  1. pop() should return 2, as it is the most frequent number
  2. Next pop() should return 1
  3. Next pop() should return 2
A

This problem follows the Top ‘K’ Elements pattern, and shares similarities with Top ‘K’ Frequent Numbers.

We can use a Max Heap to store the numbers. Instead of comparing the numbers we will compare their frequencies so that the root of the heap is always the most frequently occurring number. There are two issues that need to be resolved though:

  1. How can we keep track of the frequencies of numbers in the heap? When we are pushing a new number to the Max Heap, we don’t know how many times the number has already appeared in the Max Heap. To resolve this, we will maintain a HashMap to store the current frequency of each number. Thus whenever we push a new number in the heap, we will increment its frequency in the HashMap and when we pop, we will decrement its frequency.
  2. If two numbers have the same frequency, we will need to return the number which was pushed later while popping. To resolve this, we need to attach a sequence number to every number to know which number came first.

In short, we will keep three things with every number that we push to the heap:

  1. number // value of the number 2. frequency // current frequency of the number when it was pushed to the heap 3. sequenceNumber // a sequence number, to know what number came first

The time complexity of push() and pop() is O(logN) where ‘N’ is the current number of elements in the heap.

We will need O(N) space for the heap and the map, so the overall space complexity of the algorithm is O(N).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly