Home » Technical Interview Questions » Hashing Interview Questions » Find Top K (or Most Frequent) Numbers in a Stream

Find Top K (or Most Frequent) Numbers in a Stream


Reading Time - 10 mins

In find top k (or most frequent) numbers in a stream problem, we have given an integer array consisting of some numbers. The problem statement says that you have to take an element from the array, and you can only have at most k numbers at the top. We need to print top k elements that are decreasingly sorted based on their frequency.

Example

Input

[3, 1, 4, 6, 1]  k = 4

Output

3, 1, 3, 1, 3, 4, 1, 3, 4, 6, 1, 3, 4, 6

An explanation for top k (or most frequent) numbers in a stream

  • When we read the 1st element of the array that is 3 and till now it is the only element with frequency 1 we print ‘3’.
  • When we read the 2nd element of the array that is 1 and we have 3 and 1with the same frequency i.e, 1 but 1 is smaller than 3 so we print 1 3.
  • In the next step, When we read the 3rd element of the array that is 4 and we have 3, 1, and 4 with the same frequency i.e,1 but sorted order should be 1 3 4.
  • When we read the 4th element of the array that is 6 and we have 3, 1, 4, and 6 with the same frequency i.e,1 but sorted order should be 1 3 4 6.
  • When we read the 5th element of the array that is 1 and we have 3, 1, 4 and 6 all with the same frequency i.e,1 but 1 has frequency greater than 1 so element which has greater frequency should come first so, sorted order should be 1 3 4 6.
READ  Find Number of Employees Under every Employee

Find Top K (or Most Frequent) Numbers in a Stream

Input

[1, 2, 2, 4, 5]  k = 4

Output

1, 1, 2, 2, 1, 2, 1, 4, 2, 1, 4, 5

Explanation for top k (or most frequent) numbers in a stream

  • When we read the 1st element of the array that is 1 and till now it is the only element with frequency 1 we print ‘1’.
  • When we read the 2nd element of the array that is 2 and we have 2 and 1 with the same frequency i.e, 1 but 1 is smaller than 2 so we print 1 2.
  • In the next step, we read the 3rd element of the array that is 2 and we have 2 has a frequency greater than so we print 2 1 as an element with higher frequency comes first.
  • When we read the 4th element of the array that is 4 and we have 4 and 1 with frequency 1, so we print 2 1 4.
  • When we read the 5th element of the array that is 5 and we have 1, 4, and 5 all with the same frequency i.e,1 but 2 has frequency greater than 1 so element which has greater frequency should come first so, sorted order should be 2 1 4 5.

Algorithm

  1. Declare a HashMap.
  2. Declare an array.
  3. Count and store the frequencies of each element into the map.
  4. Put the element into the k+1 position.
  5. Search for the top element and get the position.
  6. Iterate from the position obtained to 0.
  7. Swap if the element of higher frequency is stored beside next to i.
  8. Swap if the frequency is the same but the next element is greater.
  9. Print top k for every iteration of an array.

Explanation for find top k (or most frequent) numbers in a stream

We have given an array and with some number in it. Here, We have to find top k (or most frequent) numbers in a stream, for each iteration. We have given some conditions for it. The thing we should do first is counting each element and its frequency and we will be storing that element with its frequency into a map. For every iteration of the array, we should put the element and its frequency.

READ  Given two unsorted arrays find all pairs whose sum is x

Suppose we have iterated the first element we put into that map with frequency and then copy that array’s element into a temporary array of kth position. We then find out the position of that element in a temporary array and store it to some temporary variable and decrease that temp variable by count 1.

Iterate in the while loop from that position (temporary variable) to 0 and set three conditions in it. If the frequency of that position’s value is less than the next element’s frequency then swap both the elements. If the frequency is found to be similar but the next element is greater than swap it accordingly. Else we need to break the loop.

The basic idea behind that is we need to sort every top k element pair in nondecreasing order, but if any of the element in that pair of top k elements has a frequency greater than any of the other element’s frequency, then we need to put that element first in order. That’s why the number 2 comes first in our order because it repeats itself having a frequency 2, so it was the greatest frequency number in our order.

After swapping and arranging every pair of top k elements we need to print top k (or most frequent) numbers in a stream.

Implementation for find top k (or most frequent) numbers in a stream

C++ Program

#include <bits/stdc++.h>
using namespace std;

void kTop(int arr[], int n, int k)
{
  vector<int> topArray(k + 1);

  unordered_map<int, int> freq;

  for (int l = 0; l < n; l++)
    {
    freq[arr[l]]++;

    topArray[k] = arr[l];
    auto itr = find(topArray.begin(), topArray.end() - 1, arr[l]);

    for (int i = distance(topArray.begin(), itr) - 1; i >= 0; --i)
        {

      if (freq[topArray[i]] < freq[topArray[i + 1]])
        swap(topArray[i], topArray[i + 1]);

      else if ((freq[topArray[i]] == freq[topArray[i + 1]])
          && (topArray[i] > topArray[i + 1]))
        swap(topArray[i], topArray[i + 1]);
      else
        break;
    }
    for (int i = 0; i < k && topArray[i] != 0; ++i)
      cout << topArray[i] << ' ';
  }
  cout << endl;
}
int main()
{
  int k = 4;
  int arr[] = { 5, 2, 1, 3, 2 };
  int n = sizeof(arr) / sizeof(arr[0]);
  kTop(arr, n, k);
  return 0;
}
5 2 5 1 2 5 1 2 3 5 2 1 3 5

Java Program

import java.io.*;
import java.util.*;
class topKElements
{
    public static int findTop(int[] arr, int ele)
    {
        for (int i = 0; i < arr.length; i++)
            if (arr[i] == ele)
                return i;
        return -1;
    }
    public static void printTopElements(int[] a, int n, int k)
    {
        int[] topArray = new int[k + 1];

        HashMap<Integer, Integer> freq = new HashMap<>();
        for (int i = 0; i < k + 1; i++)
            freq.put(i, 0);

        for (int l = 0; l < n; l++)
        {
            if (freq.containsKey(a[l]))
                freq.put(a[l], freq.get(a[l]) + 1);
            else
                freq.put(a[l], 1);

            topArray[k] = a[l];

            int i = findTop(topArray, a[l]);

            i -= 1;

            while (i >= 0)
            {
                if (freq.get(topArray[i]) < freq.get(topArray[i + 1]))
                {
                    int temp = topArray[i];
                    topArray[i] = topArray[i + 1];
                    topArray[i + 1] = temp;
                }
                else if ((freq.get(topArray[i]) == freq.get(topArray[i + 1])) && (topArray[i] > topArray[i + 1]))
                {
                    int temp = topArray[i];
                    topArray[i] = topArray[i + 1];
                    topArray[i + 1] = temp;
                }
                else
                {
                    break;
                }
                i -= 1;
            }
            for (int j = 0; j < k && topArray[j] != 0; j++)
            {
                System.out.print(topArray[j]+" ");
            }
        }
        System.out.println();
    }
    public static void main(String args[])
    {
        int k = 4;
        int[] arr = { 5, 2, 1, 3, 2 };
        int n = arr.length;
        printTopElements(arr, n, k);
    }
}
5 2 5 1 2 5 1 2 3 5 2 1 3 5

Complexity Analysis for find top k (or most frequent) numbers in a stream

Time Complexity

O( n * k )  since in each traversal the temp array of size “k” is traversed “n” is traversed.

READ  K-th Distinct Element in an Array

Space Complexity

O(n) since storing elements in HashMap takes “n” time.

References

Array Interview Questions
Graph Interview Questions
LinkedList Interview Questions
String Interview Questions
Tree Interview Questions