Home » Technical Interview Questions » Algorithm Interview Questions » Find Median from data Stream

# Find Median from data Stream

In Find Median from the data Stream problem, we have given that integers are being read from a data stream. Find the median of all the elements read so far starting from the first integer till the last integer.

## Example

```Input 1: stream[ ] = {3,10,5,20,7,6}
Output : 3    6.5     5     7.5     7     6.5

Input 2: stream[ ] = {20,1,11,19,21,17,6}
Output : 20     10.5     11     15     19     18     17```

Median: it can be defined as the element in the data set which separates the higher half of the data sample from the lower half. When the size of input data is odd, the median of input data is the middle element of sorted input data. When the size of the input data is even, the median is the average middle two elements in sorted input data.

## Types of Solution

1. insertion sort
2. using a heap data structure
3. using ordered multiset data structure(Tree set with multiple same values)

### Insertion Sort

The idea is to keep elements already received from the stream in sorted order. Once we receive a new element from the stream, we find it’s correct place in the sorted order and place the new element at the correct place to find the median of the newly obtained sorted order. This is what we do in the insertion sort algorithm.

#### C++ Program for Find Median from data Stream

```#include <iostream>
#include <bits/stdc++.h>
using namespace std;

// placing the last element recieved at it's correct position in a sorted vector
void insertionSort(vector <int>& sorted)
{
int last_inserted = sorted.size()-1;
while(last_inserted > 0 && sorted[last_inserted] < sorted[last_inserted-1])
{
swap(sorted[last_inserted-1],sorted[last_inserted]);
last_inserted--;
}
}

// prints median out of a given stream of integer values
void printMedian(vector <int> stream)
{
// vector to store values in sorted order
// for printing of the median value
vector <int> sorted;

for(int i=0;i<stream.size();i++)
{
sorted.push_back(stream[i]);

if(sorted.size() == 1)
cout<<sorted<<"\t";

else
{
// sort the sorted vector
insertionSort(sorted);

// if number of elements recieved is odd
// middle element is the median
if(sorted.size()%2 == 1)
{
int mid = sorted.size()/2;

cout<<sorted[mid]<<"\t";
}
// if size is even
// average of middle two elements is the median
else
{
int mid1 = (sorted.size()-1)/2;
int mid2 = sorted.size()/2;

cout<<(float)(sorted[mid1]+sorted[mid2])/2<<"\t";
}
}
}
}

// main function to implement median of stream of integers
int main()
{
vector <int> stream  = {3,10,5,20,7,6};
printMedian(stream);
return 0;
}```
`3	6.5	5	7.5	7	6.5`

#### Java Program for Find Median from data Stream

```import java.io.*;
import java.util.*;

class tutorialcup
{
// placing the last element recieved at it's correct position in a sorted vector
static void insertionSort(ArrayList <Integer> sorted)
{
int last_inserted = sorted.size()-1;
while(last_inserted > 0 && sorted.get(last_inserted) < sorted.get(last_inserted-1))
{
int temp = sorted.get(last_inserted-1);
sorted.set(last_inserted-1,sorted.get(last_inserted));
sorted.set(last_inserted,temp);

last_inserted--;
}
}

// prints median out of a given stream of integer values
static void printMedian(ArrayList <Integer> stream)
{
// vector to store values in sorted order
// for printing of the median value
ArrayList <Integer> sorted = new ArrayList <Integer>();

for(int i=0;i<stream.size();i++)
{

if(sorted.size() == 1)
System.out.print(sorted.get(0)+"\t");
else
{
// sort the sorted vector
insertionSort(sorted);

// if number of elements recieved is odd
// middle element is the median
if(sorted.size() %2 == 1)
{
int mid = sorted.size()/2;
System.out.print(sorted.get(mid)+"\t");
}

// if size is even
// average of middle two elements is the median
else
{
int mid1 = (sorted.size()-1)/2;
int mid2 = sorted.size()/2;

float median = (float)(sorted.get(mid1)+sorted.get(mid2))/2;

System.out.print(median + "\t");
}
}
}
}
// main function to implement median of stream of integers
public static void main (String[] args)
{
ArrayList <Integer> stream = new ArrayList <Integer> (Arrays.asList(3,10,5,20,7,6));
printMedian(stream);
}
}```
`3	6.5	5	7.5	7	6.5`

#### Complexity Analysis

1. Time complexity: T(n) = O(n2), linear sorting(insertion sort) is nested with array traversal.
2. Space Complexity: A(n) = O(1), no space is taken other than storing the stream.

### Using Heap Data structure

The idea is to use max heap and min-heap data structure to store the elements of the lower half and higher half respectively. Using the root values of both the heaps, we are able to calculate the median of the running stream of integers. The algorithm is implemented in the following ways :

#### Algorithm

1. Create two heaps. One max heap (maxheap) to maintain elements of the lower half and one min heap(minheap) to maintain elements of the higher half at any point in time.
2. Initially, the value of median is 0.
3. For the current element received from the stream insert it into either of the heaps and calculate the median described in the following statements.
4. If the size of both the heaps is the same.
• if the current element is greater than the median value, insert it into min heap.return the root of minheap as the new median.
• else if the current element is less than the median value, insert it into max heap.return the root of maxheap as the new median.
5. else If the size of maxheap is greater than minheap :
• if the current element is greater than the median, insert the current element in minheap.
• else if the current element is less than the median, pop the root of maxheap and insert it into minheap. Now insert the current element to maxheap.
• calculate the median as an average of the root of minheap and maxheap.
6. else If the size of maxheap is less than minheap :
• if the current element is less than the median, insert the current element into maxheap.
• else if the current element is greater than the median, pop the top of minheap and insert it into maxheap. Now insert the current element to minheap.
• calculate the median as an average of the root of minheap and maxheap.  #### C++ Program for Find Median from data Stream

```#include <iostream>
#include <bits/stdc++.h>
using namespace std;

// prints median out of a given stream of integer values
void printMedian(vector <int> stream)
{
// initial value of median is 0
float median = 0;
// to store the lower half of sorted stream
priority_queue <int> maxheap;
// to store the upper half of sorted stream
priority_queue <int,vector <int>,greater <int> > minheap;

// process the stream of values
for(int i=0;i<stream.size();i++)
{
int curr = stream[i];

// if size of both the heap is same
if(maxheap.size() == minheap.size())
{
if(curr >= median)
{
minheap.push(curr);
median = minheap.top();
}
else
{
maxheap.push(curr);
median = maxheap.top();
}
}

// if size of heaps are different
// after inserting the element from the stream
// size of both heap becomes equal
// median is average of roots of both the heaps
else
{
if(maxheap.size() > minheap.size())
{
if(curr > median)
minheap.push(curr);
else
{
minheap.push(maxheap.top());
maxheap.pop();
maxheap.push(curr);
}
}

else
{
if(curr < median)
maxheap.push(curr);
else
{
maxheap.push(minheap.top());
minheap.pop();
minheap.push(curr);
}
}

median = (float)(minheap.top()+maxheap.top())/2;
}

cout<<median<<"\t";
}
}

// main function to implement median of stream of integers
int main()
{
vector <int> stream  = {3,10,5,20,7,6};
printMedian(stream);
return 0;
}```
`3	6.5	5	7.5	7	6.5`

#### Java Program for Find Median from data Stream

```import java.io.*;
import java.util.*;

class tutorialcup
{
// prints median out of a given stream of integer values
static void printMedian(ArrayList <Integer> stream)
{
// initial value of median is 0
float median = 0;
// to store the lower half of sorted stream
PriorityQueue <Integer> minheap = new PriorityQueue<Integer>();
// to store the upper half of sorted stream
PriorityQueue <Integer> maxheap = new PriorityQueue<Integer>(Collections.reverseOrder());

// process the stream of values
for(int i=0;i<stream.size();i++)
{
int curr = stream.get(i);

// if size of both the heap is same
if(maxheap.size() == minheap.size())
{
if(curr >= median)
{
median = minheap.peek();
}
else
{
median = maxheap.peek();
}
}

// if size of heaps are different
// after inserting the element from the stream
// size of both heap becomes equal
// median is average of roots of both the heaps
else
{
if(maxheap.size() > minheap.size())
{
if(curr > median)
else
{
}
}

else
{
if(curr < median)
else
{
}
}

median = (float)(minheap.peek()+maxheap.peek())/2;
}

System.out.print(median+"\t");
}
}
// main function to implement median of stream of integers
public static void main (String[] args)
{
ArrayList <Integer> stream = new ArrayList <Integer> (Arrays.asList(3,10,5,20,7,6));
printMedian(stream);
}
}```
`3	6.5	5	7.5	7	6.5`

#### Complexity Analysis

1. Time complexity : T(n) = O(nlogn), for formation of heap of n elements.
2. Space Complexity : A(n) = O(n), stream values are stored in a heap.

### Using Ordered Multiset Data Structure

We use a multiset data structures with two iterator type pointers left and right, as and when we insert an element into the multiset, we modify these pointers to point at the middle element of the sorted stream. This is performed as below :

#### Algorithm

1. creates a multiset variable sorted.
2. create two iterators for sorted multiset left and right.
3. Process each element (our current element) of the stream and insert the element into the sorted.
4. if the size of sorted is 1 (the first element is inserted), point left and right to the first element of sorted.
5. Else
1. if sizeof(sorted) is even, ie left and right point to the same element in the middle.
• if the current element is greater than the element pointed by right, advance right to next element.
• else if the current element is less than element pointed by right, move back left to the previous element.
2. if sizeof(sorted) is odd, ie left and right point to consecutive elements in the middle of the array.
• if the current element is greater than the element pointed by right, advance left to next element.
• else if the current element is less than element pointed by left, move right to the previous element.
• else if the current element is greater than left and less than right, move left forward and right to backward.
6. calculate median at every step by using following : median = (left+right)/2. #### C++ Program for Find Median from data Stream

```#include <iostream>
#include <bits/stdc++.h>
using namespace std;

// prints median out of a given stream of integer values
void printMedian(vector <int> stream)
{
// declare multiset to store sorted order
multiset <int> sorted;
// declare iterators of multiset type
multiset<int>::iterator left,right;
// initialize median
float median = 0;
// process the stream
for(int i=0;i<stream.size();i++)
{
// insert each element into multiset
sorted.insert(stream[i]);

// for first element inserted
// initiate left and right pointers
if(sorted.size() == 1)
{
left = sorted.begin();
right = sorted.begin();
median = *left;
}
// for subsequent elements inserted
else
{
// if size of multiset is even
if(sorted.size()%2 == 0)
{
if(stream[i] >= *right)
right++;
else
left--;
}
// if size of multiset is odd
else
{
if(stream[i] >= *right)
left++;
else if(stream[i] <= *left)
right--;
else
{
left++;
right--;
}
}
}

// median is average of elements pointed by left and right
median = (float)(*left+*right)/2;

cout<<median<<"\t";
}
}

// main function to implement median of stream of integers
int main()
{
vector <int> stream  = {3,10,5,20,7,6};
printMedian(stream);
return 0;
}```
`3	6.5	5	7.5	7	6.5`

#### Java Program for Find Median from data Stream

```import java.io.*;
import java.util.*;
// library to access ordered multiset in java
class tutorialcup
{
// prints median out of a given stream of integer values
static void printMedian(ArrayList <Integer> stream)
{
// declare multiset to store sorted order
TreeMultiset <Integer> sorted = TreeMultiset.create();
// declare iterators of multiset type
Iterator <Integer> left = sorted.iterator();
Iterator <Integer> right = sorted.iterator();
// initialize median
float median = 0;
// process the stream
for(int i=0;i<stream.size();i++)
{
// insert each element into multiset

// for first element inserted
// initiate left and right pointers
if(sorted.size() == 1)
{
//left = sorted.begin();
//right = sorted.begin();
median = left.next();
}
// for subsequent elements inserted
else
{
// if size of multiset is even
if(sorted.size()%2 == 0)
{
if(stream.get(i) >= right.next())
right.hasNext();
else
left.hasPrevious();
}
// if size of multiset is odd
else
{
if(stream.get(i) >= right.next())
left.hasNext();
else if(stream.get(i) <= left.next())
right.hasPrevious();
else
{
left.hasNext();
right.hasPrevious();
}
}
}

// median is average of elements pointed by left and right
median = (float)(left.next()+right.next())/2;
System.out.print(median+"\t");
}
}
// main function to implement median of stream of integers
public static void main (String[] args)
{

ArrayList <Integer> stream = new ArrayList <Integer> (Arrays.asList(3,10,5,20,7,6));
printMedian(stream);
}
}```
`3	6.5	5	7.5	7	6.5`

#### Complexity Analysis

1. Time complexity : T(n) = O(nlogn), for formation of multiset of n elements.
2. Space Complexity : A(n) = O(n), stream values are stored in a set.