Edit Distance  

Difficulty Level Hard
Frequently asked in Amazon ByteDance Facebook Google Microsoft Palantir Technologies Square
Dynamic Programming String

In the edit distance problem we have to find the minimum number of operations required to convert a string X of length n to another string Y of length m.

Operations allowed:

  1. Insertion
  2. Deletion
  3. Substitution

Edit DistancePin

Example  

Input:

Please click Like if you loved this article?

String1 = “abcd”

String2 = “abe”

Output:

Minimum operations required is 2 ( one deletion and one substitution )

We can solve this question by solving it’s sub-problem i.e., find the minimum number of operations required to convert a string X[0,…,i] to Y[0,…j].

Steps  

  1. If the last character of X and Y are the same, then find minimum operations required to convert string X[0,…,n-1] to Y[0,…,m-1].
  2. If the last character of X and Y are not the same, then we can do one of the three operations:
    • Insertion at last index of X.
    • Deletion of the last index of X.
    • Substitution of the last index of X.

Using the above steps, let’s try to draw the recursion tree for this example:

string X= “ab” and Y = “xy” Therefore n=2 and m=2 ed(i,j) denotes minimum operations required to obtained a string Y[1,…,j] from X[1,…,i]. ( 1-based index).

As none of the characters in X is same as Y, hence at every iteration, we will call recursion three times by applying one of the three allowed operations.

See also
Next Permutation

Edit DistancePin

As we can see there are many overlapping sub-problems in the above recursion due to which the worst case complexity of this procedure is O(3^n).

This can be easily reduced using Dynamic Programming which will resolve the issue of re-computation of overlapping sub-problems.

Dynamic Programming Approach  

Base conditions

  1. To convert a string of length n to the given string of length 0, we require n deletion operations.
  2. To convert a string of length 0 to the given string of length n, we require n insertion operations.

At every position we may come across any of these two situations

  1. If the last character of both strings is the same. In this case, we don’t need to do any operation, we can simply say that the answer to this problem will be the same as for the sub-problem editDistance(X[0,…,i-1] , Y[0,….,j-1]).
  2. If the last character is not the same, we have to choose one of these operations:
    • Insertion: In this case, we will insert Y[j] character at ith position and add 1 to the answer of editDistance(X[0,….i], Y[0,….,j-1]). By doing this we ensure that now the last character of both strings is the same.
    • Deletion: In this case, we will delete the ith character of X and add 1 to the answer of editDistance(X[0,….,i-1] , Y[0,…,j]). By doing this we are leaving the last character of X string i.e., deletion is performed.
    • Substitution: In this case we will substitute X[i] by Y[j] and add 1 to the answer of editDistance(X[0,…,i-1] , Y[0,…,j-1]). This becomes the same case as having equal last character in both strings.
See also
Scramble String

As we require the minimum number of operations hence, we will take a minimum of the three possible operations.

C++ Program  

#include<bits/stdc++.h>
using namespace std;
int min(int a,int b,int c){
  return min(a,min(b,c));
}
int editDistance(string X,string Y){
  int n=X.length();
  int m=Y.length();
  int ed[n+1][m+1];

  // to convert a string into null string we need to perform 
  // deletion operation n number of times where n is length of the string
  for(int i=0;i<=n;i++){
    ed[i][0]=i;
  }

  // to convert a null string into the given string we need to perform 
  // insertion operation n number of times where n is length of the string
  for(int i=0;i<=m;i++){
    ed[0][i]=i;
  }

  for(int i=1;i<=n;i++){
    for(int j=1;j<=m;j++){
      if(X[i-1]==Y[j-1]){
        // no operation required
        ed[i][j] = ed[i-1][j-1];
      }
      else{
        // one of the three operation required
        ed[i][j] = 1+ min( ed[i-1][j],   //deletion
                                   ed[i][j-1],   //insertion
                                   ed[i-1][j-1] //substitution
                       );
      }
    }
  }
  return ed[n][m];
}

int main()
{
  string X,Y;
    cin>>X>>Y;
  int result = editDistance(X,Y);
    cout<<"Minimum operations required to convert string "<<X<<" into string "<<Y<<" is: ";
  cout<<result<<"\n";
  return 0;
}
programming
problem
Minimum operations required to convert string programming into string problem is: 7

JAVA Program  

import java.util.Scanner;
class Main { 
    static int min(int x, int y, int z) 
    { 
        return Math.min(x,Math.min(y,z));
    }
    static int editDistance(String X, String Y) 
    { 
        int n=X.length();
    	int m=Y.length();
    	int ed[][] = new int[n + 1][m + 1]; 
    
    	// to convert a string into null string we need to perform 
    	// deletion operation n number of times where n is length of the string
    	for(int i=0;i<=n;i++){
    		ed[i][0]=i;
    	}
    
    	// to convert a null string into the given string we need to perform 
    	// insertion operation n number of times where n is length of the string
    	for(int i=0;i<=m;i++){
    		ed[0][i]=i;
    	}
    
    	for(int i=1;i<=n;i++){
    		for(int j=1;j<=m;j++){
    			if(X.charAt(i - 1)==Y.charAt(j - 1)){
    				// no operation required
    				ed[i][j] = ed[i-1][j-1];
    			}
    			else{
    				// one of the three operation required
    				ed[i][j] = 1+ min( ed[i-1][j],     //deletion
                                       ed[i][j-1],    //insertion
                                       ed[i-1][j-1]  //substitution
    					             );
    			}
    		}
    	}
    	return ed[n][m];
    } 
  
    public static void main(String args[]) 
    { 
        String X ,Y;
        Scanner sc = new Scanner(System.in);
        X = sc.nextLine();
        Y = sc.nextLine();
        int result = editDistance(X,Y);
        System.out.println("Minimum operations required to convert string "+X+" into string "+Y+" is: "+result); 
    } 
}
codingAndChill
codeAndGrow
Minimum operations required to convert string codingAndChill into string codeAndGrow is: 8

Complexity Analysis  

Time complexity: O(n*m)

See also
Queries for counts of array elements with values in given range

where n = length of 1st string

m = length of 2nd string

Please click Like if you loved this article?

as we are filling a 2D matrix( ed ) using two nested loops.

References

Please click Like if you loved this article?