Home » Technical Interview Questions » String Interview Questions » Regular Expression Matching

Regular Expression Matching


Reading Time - 5 mins

In the Regular Expression Matching problem we have given two strings one (let’s assume it x) consists of only lower case alphabets and second (let’s assume it y) consists of lower case alphabets with two special characters i.e, “.” and “*”. The task is to find whether the second string matches the first string or not. If it matches then print “true” otherwise print “false”.

The two characters used here means:

“.” – pattern matches any single character i.e, “.” can be replaced with any character required to match the first string

“*” – pattern matches 0 or more occurrence of character before “*” i.e, “*” can be replaced by any the preceding character zero or more times as required to match the first string

Note:

  • x could be empty or can contain only lowercase letters.
  • y could also be empty or can contain only lowercase letters or only characters like “.” or “*” or both (i.e., lowercase letters and characters like “.” and “*”)

Example

Input

x = “aa”

y = “a”

Output

false

Explanation

As y does not match the entire string i.e., “a” (y) does not match the string “aa” (x).

Input

x = “aa”

y = “a*”

Output

true

Explanation

As “*” means repetition of the preceding letter zero or more times and removing the special character. So by repeating “a” ones in the string “a*” it will become “aa” and both the string will become the same.

READ  Zigzag Conversion

Input

x = “acb”

y = “a.b”

Output

true

Explanation

Here we can replace “.” by any character so replacing “.” in “a.b” by “c” we will get our desired string i.e., acb, and both the string will become true.

Input

x = “ab”

y = “.*”

Output

true

Explanation

“.*” means “zero or more (*) of any character (.)”.

Algorithm

There are many solutions to this problem but the most efficient way is to solve it using dynamic programming.

  1. Get the length of both “x” and “y” in some variable say “m” and “n” respectively.
  2. check if “m” equals to “0” then return print false.
  3. Initialize a 2d array(say dp) with the length of “x” as row and “y” as the column.
  4. Mark the array as false.
  5. Initialize “0,0” index of array as true
  6. Run a for loop till “m” from 1
    1. Inside for loop check, if y[i-1] == ‘*’
      1. If yes then dp[0][i] = dp[0][j-1]
  7. Initialise a nested for loop: One with 1 up to “m” and second with 1 to “n”
    1. In for loop check, if y[j-1]  == ‘*’
      1. if yes then dp[i][j] = dp[i][j – 1] || dp[i – 1][j]
    2. else check if y[j – 1] == ‘.’ || x[i – 1] == y[j – 1]
      1. if yes then dp[i][j] = dp[i – 1][j – 1]
    3. else mark dp[i][j] as false i.e., dp[i][j] == false
  8. Print the result of dp[n][m] i.e., last index of dp

 Implementation in C++ for Regular Expression Matching

#include <bits/stdc++.h>
using namespace std;

bool isMatch(char x[], char y[])
{
    int n = strlen(x);
    int m = strlen(y);

    if (m == 0)
        return (n == 0);

    bool dp[n + 1][m + 1];

    memset(dp, false, sizeof(dp));

    dp[0][0] = true;

    for (int i = 1; i <= m; i++)
    {
        if (y[i - 1] == '*')
        {
            dp[0][i] = dp[0][i - 1];
        }
    }

    for (int i = 1; i <= n; i++)
    {
        for (int j = 1; j <= m; j++)
        {
            if (y[j - 1] == '*')
            {
                dp[i][j] = dp[i][j - 1] || dp[i - 1][j];
            }

            else if (y[j - 1] == '.' || x[i - 1] == y[j - 1])
            {
                dp[i][j] = dp[i - 1][j - 1];
            }

            else
            {
                dp[i][j] = false;
            }
        }
    }

    return dp[n][m];
}

int main()
{
    char x[] = "acb";
    char y[] = "a.b";

    cout<< boolalpha<<(isMatch(x, y));
    return 0;
}
true

Implementation in Java for Regular Expression Matching

import java.util.Arrays;
public class solution {

  static boolean isMatch(String x, String y) {
    int n = x.length();
    int m = y.length();

    if (m == 0)
      return (n == 0);

    boolean[][] dp = new boolean[n + 1][m + 1];

    for (int i = 0; i<n + 1; i++) {
      Arrays.fill(dp[i], false);
    }

    dp[0][0] = true;

    for (int j = 1; j<= m; j++) {
      if (y.charAt(j - 1) == '*') {
        dp[0][j] = dp[0][j - 1];
      }
    }

    for (int i = 1; i<= n; i++) {
      for (int j = 1; j<= m; j++) {

        if (y.charAt(j - 1) == '*') {
          dp[i][j] = dp[i][j - 1] || dp[i - 1][j];
        } else if (y.charAt(j - 1) == '.' || x.charAt(i - 1) == y.charAt(j - 1)) {
          dp[i][j] = dp[i - 1][j - 1];
        } else {
          dp[i][j] = false;
        }
      }
    }

    return dp[n][m];
  }

  public static void main(String args[]) {
    String x = "acb";
    String y = "a.b";

    System.out.println((isMatch(x, y)));

  }
}
true

Complexity Analysis for Regular Expression Matching

Time Complexity

O(m×n) where m is the length of “x” and n is the length of “y”

READ  Longest Common Prefix using Trie

Space Complexity

O(m×n) where m is the length of “x” and n is the length of “y”

References

Array Interview Questions
Graph Interview Questions
LinkedList Interview Questions
String Interview Questions
Tree Interview Questions