Home » Technical Interview Questions » String Interview Questions » Repeated Substring Pattern

Repeated Substring Pattern


Reading Time - 6 mins

In repeated substring patterns we have given a string check if it can be constructed by taking a substring of itself and appending multiple copies of the substring together.

Example

Input 1: str = “abcabcabc”
Output: True
Explanation: “abcabcabc” can be formed by repeatedly appending “abc” to an empty string.

Input 2: str = “xyxy”
Output: True
Explanation: “xyxy” can be formed by repeatedly appending “xy” to an empty string.

Input 3: str = “xyzxy”
Output: False
Explanation: str has no such substring that can be repeatedly appended to an empty string to form str. 

Input 4: str = “Tutorialcup”
Output: False

Types of solution for repeated substring pattern

  1. KMP algorithm
  2. substring search

KMP algorithm

Approach for repeated substring pattern

We use the KMP algorithm to find the longest prefix lps which is also the suffix of the given string. If the length of the input string is some multiple lps and last character of the string matches with the last character of the lps string. Then we have a substring (which is lps itself) of the input string which when repeated (or appended) a certain number of times gives the input string itself.

We will construct an array dp[ ], where dp[i+1] stores length of the longest prefix which is also a suffix up to index i. After which we will check using dp[ ] if there exists any such substring of the input string.

To calculate dp[i], we are using values from dp[i-1 … 0], so this is a dynamic programming approach.

Algorithm

  1. Construct an array dp[ ] of length = n+1, where n = string length. dp[i+1] denotes the length of the longest proper prefix of the string which is also a suffix up to the index = i.
  2. initiate two pointers j = 0 and i = 1, j is used to track the last character of the longest proper prefix, LPP (also a suffix), i is used to tracking the current character during the traversal.
  3. Begin traversing the string from left to right using i as the loop variable.
  4. if str[i] == str[j] => dp[++i] == ++j, if last character of LPP and current character match, simply increase the length of LPP by 1.
  5. Else if the characters don’t match:
    1. if length of LPP is 0 so far, then even first (str[0]) and current (str[i]) character don’t match => dp[i] = 0.
    2. if length of LPP is not 0, then simply recur => j = dp[j].
  6. After the traversal is over, return if the input string length is divisible by LPP length.
READ  Substring With Concatenation Of All Words

repeated substring pattern repeated substring pattern

Implementation

C++ Program

#include <iostream>
#include <bits/stdc++.h>
using namespace std;

/* function that checks if the input string can be generated 
by repeatedly adding a substring of the input string */
bool hasRepeatedSubstring(string str)
{
    int i = 1, j = 0, n = str.size();
    
    /* dp[i+1] stores longest proper prefix 
    upto index i which also a suffix */ 
    vector <int> dp(n+1,0);
    
    /* Traverse the string */
    while(i < n)
    {
        /* if the current character (at index = i) is same as the last 
        character of longest common prefix obtained upto index i-1 */
        if(str[i] == str[j]) 
        dp[++i] = ++j;
        
        /* if characters don't match */
        else 
        {   
            /* when str[0] and str[i] don't match no proper 
            prefix which is also a suffix is possible 
            so length is 0, simply move on to next character*/
            if(j == 0) 
            i++;
            
            /* decrease the length (by 1) of longest proper 
            prefix (also suffix) possible upto index i-1 
            and then match the last character of longest 
            proper prefix to character at current index */ 
            else 
            j = dp[j];
        }
    }
    
    /* check if there is any such prefix of 
    input string that has length that 
    completely divides the input string length */
    return dp[n] && (dp[n]%(n-dp[n]) == 0);
}

int main()
{   
    /* input string */
    string str = "abcabcabc";
    
    if(hasRepeatedSubstring(str))
    cout<<"Formed by repeating substring";
    else
    cout<<"Cannot be formed by repeated substring";
    
    return 0;
}
Formed by repeating substring

Java Program

import java.util.*;
import java.io.*;

class TutorialCup
{
    /* function that checks if the input string can be generated 
    by repeatedly adding a substring of the input string */
    static boolean hasRepeatedSubstring(String str)
    {
        int i = 1, j = 0, n = str.length();
    
        /* dp[i+1] stores longest proper prefix 
        upto index i which also a suffix */ 
        int [] dp = new int[n+1];
        
        /* Traverse the string */
        while(i < n)
        {
            /* if the current character (at index = i) is same as the last 
            character of longest common prefix obtained upto index i-1 */
            if(str.charAt(i) == str.charAt(j)) 
            dp[++i] = ++j;
            
            /* if characters don't match */
            else 
            {   
                /* when str[0] and str[i] don't match no proper 
                prefix which is also a suffix is possible 
                so length is 0, simply move on to next character*/
                if(j == 0) 
                i++;
                
                /* decrease the length (by 1) of longest proper 
                prefix (also suffix) possible upto index i-1 
                and then match the last character of longest 
                proper prefix to character at current index */ 
                else 
                j = dp[j];
            }
        }
        
        /* check if there is any such prefix of 
        input string that has length that 
        completely divides the input string length */
        return dp[n] != 0 && (dp[n]%(n-dp[n]) == 0);
    }
    
    public static void main (String[] args)
    {   
        /* input string */
        String str = "abcabcabc";
        
        if(hasRepeatedSubstring(str))
        System.out.print("Formed by repeating substring");
        else
        System.out.print("Cannot be formed by repeated substring");
    }    
}
Formed by repeating substring

Complexity Analysis for repeated substring pattern

  1. Time Complexity: T(n) = O(n), single traversal of the input string is done.
  2. Space Complexity: A(n) = O(n), for the dp[ ] array used.
READ  Top K Frequent Words

Substring Search

Approach for repeated substring pattern

If the original string has a repeating substring, the repeating substring can be no larger than 1/2 the length of the original string. Ex: “abcabc” would be “abc”.
By doubling the input string and removing the first and last character, i.e. “abcabcabcabc” => “bcabcabcab”, if the original string “abcabc” can be found in “bcabcabcab”, it means that “abcabc” is made up by repeating one of its a substring.

Algorithm

  1. Construct a string pattern adding the input string str twice.
  2. Remove the first and last characters in the pattern.
  3. Find for str in the pattern, if the match is found return True else return False.

repeated substring pattern

Implementation

C++ Program

#include <iostream>
#include <bits/stdc++.h>
using namespace std;

/* function that checks if the input string can be generated 
by repeatedly adding a substring of the input string */
bool hasRepeatedSubstring(string str)
{
    string pattern = str.substr(1) + str.substr(0,str.length()-1);
    return pattern.find(str) != string::npos;
}

int main()
{   
    /* input string */
    string str = "abcabcabc";
    
    if(hasRepeatedSubstring(str))
    cout<<"Formed by repeating substring";
    else
    cout<<"Cannot be formed by repeated substring";
    
    return 0;
}
Formed by repeating substring

Java Program

import java.util.*;
import java.io.*;

class TutorialCup
{
    /* function that checks if the input string can be generated 
    by repeatedly adding a substring of the input string */
    static boolean hasRepeatedSubstring(String str)
    {
        int n = str.length();
        String pattern = str.substring(1) + str.substring(0,n-1);
        return pattern.contains(str);
    }
    
    public static void main (String[] args)
    {   
        /* input string */
        String str = "abcabcabc";
        
        if(hasRepeatedSubstring(str))
        System.out.print("Formed by repeating substring");
        else
        System.out.print("Cannot be formed by repeated substring");
    }    
}
Formed by repeating substring

Complexity Analysis for repeated substring pattern

  1. Time complexity: T(n) = O(n), for sub string search.
  2. Space Complexity: A(n) = O(n), for generating the pattern string out of input string.
READ  Find the Closest Palindrome number

References

Array Interview Questions
Graph Interview Questions
LinkedList Interview Questions
String Interview Questions
Tree Interview Questions