Longest Duplicate Substring in C++



Suppose we have a string S. Consider all duplicated contiguous substrings, that is, substrings that occur 2 or more times (the occurrences may overlap). We have to find a duplicated substring that has the longest possible length. If there is no such substring, return an empty string. Since comparing every pair of substrings directly is too slow, the solution compares rolling hash values, computed modulo 10^9 + 7 so that they stay within integer range.

So, if the input is like "ababbaba", then the output will be "bab". ("aba" also occurs twice and has the same length, so it is an equally valid answer.)

To solve this, we will follow these steps −

  • m := 1e9 + 7

  • Define a function add(), this will take a, b,

    • return ((a mod m) + (b mod m)) mod m

  • Define a function sub(), this will take a, b,

    • return ((a mod m) - (b mod m) + m) mod m

  • Define a function mul(), this will take a, b,

    • return ((a mod m) * (b mod m)) mod m

  • Define an array power

  • Define a function ok(), this will take x, s,

    • if x is same as 0, then −

      • return empty string

    • Define one map called hash (hash value to list of start indices)

    • current := 0

    • for initialize i := 0, when i < x, update (increase i by 1), do −

      • current := add(mul(current, 26), s[i] - 'a')

    • hash[current] := an array holding the single start index 0

    • n := size of s

    • for initialize i := x, when i < n, update (increase i by 1), do −

      • current := sub(current, mul(power[x - 1], s[i - x] - 'a'))

      • current := add(mul(current, 26), s[i] - 'a')

      • if current is member of hash, then −

        • for all it in hash[current] −

          • if substring of s of length x starting at it is same as substring of s of length x starting at i - x + 1, then −

            • return substring of s of length x starting at it

      • insert i - x + 1 at the end of hash[current] (also after a hash collision, so that a later true duplicate of this window is not missed)

    • return empty string

  • From the main method, do the following −

    • ret := empty string

    • n := size of S

    • power := Define an array of size n and fill this with 1

    • for initialize i := 1, when i < n, update (increase i by 1), do −

      • power[i] := mul(power[i - 1], 26)

    • low := 0, high := n - 1

    • while low <= high, do −

      • mid := low + (high - low) / 2

      • temp := ok(mid, S)

      • if size of temp is same as 0, then −

        • high := mid - 1

      • Otherwise

        • if size of temp > size of ret, then −

          • ret := temp

        • low := mid + 1

    • return ret

Let us see the following implementation to get a better understanding −

Example

```cpp
#include <bits/stdc++.h>
using namespace std;
typedef long long int lli;

class Solution {
public:
   int m = 1e9 + 7; // modulus for the rolling hash

   // Modular helpers; arguments are widened to lli so products cannot overflow.
   int add(lli a, lli b) {
      return ((a % m) + (b % m)) % m;
   }
   int sub(lli a, lli b) {
      return ((a % m) - (b % m) + m) % m;
   }
   int mul(lli a, lli b) {
      return ((a % m) * (b % m)) % m;
   }

   vector<int> power; // power[i] = 26^i mod m

   // Returns a substring of length x that occurs at least twice in s,
   // or an empty string if there is none.
   string ok(int x, const string& s) {
      if (x == 0)
         return "";
      unordered_map<int, vector<int> > hash; // hash value -> start indices
      lli current = 0;
      for (int i = 0; i < x; i++) {
         current = add(mul(current, 26), s[i] - 'a');
      }
      hash[current] = vector<int>(1, 0);
      int n = s.size();
      for (int i = x; i < n; i++) {
         // Slide the window: drop s[i - x], then append s[i].
         current = sub(current, mul(power[x - 1], s[i - x] - 'a'));
         current = add(mul(current, 26), s[i] - 'a');
         if (hash.count(current)) {
            for (auto& it : hash[current]) {
               // Compare the text itself to rule out hash collisions.
               if (s.substr(it, x) == s.substr(i - x + 1, x)) {
                  return s.substr(it, x);
               }
            }
         }
         // Record this window even after a collision, so that a later
         // true duplicate of it is not missed.
         hash[current].push_back(i - x + 1);
      }
      return "";
   }

   string longestDupSubstring(string S) {
      string ret = "";
      int n = S.size();
      power = vector<int>(n, 1);
      for (int i = 1; i < n; i++) {
         power[i] = mul(power[i - 1], 26);
      }
      // Binary search on the length: if a duplicate of length k exists,
      // a duplicate of every shorter length exists as well.
      int low = 0;
      int high = n - 1;
      while (low <= high) {
         int mid = low + (high - low) / 2;
         string temp = ok(mid, S);
         if (temp.size() == 0) {
            high = mid - 1;
         } else {
            if (temp.size() > ret.size())
               ret = temp;
            low = mid + 1;
         }
      }
      return ret;
   }
};

int main() {
   Solution ob;
   cout << ob.longestDupSubstring("ababbaba") << endl;
   return 0;
}
```

Input

"ababbaba"

Output

bab
Updated on: 2020-06-04T11:06:27+05:30
