PROBLEM LINK:
DIFFICULTY:
Medium - Hard
PREREQUISITS:
DP , Finite Automata/KMP , Bitmasking
PROBLEM:
Given positive integers $N$, $K$ and $M$, count how many binary strings $S$ of length $N$ exist such that there exist more than $M$ indices $i$ such that $LCP(S[1, i], S[i + 1, N]) ≥ K$ and $1 ≤ i < N$.
QUICK EXPLANATION :-
The problem is equivalent to finding the numbers of strings such that number of occurrence of $S[1,k] in S[k+1,N]$ is $>M$. We will fix the first $k$ bits by using a k-bit mask and solve the problem for each mask differently. Each problem can be solved by using a simple dp having a count of the number of occurrences till now, and the current match of the string alongwith the length. Then either add $0$ or $1$ . The complexity is $ O( 2^K * K * N * M)$.
DETAILED SOLUTION:-
For any string first lets find number of $i$ such that $LCP(S[1, i], S[i + 1, N]) ≥ K$ and $1 ≤ i < N$. If $LCP(S[1, i], S[i + 1, N]) ≥ K$, then it means that $S[i+1,i+K] = S[1,K]$. So basically if for a fixed $K$, this is equal to the number of occurrence of $S[1,k]$ in $S[k+1,N]$. Now for given $N,M,K$ we need to find number of binary strings in which the count of such occurrences are $>M$.
Since $K<=10$. We can solve the problem for each possible $S[1,k]$ ($2^k$ different strings). Let us denote $S[1,k]$ with a bitmask ‘mask’ and solve the subproblem differently for each mask and sum up the result. Now we will use dp to find out the no of such strings of length $N-K$ , in which the mask occurs $>M$ times.
Let $dp[len][matchCount][prefixMatch]$ denote the count of strings of length $len$ , which have the $mask$ occuring in them exactly $matchCount$ times and currently suffix of maximum length $prefixMatch$ matches the prefix of same length of mask.
For the $prefixMatch$ part, we we can use Finite Automata. I guess it can also be done by modifying KMP. Finite automata is more suitable and easy to visualize for this problem. You can read more about pattern matching from finite automata online. For the current prefix of the string, it keeps track of the max length suffix that matches the some prefix of the pattern. We create a transition table $state[a][b]$ for each mask, in which $a$ stands for the current state and $b$ stands for the next character. Now $state[a][b]$ denotes the next state in which it transitions. Here the index of the state denotes the maximum length prefix of pattern which matches with some suffix of the string.
Now whenever a compete match occurs we can update $matchCount$. And sum $dp[N-K][x][y]$ for all $x>M$ , and all $y$. The transitions are as follows:-
dp[a+1][b+(states[str][c][0]==k?1:0)][states[str][c][0]] += (dp[a][b][c]);
dp[a+1][b+(states[str][c][1]==k?1:0)][states[str][c][1]] += (dp[a][b][c]);
where $states[str][x][y]$ denotes the value of $state[x][y]$ for $mask str$.