Total number of palindromic subsequences in a string

Question

The question is:

For each string given as input, you need to report the number of its subsequences that are palindromes (they need not be distinct). Note that the empty string is not a palindrome. For example, the palindromic subsequences of "aab" are:

"a", "a", "b", "aa", so the method returns 4.

I already had a dynamic programming solution for finding the longest palindromic subsequence and tried to borrow ideas from it, but I could not get to a solution. Perhaps dynamic programming is not even needed. Suggestions, please.

And one more catch: if the "need not be distinct" condition is removed, so that only distinct palindromes count, can we still count them without actually generating all the palindromic subsequences?

algorithm data-structures dynamic-programming

ankitG Mar 04 '15 at 12:00

3 answers

Here's a terrible O(n^4) solution:

Each palindromic subsequence starts at some position i and ends at some position j >= i such that x[i] = x[j], and its "interior" (all characters except the first and last) is either empty or a palindromic subsequence of x[i+1 .. j-1].

Thus, we can define f(i, j) as the number of palindromic subsequences that start exactly at i and end exactly at j >= i. Then

 f(i, j) = 0 if x[i] != x[j]
 f(i, i) = 1 (i.e. when j = i)
 f(i, j) = 1 + the sum of f(i', j') over all i < i' <= j' < j, otherwise

[EDIT: Fixed for counting palindromic subsequences of length <= 2 too!]

Then the final answer is the sum of f(i, j) over all 1 <= i <= j <= n.

The DP for this is O(n^4), because there are n^2 entries in the table and computing each one takes O(n^2) time. (It is probably possible to speed this up, at least to O(n^3), using the fact that x[i] != x[j] implies f(i, j) = 0.)
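The recurrence for f can be sketched in Python as follows. This is only an illustration under some assumptions: the function name is made up, indices are 0-based rather than the 1-based ones used in the text, and the table is filled in order of increasing span length so that every interior entry is ready before it is summed.

```python
def count_palindromic_subsequences_quartic(x: str) -> int:
    """Count palindromic subsequences (by position, not by content) via f(i, j)."""
    n = len(x)
    # f[i][j] = number of palindromic subsequences whose first character is
    # x[i] and whose last character is x[j] (0-based, i <= j).
    f = [[0] * n for _ in range(n)]
    # Fill by increasing span length so interior entries are computed first.
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            if i == j:
                f[i][j] = 1
            elif x[i] == x[j]:
                # 1 for the two-character palindrome x[i]x[j] (empty interior),
                # plus every palindrome lying strictly between i and j.
                f[i][j] = 1 + sum(
                    f[a][b] for a in range(i + 1, j) for b in range(a, j)
                )
            # Otherwise x[i] != x[j] and f[i][j] stays 0.
    return sum(f[i][j] for i in range(n) for j in range(i, n))


print(count_palindromic_subsequences_quartic("aab"))  # 4
```

For "aab" the nonzero entries are the three single characters plus f for the pair of a's, which matches the count of 4 from the question.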

j_random_hacker Mar 04 '15 at 14:33

Intuitive O(n^3) solution using DP:

Let each state dp(i, j) represent the number of palindromic subsequences in the substring [i .. j], with dp(a, b) = 0 whenever a > b. Then a simple recurrence is

 dp(i, j) = dp(i, j-1) + 1
 for k in range i .. j-1:
     if A[j] == A[k]:
         dp(i, j) = dp(i, j) + dp(k+1, j-1) + 1

The idea is very simple. When a new character A[j] is appended, check which new subsequences end with it. The first +1 is there because the new character on its own is a one-character palindromic subsequence. For every earlier position k holding the same character, A[k] and A[j] can wrap any palindromic subsequence contained in the range (k+1, j-1), which adds dp(k+1, j-1). The second +1 covers the corner case of an empty interior: even if there are no subsequences in the range (k+1, j-1), you still get one new subsequence of length 2 (for example, "aa").
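A possible Python rendering of this O(n^3) idea is sketched below. The function name is hypothetical, indices are 0-based, and the two "+1" corner cases described in the answer (the new single character and the empty interior) are written out explicitly, which is an interpretation of the answer rather than code taken from it.

```python
def count_palindromic_subsequences_cubic(a: str) -> int:
    """Count palindromic subsequences of a with the O(n^3) prefix-extension DP."""
    n = len(a)
    if n == 0:
        return 0
    # dp[i][j] = number of palindromic subsequences of a[i..j] (inclusive).
    # Entries with i > j are never written and stay 0 (empty range).
    dp = [[0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j, -1, -1):
            # Everything already counted in a[i..j-1] still counts.
            total = dp[i][j - 1] if j > i else 0
            # The new character a[j] alone is a palindrome.
            total += 1
            for k in range(i, j):
                if a[k] == a[j]:
                    # a[k] ... a[j] wraps any palindrome inside (k+1, j-1),
                    # plus 1 for the empty interior (the pair itself).
                    total += dp[k + 1][j - 1] + 1
            dp[i][j] = total
    return dp[0][n - 1]


print(count_palindromic_subsequences_cubic("aab"))  # 4
```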

Vinay malik Nov 04 '16 at 18:48

j_random_hacker · Accepted Answer · 2015-03-04T15:20:18+0000

[EDIT 10/19/2015: an anonymous reviewer pointed out a problem with the formula that prompted me to notice another, even bigger mistake ... Now fixed.]

Now I see how to reduce the running time to O(n^2). I will leave my other answer up in case it is interesting as a stepping stone to this one. Note: this solution (too) covers only the first part of the problem; I do not see a way to efficiently count only the distinct palindromic subsequences (PSs).

Instead of counting the number of PSs that start and end at exactly positions i and j, let's count how many start at or after i and end at or before j. Call this g(i, j).

We might try writing g(i, j) = g(i, j-1) + g(i+1, j) + (x[i] == x[j]) * g(i+1, j-1) for the case j > i. But that doesn't work, because the first two terms double-count any PS that starts after i and ends before j.

The key insight is that we can easily calculate the number of PSs that start or end at some exact position by subtracting other g() values, and possibly adding back further g() values to compensate for double-counting. For example, the number of PSs that start at exactly i and end at exactly j is g(i, j) - g(i+1, j) - g(i, j-1) + g(i+1, j-1): the last term corrects for the fact that both the second and third terms count all g(i+1, j-1) PSs that start after i and end before j.
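The inclusion-exclusion identity above can be checked by brute force on short strings. The sketch below is only a sanity check, not part of the algorithm: the helper names are invented for illustration, indices are 0-based, and the enumeration is exponential in the string length.

```python
from collections import Counter
from itertools import combinations


def endpoint_counts(x: str) -> Counter:
    """Map (first, last) -> number of palindromic subsequences with those end positions."""
    counts = Counter()
    for r in range(1, len(x) + 1):
        for picked in combinations(range(len(x)), r):
            chars = [x[p] for p in picked]
            if chars == chars[::-1]:
                counts[(picked[0], picked[-1])] += 1
    return counts


def check_identity(x: str) -> bool:
    """Check exact(i, j) == g(i, j) - g(i+1, j) - g(i, j-1) + g(i+1, j-1) for all i <= j."""
    exact = endpoint_counts(x)

    def g(i: int, j: int) -> int:
        # PSs that start at or after i and end at or before j (0 if j < i).
        return sum(c for (a, b), c in exact.items() if a >= i and b <= j)

    return all(
        exact[(i, j)] == g(i, j) - g(i + 1, j) - g(i, j - 1) + g(i + 1, j - 1)
        for i in range(len(x))
        for j in range(i, len(x))
    )


print(check_identity("abacaba"))  # True
```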

Each PS that starts at or after i and ends at or before j falls into exactly 1 of 4 categories:

1. It starts after i and ends before j.
2. It starts at i and ends before j.
3. It starts after i and ends at j.
4. It starts at i and ends at j.

g(i+1, j) counts all PSs in categories 1 or 3, and g(i, j-1) counts all PSs in categories 1 or 2, so their sum g(i+1, j) + g(i, j-1) counts every PS in category 2 or 3 once and every PS in category 1 twice. Since g(i+1, j-1) counts exactly the PSs in category 1, subtracting it to get g(i+1, j) + g(i, j-1) - g(i+1, j-1) gives the total number of PSs in categories 1, 2 and 3. The remaining PSs are those in category 4. If x[i] != x[j], there are no PSs in this category; otherwise, there are exactly as many as there are PSs that start at or after i+1 and end at or before j-1, namely g(i+1, j-1), plus one extra for the 2-character sequence x[i]x[j]. [EDIT: Thanks to commenter Tuxdude for 2 fixes here!]

With this in mind, we can express g() in a way that turns the quadratic-time case of f() into a constant-time one:

 g(i, i) = 1 (i.e. when j = i)
 g(i, i+1) = 2 + (x[i] == x[i+1]) (i.e. 3 iff the adjacent chars are identical, otherwise 2)
 g(i, j) = 0 when j < i (this new boundary case is needed)
 g(i, j) = g(i+1, j) + g(i, j-1) - g(i+1, j-1) + (x[i] == x[j]) * (g(i+1, j-1) + 1) when j >= i+2

The final answer is now just g(1, n).
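As a rough sketch, the g() recurrence might look like this in Python. The function name is an assumption, and indices are 0-based, so the answer is read from g[0][n-1] instead of g(1, n). The j = i+1 case is not special-cased here, because the general formula together with the j < i boundary value of 0 yields the same 2 + (x[i] == x[i+1]).

```python
def count_palindromic_subsequences(x: str) -> int:
    """Count palindromic subsequences (by position) in O(n^2) time and space."""
    n = len(x)
    if n == 0:
        return 0
    # g[i][j] = number of palindromic subsequences that start at or after i
    # and end at or before j. Entries with j < i are never written and stay 0,
    # which is the boundary case the recurrence relies on.
    g = [[0] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        g[i][i] = 1
        for j in range(i + 1, n):
            inner = g[i + 1][j - 1]
            # Categories 1-3: inclusion-exclusion over the two shorter ranges.
            g[i][j] = g[i + 1][j] + g[i][j - 1] - inner
            if x[i] == x[j]:
                # Category 4: x[i] and x[j] wrap any inner palindrome,
                # plus 1 for the two-character sequence x[i]x[j].
                g[i][j] += inner + 1
    return g[0][n - 1]


print(count_palindromic_subsequences("aab"))  # 4
```

Only rows i and i+1 are read at any time, so the table could be reduced to two rows if memory matters; the full table is kept here to stay close to the recurrence as written.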
