I asked about finding subsequences in a variable sum of sets without duplicate characters. The solution was to create a matrix of each pair of letters, discard those that are not found in each set, and then find the longest path in the directional acyclic graph. However, I donβt need only the longest path, I want some of the longest paths (for example, if it generates subsequences of lengths 10, 10, 9, 8, 8, 3, 3, 2 and 1, I may want to display only the first 5 subsequences).
So, since I am not looking only for the longest path to generate the resulting subsequences, and not using the long path algorithm described in the wikipedia article, I use a naive algorithm that simply generates a list of all possible subsequences. This produces a set that is similar to the results in the answer to my previous question.
The problem is that I want to reduce the number of subsequences that it generates.
For example, if I have the following sets:
A = AB12034 B = BA01234 C = AB01234
... my algorithm will currently have the following pairs that occur in each set:
A - 1 B - 1 1 - 2 2 - 3 0 - 3 3 - 4 2 2 3 4 4 3 3 4 4 4 0 0
This is technically correct, but I would like to eliminate some of these pairs. For example, note that 2 always appears after 1 . Therefore, I would like to exclude pairs A2 and B2 (i.e. A and B should not jump directly to 2 ... they should always go through 1 ). In addition, 1 should never go to any number other than 2 , since 2 always happens immediately after it. Also, note that 0 always occurs between B and 3 , so I would like to remove the pair B3 (again, B must always go through 0 before moving to 3 , since all sets have the positions of these three letters as: B < 0 < 3 ).
Just to be clear, the current algorithm will find these subsequences: (for brevity, I have included only those that start with A ):
A1234 A124 * A134 * A14 * A234 * A24 * A34 * A4 * A034 A04 *
... and all those marked * must be eliminated.
(correct) pairs that generate the necessary subsequences will be:
A - 1 B - 1 1 - 2 2 - 3 0 - 3 3 - 4 0 0
... and a complete list of subsequences would be:
A1234 A034 B1234 B034 1234 234 034 34
In other words, I'm trying to move from this oriented acyclic graph:

For this:

What algorithm / logic should I use to get rid of these extraneous pairs (i.e. graphs)? Or do you think that my logic in creating pairs in the first place is what needs to be changed?