Elimination of extraneous edges in a directed acyclic graph when searching for the longest paths

I asked about finding subsequences in a variable sum of sets without duplicate characters. The solution was to create a matrix of each pair of letters, discard those that are not found in each set, and then find the longest path in the directional acyclic graph. However, I don’t need only the longest path, I want some of the longest paths (for example, if it generates subsequences of lengths 10, 10, 9, 8, 8, 3, 3, 2 and 1, I may want to display only the first 5 subsequences).

So, since I am not looking only for the longest path to generate the resulting subsequences, and not using the long path algorithm described in the wikipedia article, I use a naive algorithm that simply generates a list of all possible subsequences. This produces a set that is similar to the results in the answer to my previous question.

The problem is that I want to reduce the number of subsequences that it generates.

For example, if I have the following sets:

A = AB12034 B = BA01234 C = AB01234 

... my algorithm will currently have the following pairs that occur in each set:

 A - 1 B - 1 1 - 2 2 - 3 0 - 3 3 - 4 2 2 3 4 4 3 3 4 4 4 0 0 

This is technically correct, but I would like to eliminate some of these pairs. For example, note that 2 always appears after 1 . Therefore, I would like to exclude pairs A2 and B2 (i.e. A and B should not jump directly to 2 ... they should always go through 1 ). In addition, 1 should never go to any number other than 2 , since 2 always happens immediately after it. Also, note that 0 always occurs between B and 3 , so I would like to remove the pair B3 (again, B must always go through 0 before moving to 3 , since all sets have the positions of these three letters as: B < 0 < 3 ).

Just to be clear, the current algorithm will find these subsequences: (for brevity, I have included only those that start with A ):

 A1234 A124 * A134 * A14 * A234 * A24 * A34 * A4 * A034 A04 * 

... and all those marked * must be eliminated.

(correct) pairs that generate the necessary subsequences will be:

 A - 1 B - 1 1 - 2 2 - 3 0 - 3 3 - 4 0 0 

... and a complete list of subsequences would be:

 A1234 A034 B1234 B034 1234 234 034 34 

In other words, I'm trying to move from this oriented acyclic graph:

enter image description here

For this:

enter image description here

What algorithm / logic should I use to get rid of these extraneous pairs (i.e. graphs)? Or do you think that my logic in creating pairs in the first place is what needs to be changed?

+4
source share
3 answers

Also, note that 0 always occurs between B and 3, so I would like to exclude the pair B3 (again, B should always go through 0 before moving to 3, since all sets have the positions of these three letters as: B <0 <3).

Hmm, okay, so if n0 < n1 < n2 stored on all sets, delete all the pairs (n0, n2) ? This can be achieved using this (in pseudo python):

 for(edge in node): if(len(LongestPath(node, edge.Node)) > 1): RemovePair(node, edge.Node) 
+2
source

Easy easy. If the graph is not too large, it is probably also quite efficient.

  • For each node (start with nodes without incoming edges), follow all the paths, marking the distances, mark all direct children with 1 and put them in the queue. While the queue is not empty, pull out node n , let d be the distance from the beginning. Look at all your direct children, if any one is marked with 1, remove the edge from the beginning to this, put all the children n in the queue marked by the distance d+1 . Pull the next node out of the queue.

What JSPerfUnkn0wn said, only in a little more detail.

+1
source

Since the graph is acyclic, perhaps the solution uses your favorite shortest path algorithm (Bellman-frod, Floyd-Warshal, etc.), but provided that the comparison condition is reversed (so that longer paths win shorter paths).

0
source

Source: https://habr.com/ru/post/1388777/


All Articles