Combining the ordered list

Please allow me to ask this question with an example: Suppose we have the following 3 lists (outlined double quotes for clarity):

L1: (a, c, b, d, f, j) L2: (b, e, j, k) L3: (a, d, e, g, h, j, i) 

The output list may look like any of the following (there are more solutions)

 Lanswer1: (a, c, b, d, e, f, g, h, j, i, k) Lanswer2: (a, c, b, d, f, e, g, h, j, i, k) Lanswer3: (a, c, b, d, e, f, g, h, j, k, i) 

As a result, the resulting ordered set

  • Contains a union of elements from all lists
  • The order of items in all source lists is maintained.

4th list, L4: (b, c, d), when added to an input, should throw an exception (since c comes before b in L1)

I came up with answers to check. Can anyone suggest an algorithm for this? Thank you, M.S.

+6
source share
2 answers

This can be done using topological sorting .

First, build a directed graph from the lists. List items are nodes. The edges go from the first element to the second, from the second to the third, etc.

Depending on how you implement the algorithm, you can get all possible solutions, or just one. In addition, if the graph contains a cycle, then the algorithm will stop with an error.

Here's what the graph from your listings looks like:

graph

A source:

 digraph { { edge [color = "red"] a -> c c -> b b -> d d -> f f -> j } { edge [color = "blue"] b -> e e -> j j -> k } { edge [color = "green"] a -> d d -> e e -> g g -> h h -> j j -> i } } 
+5
source

Here is an attempt. The usual reservations apply.

Basic logic

This is essentially a two-step process.

1. Match 3 lists into one doubly linked list with each node, pointing to the next element, as well as to any brothers and sisters (for example, e / f or i / k), using the second pointer. After the conversion, it should look something like this:

 a->c->b->d->f->g->h->j->k | | ei 

2. Write down a function that takes two elements and the list above and returns a flag indicating whether the first element occurs before, after, or at the same position as the second element. One possible method signature might be

 int getRelativePositionOf(char e1, char e2) //returns -1 if e1 exists before e2, 0 if e1 and e2 are siblings and 1 if e1 exists after e2. 

This will help you check the fourth list and throw an exception if the relative position of any two elements is different from the original data.

Create a linked list

As you can imagine, the first step, i.e. creating a doubly linked list is the hardest part. I did not come up with the most elegant way to create it, but here is one.

  • Put the longest list, L1 on the double list. So, in your example, this is the third list, and now you have

     a->d->e->g->h->j->i 
  • Now iterate over the remaining items and for each item check if it exists in the linked list. If so, ignore it.

  • If it does not exist, then the element must be inserted into the linked list. So, now we need to find out the position at which this element should be added. Let me call this element "X".

  • Go through the length of the linked list and say ā€œYā€ for each item, check in L2 and L3 if there is a position relationship between X and Y. So, for example, if we try to insert 'c' in the above linked list, I would start with the first element of the linked list, that is, ā€œa,ā€ and look in the lists if there is a relationship between ā€œaā€ and ā€œcā€.

  • A relationship is determined if we can determine the relative position of the two elements in the source lists. From the first list, I know that "c" comes after "a". Therefore, "c" should be inserted somewhere after "a". Therefore, we move forward in the linked list and now compare 'd' with 'c'. Again from the first source list, we know that "c" precedes "d". Bingo. Insert 'c' between 'a' and 'd'. So the linked list now becomes

    a-> C-> d → -> e-> G-> H-> J-> i

  • If a relationship cannot be established between two elements, they are brothers and sisters. Therefore, if you try to insert ā€œfā€ in the above list, you should know that it should appear after ā€œdā€ and before ā€œgā€, but there is no information about the relative position of wrt 'e'. So keep 'f' as the brother of 'e'.

  • Do this until all elements are inserted.

Obviously, step 5 is the ugliest and most complex part of the algorithm. But I think that it can be optimized to make it easier and faster. For example, you can create char → position hash maps from your raw lists for faster searches when determining relative positions. In addition, you can also maintain a set of all items that already exist in the linked list, to avoid crawling the linked list if the item already exists. I am sure that there are several ways that can be optimized.

0
source

Source: https://habr.com/ru/post/886990/


All Articles