Efficient algorithm for checking whether a set is a subset of each of many sets

I have read several posts about checking whether a set A is a subset of another set B, but I find it hard to decide which algorithm to use. Here is an outline of the problem:

  • I have an array of strings A that I receive at the start of my program. Little is known about its structure: each string in the array can be arbitrarily long, and the number of entries is unbounded. In practice, though, you can usually assume that the number of entries will not be excessively large (<100).
  • Then I iterate over a list of n objects.
  • Each of the n objects also has an array of strings B, i.e. there are n arrays B. Once the program has started, the Bs are fixed; they do not change at runtime.
  • For each object, I want to determine whether A is a subset of its B.

I thought of hash tables. However, in my opinion they would only be effective if there were a single B and many As: then I could build a hash table for B and check every string of every A against that table. But that is not the case here, because there is only one A but n Bs. What would be an efficient algorithm for this?

Example:

A:  ["A", "G", "T"]
B1: ["C", "G"]
B2: ["K", "A", "U", "T", "G"]
.
.
.
Bn: ["T", "I", "G", "O", "L"]

Here A is a subset of B2, but not of B1 or Bn.
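
For reference, the hash-table idea mentioned above can be sketched on the example data as follows. This is only a baseline under the stated assumption that the Bs are fixed, so one hash set per B can be built once up front; the function name `subset_flags` is made up for illustration.

```python
def subset_flags(a, bs):
    """For each B in bs, report whether every string of a occurs in that B."""
    # The Bs do not change at runtime, so each hash set is built only once.
    b_sets = [set(b) for b in bs]
    # A is a subset of B exactly when every string of A is a member of B.
    return [all(s in b_set for s in a) for b_set in b_sets]


A = ["A", "G", "T"]
Bs = [
    ["C", "G"],                 # B1
    ["K", "A", "U", "T", "G"],  # B2
    ["T", "I", "G", "O", "L"],  # Bn
]

flags = subset_flags(A, Bs)
print(flags)
```

Each check costs on the order of the number of strings in A (plus hashing each string), independent of the size of B.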

+4
3

A trie. , A , .

, Bi Bi, A. , ( , ).

B. ,

  • , A, ,

  • , Bi ,

  • , , , Bi.

, , , , .

+2

A , - A.

, . B , A. , , ; , , .

. A, . B A

+1

, () B. , :

  • - A, , B, , B;
  • - A, , B, A B;
  • .

To make the check easier, you may want to sort each set alphabetically. A can then be checked against one B in a single (linear) scan that walks both sorted sets together.
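
A minimal sketch of that linear scan, assuming both arrays are already sorted and contain no duplicates (the name `is_subset_sorted` is hypothetical):

```python
def is_subset_sorted(a_sorted, b_sorted):
    """Check a_sorted is a subset of b_sorted; both sorted, duplicate-free."""
    i = 0  # index of the next string of A that still has to be found in B
    for s in b_sorted:
        if i == len(a_sorted):
            break  # every string of A has been found
        if s == a_sorted[i]:
            i += 1  # found it, move on to the next string of A
        elif s > a_sorted[i]:
            # B has already moved past a_sorted[i], so it cannot contain it.
            return False
    return i == len(a_sorted)


# Sorting (and de-duplicating) is done once per array.
a = sorted(set(["A", "G", "T"]))
b1 = sorted(set(["C", "G"]))
b2 = sorted(set(["K", "A", "U", "T", "G"]))

print(is_subset_sorted(a, b1), is_subset_sorted(a, b2))
```

The scan touches each element of A and B at most once, so one check takes time proportional to the combined length of the two arrays.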

When A is small and the B sets are large, it may be more efficient to look up each string of A in B with a binary search rather than a linear scan; this also requires pre-sorting B.
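
The binary-search variant could look roughly like this, using `bisect_left` from Python's standard library; the helper names are made up for illustration, and B is assumed to be sorted once in advance:

```python
from bisect import bisect_left


def contains_sorted(b_sorted, s):
    """Binary search for the string s in the sorted list b_sorted."""
    i = bisect_left(b_sorted, s)  # leftmost position where s could be inserted
    return i < len(b_sorted) and b_sorted[i] == s


def is_subset_bsearch(a, b_sorted):
    """Check that every string of a occurs in the pre-sorted list b_sorted."""
    return all(contains_sorted(b_sorted, s) for s in a)


a = ["A", "G", "T"]                          # A itself need not be sorted
b2 = sorted(["K", "A", "U", "T", "G"])       # sorted once, reused afterwards
bn = sorted(["T", "I", "G", "O", "L"])

print(is_subset_bsearch(a, b2), is_subset_bsearch(a, bn))
```

Each lookup needs roughly log2(|B|) string comparisons, so a check costs about |A| · log |B| comparisons instead of a pass over all of B.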

+1

Source: https://habr.com/ru/post/1624773/

