How to find words in a table?

This is a coding exercise. Suppose there is a table of letters and a few words. I have to find the position of the words in the table. A word can begin anywhere in the table and can be oriented either vertically horizontally. (We can assume that a row / column can contain only one word).

For instance:

  table = xabcx
         xxxdx
         xxfex

 words = ["abc", "edc", "fe"]

 expected output is (0,1), (2,3), (2,2)

A simple solution is to iterate over all the rows / columns and check if each row / column contains any words. Requires O(number of columns * number of rows * number of words * word length) . Is there a better solution? Maybe I should pre-process the word list in order to build a more efficient data structure?

+4
source share
4 answers

You can use the Trie data structure to store the table. Finding words is very simple if you have Trie.

+3
source

I would suggest a binary tree structure for a table. Basically, this is what most major relational database systems use. In this case, you can balance the tree based on the whole hash code created from this word. Then, when searching, create a hash from the search query and wisely go through your tree until you find a line that matches.

+1
source

Here is a simple approach.

You are only looking for exact matches, so I would think that you should immediately consider hash-based algorithms, not tree-based ones. First, consider a hash map that associates each letter of the alphabet with positions in the table. Now for each word you look at the first letter, and then cross the table (left, right, up, down) to see if the whole word exists.

You can improve this by creating a hash map instead for each combination of two letters (676 keys in total) for each direction (left, right, up, down). Now you start by checking the first two letters of your word, and the hash map immediately gives you places where these two letters exist. Now you can continue to look in the table in this direction to find out if this word is completed. Alternatively, you can take the next two letters of the word and see if there is a place for this pair of letters that is adjacent to the first letter and has the same direction.

You can improve it further ... by looking at the hash map for every three letter combinations for each direction. You should be able to find a good balance between storage requirements and heuristic-based performance, such as average word length.

+1
source

Using the string matching algorithm, Aho-Corasick in each column / row only accepts O ( number of columns * number of rows + length of patterns + number of output matches ).

+1
source

Source: https://habr.com/ru/post/1391144/


All Articles