Given an array of strings, we return all groups of strings that are anagrams

Given an array of strings, return all groups of strings that are anagrams.

My solutions:

For each string word in the array, sort it O (m lg m), m is the average length of the word.

Create a hash table <string, list>.

Put the sorted word in the hash table as a key, and also generate all permutations (O (m!)) Of the word, find each permutation in the dictionary (map of the prefix tree) with O (m), if it is in the dictionary, put (O ( 1)) in the hash table so that all permutation words are placed in the list with the same key.

In the general case, O (n * m * log m * m!) Time and O (n * m!) Space, n is the size of this array.

If m is very large, it is not effective, m !,

Any better solutions?

thanks

+6
source share
5 answers

We define an alphabet containing every letter that we can have in our list of words. Next, for each of the letters of the alphabet, we need a different prime number, I recommend using the smallest that you can find.

This will give us the following mapping: {a => 2, b => 3, c => 5, d => 7, etc.}

Now count the letters in the word that you want to represent as an integer, and build the integer of the result as follows:

pseudo code:

result = 1 for each letter: ....result *= power(prime[letter], count(letter,word) 

a few examples:

aaaa => 2 ^ 4

aabb => 2 ^ 2 * 3 ^ 2 = bbaa = baba = ...

etc.

That way, you will have an integer representing each word in your dictionary, and the word you want to test can be converted to an integer. Therefore, if n is the size of your vocabulary list, and k is the size of the longest word, then to create a new dictionary and O (k) you need O (nk) to check the new word.

Hackthissite.com has a programming problem, which is this: given the scrambled word, look in the dictionary to see if there are any anagrams in it. There is an effective solution to the problem with which I took the answer, and further optimization is also discussed in detail.

+10
source

use sorting sorting to sort a word so that sorting is done in O (m). after sorting, generate the key from the word and paste the node (key, value) into the hash table. The generating key can be reached in O (m).

You can take the value in (key, value) as some dynamic array that can contain more than one row. Each time you insert an existing key, simply click on the source word from which the key is generated in the array of values.

Thus, the total time complexity is O (mn), where n is the total number of words (input size).

Also this link has a solution to similar problems-> http://yourbitsandbytes.com/viewtopic.php?f=10&t=42

+2
source
 #include <map> #include <iostream> #include <set> #include <algorithm> int main () { std::string word; std::map<std::string, std::set<std::string>> anagrams; while(std::cin >> word) { std::string sortedWord(word); std::sort(sortedWord.begin(), sortedWord.end()); anagrams[sortedWord].insert(word); } for(auto& pair : anagrams) { for(auto& word : pair.second) { std::cout << word << " "; } std::cout << "\n"; } } 

I will let those who are better at analyzing a large conclusion than find out the difficulties.

+1
source

turn the dictionary into a match of the sorted characters of the word associated with each word of these characters, and save it. For each word you give, sort it and add the list of anagrams from the match to your output.

+1
source

I do not believe that you can do better in terms of O than

  • sorting the letters of each word
  • sorting a list of sorted words
  • each set of anagrams will now be grouped sequentially.
0
source

Source: https://habr.com/ru/post/903944/


All Articles