Suffix tree and attempts. What is the difference?

Question

Suffix tree and attempts. What is the difference?

I read about Tries , commonly called prefix trees, and Suffix Trees .
Although I found the code for Trie , I cannot find an example for Suffix Tree . There is also a feeling that the code that Trie creates matches the code for Suffix Tree with the only difference being that in the first case we keep the prefixes, but in the last suffixes.
It's true? Can someone help me clarify this in my head? Sample code will be a big help!

+44

algorithm data-structures suffix-tree trie

Cratylus Dec 15 '12 at 16:23

source share

3 answers

If you fancy Trie in which you put some suffixes, you can easily request it for string substrings. This is the main idea of the suffix tree, basically it is the "suffix three".

But using this naive approach, building this tree for a string of size n would be O (n ^ 2) and take up a lot of memory.

Since all entries in this tree are suffixes of the same line, they share a lot of information, so there are optimized algorithms that allow you to create them more efficiently. For example, the Ukkonen algorithm allows you to create a suffix tree online (O (n)).

+4

Juan Lopes Dec 15 '12 at 17:18

source share

The difference is very simple. The suffix tree has fewer "dummy" nodes than the trie suffix. These dummy nodes are single characters that enhance the search operation in the tree.

0

curious Jan 11 '16 at 22:24

source share

Ze Blob · Accepted Answer · 2012-12-15 16:52

The suffix tree can be thought of as a data structure built on top of a trie vertex, where instead of simply adding a line to a trie, you also add all the possible suffixes of that line. For example, if you want to index a banana string in a suffix tree, you should build a trie with the following lines:

 banana anana nana ana na a

After that, you can search for any n-gram and see if it is present in your indexed row. In other words, an n-gram search is a prefix search for all possible suffixes of your string.

This is the easiest and slowest way to create a suffix tree. It turns out that there are many more favorable options in this data structure that improve one or both of the spaces and build time. I do not know enough about this domain to give an overview, but you can start by exploring arrays of suffixes or extended data structures in this class (lectures 16 and 18).

This answer also does a wonderful job explaining a variant of this data structure.

Suffix tree and attempts. What is the difference?

More articles: