How is the worst time complexity of constructing a suffix tree linear?

It’s hard for me to understand how the worst temporal complexity of building a suffix tree is linear — especially when we need to build a suffix tree for a string, which can consist of a repeating single character , such as “aaaaa”.

Even if I were to create a compressed suffix tree for "aaaaa", I will not be able to compress any nodes, because no two edges starting with node can have string labels starting with the same character.

This will lead to a 5 suffix tree, and with each suffix insertion, I will need to continue to move from root to leaf.

Here's how I came up: suffixes: a, aa, aaa, aaaa, aaaaa

Create the root of the node, create an edge carrying "a", and connect it to the new node, where "$" is on the left, and repeat this process until we can aaaaa.

This will result in O (n ^ 2) instead of O (n). What am I missing here?

+4
source share
1 answer

I agree with the comments, but here are a few more details:

The procedure that you describe to form a tree aaaaais O (n), not O (n ^ 2). Node and sheet creation are constant-time operations, and you execute them n == 5 times. Your assumption O (n ^ 2) seems to be based on the idea that you go from root to leaf at every step, but there is no need to do this; for example, in the Ukkonen algorithm:

  • Node, ,
  • - , $, ,

, ( ) O (n), , "" , , (a) , , (b) . , node , O (n) .

+2

Source: https://habr.com/ru/post/1599414/


All Articles