Big O notation of a string-matching algorithm

What would be the big O notation of the foo function?

    /* Counts the characters of s1 that do not appear anywhere in s2. */
    int foo(char *s1, char *s2) {
        int c = 0, s, p, found;
        for (s = 0; s1[s] != '\0'; s++) {
            for (p = 0, found = 0; s2[p] != '\0'; p++) {
                if (s2[p] == s1[s]) {
                    found = 1;
                    break;
                }
            }
            if (!found)
                c++;
        }
        return c;
    }

What is the time complexity of the foo function?

a) O(n!)

b) O(n^2)

c) O(n log₂ n)

d) O(n)

I would say O(MN)...?

+6
4 answers

This is O(n²), where n = max(length(s1), length(s2)) (and n can be determined in less than quadratic time; see below). Let's look at the textbook definition:

f(n) ∈ O(g(n)) if there exist a positive real number c and a positive integer N such that f(n) ≤ c·g(n) for all n ≥ N.

In this definition, n represents a number; in our case, that number is the length of the input string. There is an obvious mismatch, though: the definition covers only a single-variable function f(n), while here we clearly pass in two strings of independent lengths. So we might look for a multi-variable definition of Big O. However, as Howell demonstrates in "On Asymptotic Notation with Multiple Variables":

"it is impossible to define a multi-valued notation for functions with several variables in such a way that it implies all these [generally accepted] properties.

There is in fact a formal definition of Big O in several variables, but it requires additional restrictions beyond the single-variable case, and those restrictions are not met by most (if not all) algorithms. For a typical analysis of algorithms, we can instead reduce our function to a single variable by bounding every variable with one limiting variable n. Here the variables (namely length(s1) and length(s2)) are clearly independent, but we can relate them:

Method 1

    Let x1 = length(s1)
    Let x2 = length(s2)

The worst case for this function occurs when there are no matches, so we perform x1 * x2 inner-loop iterations (the instrumented sketch at the end of this method demonstrates this).

Since multiplication is commutative, the worst case of foo(s1, s2) equals the worst case of foo(s2, s1). We can therefore assume, without loss of generality, that x1 >= x2 (if x1 < x2, we could get the same result by passing the arguments in reverse order).
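
A minimal sketch of that worst case, in C (foo_counted and its iteration counter are illustrative additions, not part of the original question):

    #include <stdio.h>
    #include <string.h>

    /* Instrumented copy of foo: counts how many times the inner loop runs. */
    int foo_counted(char *s1, char *s2, long *iterations) {
        int c = 0, s, p, found;
        long iters = 0;
        for (s = 0; s1[s] != '\0'; s++) {
            for (p = 0, found = 0; s2[p] != '\0'; p++) {
                iters++;
                if (s2[p] == s1[s]) { found = 1; break; }
            }
            if (!found) c++;
        }
        *iterations = iters;
        return c;
    }

    int main(void) {
        char *s1 = "aaaa";  /* x1 = 4 */
        char *s2 = "bbb";   /* x2 = 3; no characters in common */
        long iters;
        foo_counted(s1, s2, &iters);
        /* No matches, so the inner loop runs x1 * x2 = 12 times. */
        printf("iterations = %ld, x1 * x2 = %zu\n",
               iters, strlen(s1) * strlen(s2));
        return 0;
    }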

Method 2 (if you do not like the first method)

In the worst case (where s1 and s2 have no characters in common), we can determine length(s1) and length(s2) before entering the loops (in .NET and Java, getting a string's length is O(1), but here it is O(n)), assigning the larger value to x1 and the smaller to x2, as sketched below. Here it is clear that x1 >= x2.
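
A minimal sketch of that up-front measurement (the helper name order_lengths is hypothetical):

    #include <string.h>

    /* Measure both lengths before looping. strlen walks the whole
       string, so each call is O(n) in C; these two calls are where
       the extra 2n term below comes from. */
    static void order_lengths(char *s1, char *s2,
                              size_t *x1, size_t *x2) {
        size_t len1 = strlen(s1);
        size_t len2 = strlen(s2);
        *x1 = (len1 >= len2) ? len1 : len2;  /* larger length  */
        *x2 = (len1 >= len2) ? len2 : len1;  /* smaller length */
    }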

In this case, the extra computation to determine x1 and x2 makes the total O(n² + 2n). We can use the following simplification rule, which can be found here, to simplify this to O(n²):

If f(x) is a sum of several terms, the term with the largest growth rate is kept and all the others are omitted.
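
A worked check of that simplification against the textbook definition above, choosing c = 2 and N = 2:

    % f(n) = n^2 + 2n and g(n) = n^2; take c = 2, N = 2.
    \[
      n^2 + 2n \le n^2 + n^2 = 2n^2 \quad \text{for all } n \ge 2,
    \]
    \[
      \text{so } f(n) \le c \cdot g(n), \text{ and hence } n^2 + 2n \in O(n^2).
    \]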

Conclusion

For n = x1 (our limiting variable), where x1 >= x2, the worst case is x1 = x2. Therefore: f(x1) ∈ O(n²).

Additional Tips

For all Big-O homework questions on SO, if the answer is not one of:

    O(1)
    O(log log n)
    O(log n)
    O(n^c), 0 < c < 1
    O(n)
    O(n log n) = O(log n!)
    O(n^2)
    O(n^c)
    O(c^n)
    O(n!)

then the question is probably better suited to https://math.stackexchange.com/

+6

With big-O notation, we must always define what the variables involved mean. O(n) means nothing unless we define what n is. Often we can omit this information because it is clear from context. For example, if we say that some sorting algorithm is O(n log(n)), n always denotes the number of elements to sort, so we do not have to spell that out every time.

Another important point about big-O notation is that it gives only an upper bound: every algorithm in O(n) is also in O(n^2). The notation is often read as meaning "the algorithm has exactly this asymptotic complexity (up to a constant factor)," but the actual definition only says "the algorithm's complexity is bounded by this expression (up to a constant factor)."
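
A one-line justification of that containment, using the definition quoted earlier:

    % If f(n) <= c*n for all n >= N, then, since n <= n^2 for n >= 1:
    \[
      f(n) \le c\,n \le c\,n^2 \quad \text{for all } n \ge \max(N, 1),
    \]
    \[
      \text{so } f(n) \in O(n) \text{ implies } f(n) \in O(n^2).
    \]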

In the example you gave, you took m and n to be the respective lengths of the two strings. With this definition, the algorithm is indeed in O(mn). If we define n as the total length of the two strings, we can also write it as O(n^2); this is also an upper bound on the algorithm's complexity. And with the same definition of n, the algorithm is also in O(n!), but not in O(n) or O(n log(n)).

+5

O(n^2)

The part of the function that matters for complexity is the nested loops. The maximum number of iterations is length(s1) times length(s2), both of which are linear factors, so the worst-case running time is O(n^2), i.e. a linear factor squared. As Ethan said, O(mn) and O(n^2) come to the same thing here.

+2

Think of it this way:

There are two inputs. If the function simply returned, its running time would be unrelated to the arguments. That would be O(1).

If the function looped over just one string, its running time would be linear in that string's length. That is O(N).

But this function has a loop inside a loop. Its running time depends on both length(s1) and length(s2): multiply those lengths together and you get the number of loop iterations. That is no longer linear; it follows a curve. That is O(N^2). The sketch below contrasts all three shapes.
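
A minimal sketch of those three shapes in C (the function names are illustrative only):

    /* No loop at all: the work is unrelated to the arguments, O(1). */
    int constant_work(char *s1, char *s2) {
        (void)s1; (void)s2;
        return 0;
    }

    /* One loop over one string: linear in its length, O(N). */
    int linear_work(char *s1) {
        int n = 0;
        for (int i = 0; s1[i] != '\0'; i++)
            n++;
        return n;
    }

    /* A loop inside a loop: length(s1) * length(s2) iterations, O(N^2). */
    int quadratic_work(char *s1, char *s2) {
        int n = 0;
        for (int i = 0; s1[i] != '\0'; i++)
            for (int j = 0; s2[j] != '\0'; j++)
                n++;
        return n;
    }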

+2

Source: https://habr.com/ru/post/890681/

