What is the runtime complexity (Big-O) of the following pseudocode?

Recently, I had a very, very intense discussion with a colleague about the complexity of a dead-simple algorithm. In the end we agreed to disagree, but as I've thought about it since, it has challenged my understanding of the fundamentals of computer science, so I need to get more information on this.

Given the following Python, what is the Big-O runtime complexity:

 for c in "How are you today?":
     print(c)

I immediately called it out as O(n), i.e. linear. It depends on the length of the string: this loop's runtime grows linearly as the string length grows.
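As a quick sanity check (an illustrative sketch, not code from the original discussion), counting the iterations makes the linear growth explicit:

```python
# Illustrative sketch: count the loop's iterations for strings of
# increasing length n.
def iterations(s):
    count = 0
    for c in s:
        count += 1  # one unit of work per character, standing in for print(c)
    return count

for n in (0, 10, 100, 1000):
    print(n, iterations("x" * n))  # the iteration count always equals n
```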

Then my colleague said: “No, it is constant, because we know that for the set of all strings we deal with (in our case), the maximum string is always 255 characters long (in our case), so it must be constant time.” He went on: “Since we have a maximum upper bound on the string length, this works out to O(255), which reduces to O(1).”

Anyway, we went back and forth, and after 45 minutes of both of us drawing sketches, we were both dug in on our positions.

My question is: in what world, or in what mathematical system, is the loop above a constant-time loop? If we knew our upper bound was 1,000,000 characters, and the set of all strings could be anywhere from 0 to 1,000,000 characters long, this loop will obviously exhibit linear runtime depending on the size of the string.

I also asked him whether he thinks the following code is O(1) if the upper bound on n is known. That is, we are certain this code will only ever operate on strings with a maximum upper bound of, say, 255 characters:

 s = "How are you today?"
 for c in s:
     for d in s:
         print(c + d)

He said that this is also constant time... even after I explained that it is an O(n^2) algorithm and demonstrated that it produces a quadratic curve.
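The demonstration can be sketched like this (a reconstruction on my part, since the snippet itself isn't shown in this post): counting inner-loop iterations for growing strings yields n^2, a quadratic curve.

```python
# Hypothetical reconstruction of the demonstration: count how many times
# the inner loop body runs for a string of length n.
def pair_count(s):
    count = 0
    for c in s:
        for d in s:
            count += 1  # stands in for print(c + d)
    return count

for n in (1, 2, 4, 8, 16):
    print(n, pair_count("x" * n))  # second column is n**2: 1, 4, 16, 64, 256
```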

So, am I missing some theoretical concept under which either of the above is true, depending on how the theory goes? To be clear, he does agree that I am right if n is unknown. If the upper bound of n is always known, he claims that both algorithms in this post have constant runtime complexity.

I just want to keep my sanity, but perhaps, if I'm wrong, there is certainly some additional learning to be had. My good, good colleague was very convincing. Also, if anyone has additional links or material on the subject, please add them in the comments.

+5
4 answers

Applying Big-O notation to a single scenario in which all inputs are known is absurd. There is no Big-O for a single case.

The whole point is to get a worst-case estimate for arbitrarily large, unknown values of n. If you already know the exact answer, why on earth would you spend time estimating it?

Mathy / Computer-Sciencey Edit:

Big-O notation is defined in terms of n growing arbitrarily large: f(n) is O(g(n)) if g(n) ≥ c * f(n), for any constant c, for all n greater than some nMin. Meaning, your “opponent” can set c to “eleven quad-zillion” and it doesn't matter, because for all points “to the right” of some nMin, the graph of “eleven quad-zillion times f(n)” will stay below g(n)... forever.

Example: 2^n is less than or equal to n^2... for a short stretch of the x-axis that includes n = 2, 3, and 4 (at n = 3, 2^n is 8 while n^2 is 9). This does not change the fact that their Big-O relationship is the opposite: O(2^n) is much larger than O(n^2), since Big-O says nothing about values of n below nMin. If you set nMin to 4 (thereby ignoring the graph to the left of 4), you will see that the n^2 line never exceeds the 2^n line.

If your “adversary” multiplies n^2 by some large constant c to lift “his” n^2 line above your 2^n line, you have not lost... you just slide nMin a little to the right. Big-O says that no matter how big he makes c, you can always find a point after which his equation loses and yours wins forever.
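That “slide nMin to the right” move is easy to check numerically. Here is an illustrative sketch (my own, not code from the answer) that finds, for a given c, the point past which 2^n beats c * n^2 for good:

```python
def n_min(c):
    """Smallest nMin such that 2**n > c * n**2 for every n >= nMin.

    Scan a generous range and remember the last n at which c * n**2
    still wins; beyond n = 200 the exponential dominates for any of the
    constants tried below, so the scan is safe.
    """
    last_loss = 0
    for n in range(1, 200):
        if 2 ** n <= c * n ** 2:
            last_loss = n
    return last_loss + 1

for c in (1, 100, 10 ** 6):
    print(c, n_min(c))  # nMin just moves right as c grows: 5, 15, 30
```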

But, if you cap n from above, you have violated the premise of any kind of Big-O analysis. In the argument with your co-worker, one of you invented an nMax, and the other then placed nMin somewhere to its right --- and surprise, the results are meaningless.

For example, the first algorithm you showed really does do n work for inputs of length n... in the general case. If I were building an algorithm of my own that called it n times, I would have to consider my algorithm quadratic, O(n^2)... again, in the general case.

But if I could prove that I would never call your algorithm with an input longer than 10 (meaning I have more information, and can therefore estimate my own algorithm more precisely), then using Big-O to evaluate your algorithm would throw away what I had learned about its actual behavior in the case I care about. I should instead replace your algorithm with a suitably large constant --- changing my algorithm from c * n * (your n) to c * n * 10..., which is just cBigger * n. I could honestly say my algorithm is linear, because in this case your algorithm's graph never rises above that constant value. This would change nothing about the Big-O performance of your algorithm, because Big-O is simply not defined for such restricted cases.
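To make that substitution concrete, here is a minimal sketch (my own construction, with hypothetical names): an outer routine calls a linear subroutine n times, but on inputs provably capped at 10 characters, so its total cost is bounded by 10 * n.

```python
def inner(s):
    # A linear, O(len(s)) subroutine: one unit of work per character.
    return len(s)

def outer(items):
    # Calls inner() once per item. If every item is provably at most
    # 10 characters long, each call costs at most 10 units, so the total
    # is at most 10 * len(items): linear in the number of items, even
    # though inner() is linear in its own input in the general case.
    cost = 0
    for s in items:
        assert len(s) <= 10  # the extra knowledge that justifies the bound
        cost += inner(s)
    return cost

print(outer(["abc", "0123456789", ""]))  # 3 + 10 + 0 = 13
```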

To finish: in general, the first algorithm you showed is linear by Big-O standards. In the restricted case, where the maximum input is known, it is a mistake to speak of it in general Big-O terms at all. In the restricted case, it could legitimately be replaced by some constant value when discussing the Big-O behavior of some other algorithm, but that says absolutely nothing about the Big-O behavior of the first algorithm itself.

In conclusion: O(Ackermann(n)) works fine when nMax is small enough. Very, very small...

+11

In your case ...

I am tempted to say that your friend is mildly mistaken. And that is because of the significantly large constant of 256 hiding in that O(1) runtime. Your friend said the runtime is O(256), and since we drop constants in Big-O, O(256 * 1) simply becomes O(1). It is up to you to decide whether that constant is negligible for you or not.


I have two good reasons to say that you are right:

First, for different values of n, your answer of O(n) (for the first snippet) gives a better approximation of the runtime. For instance:

  • For a string of length 4: you say that the runtime is proportional to 4, and your friend says that it is proportional to 1 (or 256).
  • For a string of length 255: you say that the runtime is proportional to 255, and your friend says again that this is a constant time.

It is clear that your answer is more accurate in each case, even though his answer is not entirely wrong.

Secondly, if you follow your friend's method, then in a sense you could cheat and say that since no string can exceed your RAM + disk size, all processing is O(1). And that is where your friend's mistake becomes apparent. Yes, he is right that the runtime (assuming a 1 TB hard disk and 8 GB of RAM) is O((1TB + 8GB) * 1) = O(1), but you simply cannot ignore the size of the constant in that scenario.


Big-O complexity does not tell you the actual runtime; it is just a simplified growth rate of the runtime as n increases.

+2

I think both of you are right.

The execution time of the first algorithm is linear in the size of its input. However, if its input is fixed, then its execution time is also fixed.

Big O is all about measuring the behavior of an algorithm as its input changes. If the input never changes, then Big O makes no sense.

Also: O(n) means that n is an upper bound on the complexity. If you want to express a tight bound, the more precise notation is Θ(n) (theta notation).

+1

You are both right, but you are more right than your colleague. ( EDIT: No. On second thought, you are right and your colleague is wrong. See my comment below.) The question is not whether N is known but whether N can change . Is s an input to your algorithm? Then it is O(N) or O(N^2): you know the value of N for this particular input, but a different input will have a different value, so knowing N for this input does not matter.

Here is the difference in your two approaches. You process this code as if it looked like this:

 def f(s):
     for c in s:
         print(c)

 f("How are you today?")

But your colleague views this as follows:

 def f(some_other_input):
     for c in "How are you today?":
         print(c)

 f("A different string")

In the latter case, that for loop should be considered O(1), because it does not change with different inputs. In the former case, the algorithm is O(N).

+1
source

Source: https://habr.com/ru/post/1232335/

