How to handle Big O when one variable is known to be smaller than another?

We have 4 algorithms, all of whose complexities depend on m and n, for example:

Alg1: O(m + n)
Alg2: O(m log m + n log n)
Alg3: O(m log n + n log m)
Alg4: O(m + n!) (ouch, but whatever)

Now, how do we deal with this if we know that n > m? My first thought: Big O notation "discards" constants and lower-order terms, because sooner or later the larger term will dominate all the others, making them inconsequential to the cost of the computation.

So, can we rewrite Alg1 as O(n), or Alg2 as O(n log n)? If so, what about the others?

2 answers

Yes, you can rewrite it if you know that it is always the case that n > m. Let's look at this formally:

If we know that n > m (always), then it follows that

O(m + n) < O(n + n), which is equal to O(2n) = O(n) (we don't care about the constant 2)

the same can be said about other algorithms

O(m log m + n log n) < O(n log n + n log n) = O(2n log n) = O(n log n)

I think you can see where the rest are going. But if you do not know that n > m, you cannot make the simplifications above.
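To make the sandwich argument concrete, here is a quick numeric check (a Python sketch, not part of the original answer): for any n > m >= 1, the exact cost m log m + n log n is squeezed between n log n and 2 n log n, which is exactly why it is Theta(n log n).

```python
import math

def alg2_cost(m, n):
    """Exact value of the Alg2 bound m*log(m) + n*log(n)."""
    return m * math.log2(m) + n * math.log2(n)

# For n > m >= 1 the cost is sandwiched between n log n and 2 n log n,
# so it is Theta(n log n) -- Big O discards the constant 2.
for m, n in [(1, 2), (2, 5), (10, 1000), (500, 501)]:
    cost = alg2_cost(m, n)
    assert n * math.log2(n) <= cost <= 2 * n * math.log2(n)
```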

EDIT: as @rici rightly noted, you also need to be careful, because whether this works always depends on the given function. (Note that O((2n)!) cannot be simplified to O(n!).)

Playing around with it a little, you can see why not:

(2n)! = (2n) * (2n-1) * (2n-2) * ... * 2 * 1

Collecting just the n even factors, (2n) * (2n-2) * ... * 2 = 2^n * n!, so

(2n)! = 2^n * n! * (2n-1) * (2n-3) * ... * 1 >= 2^n * n!

So O((2n)!) grows at least as fast as O(2^n * n!) and cannot be reduced to O(n!). For a more precise analysis, see this question: "Are the complexities O((2n+1)!) and O(n!) equal?"
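You can verify this numerically (a Python sketch, not part of the original answer): (2n)! is at least 2^n * n! for every n, and the gap (2n)!/n! grows without bound, so the two bounds can never be equal.

```python
import math

def factorial_gap(n):
    """Return (2n)! // n!, the factor separating the two bounds."""
    return math.factorial(2 * n) // math.factorial(n)

# (2n)! = 2^n * n! * (2n-1)!! >= 2^n * n!
for n in range(1, 12):
    assert math.factorial(2 * n) >= 2 ** n * math.factorial(n)

# The gap (2n)!/n! explodes: 2, 12, 120, 1680, ...
gaps = [factorial_gap(n) for n in range(1, 5)]
```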


Consider the problem of finding the k largest elements of an array. There is a nice O(n log k)-time algorithm for solving this problem using only O(k) space, which works by maintaining a binary min-heap of at most k elements. In this case, since k is guaranteed to be at most n, we could rewrite these bounds as O(n log n) time with O(n) memory, but that would be less precise. The direct dependence of the runtime and memory usage on k makes it clear that the algorithm takes more time and uses more memory as k grows.
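The algorithm described above can be sketched as follows (a hypothetical Python implementation using the standard heapq module, not from the original answer): the heap never holds more than k elements, so each of the n insertions or replacements costs O(log k).

```python
import heapq

def k_largest(arr, k):
    """Return the k largest elements in descending order.

    Time: O(n log k), space: O(k) -- the heap never exceeds k items.
    """
    heap = []  # min-heap of the k largest elements seen so far
    for x in arr:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest of the current top k
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)
```

For example, k_largest([5, 1, 9, 3, 7], 3) returns [9, 7, 5].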

Similarly, consider many standard graph algorithms, such as Dijkstra's algorithm. If you implement Dijkstra's algorithm using a Fibonacci heap, the runtime is O(m + n log n), where n is the number of nodes and m is the number of edges. If you assume that your input graph is connected, then m >= n - 1, so the runtime is also O(m + m log m), but this is a less precise bound than the one we had.

On the other hand, if you implement Dijkstra's algorithm with a binary heap, then the runtime is O(m log n + n log n). In this case (again, assuming the graph is connected), the m log n term dominates the n log n term, so rewriting this as O(m log n) does not lose any precision.
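As a concrete sketch of the binary-heap version (my own Python illustration with lazy deletion of stale entries; the adjacency-list format is an assumption, not from the answer): each edge relaxation may push one heap entry, so the heap holds O(m) entries and the total work is O((m + n) log n), which is O(m log n) on a connected graph.

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path distances from source.

    adj maps each node to a list of (neighbor, weight) pairs.
    Binary-heap version: O(m log n) on a connected graph with
    n nodes and m edges, since each edge pushes at most one entry.
    """
    dist = {source: 0}
    pq = [(0, source)]  # (distance, node)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale entry: u was already settled via a shorter path
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist
```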

Generally speaking, you will want to give the most precise bounds that you can when documenting the runtime and memory usage of a piece of code. If you have several terms where one clearly dominates the others, you can safely drop the lower-order terms. But otherwise, I would not recommend dropping one of the variables, since that loses precision in the bound you are quoting.


Source: https://habr.com/ru/post/1272829/

