Worse is better. Is there an example?

Is there a widely used algorithm that has worse time complexity than another known algorithm but is the better choice in all practical situations (worse in complexity but better otherwise)?

A valid answer may be in the form of:

There are algorithms A and B that have O(N**2) and O(N) time complexity, respectively, but B has such a large constant that it has no advantage over A for inputs with fewer elements than there are atoms in the Universe.

Examples from the answers:

  • The simplex algorithm: exponential time in the worst case, compared with known polynomial-time algorithms for convex optimization problems.

  • A naive median-finding algorithm (quickselect with random pivots): worst case O(N**2), versus the known O(N) median-of-medians algorithm.

  • Backtracking regex engines: exponential worst case, versus O(N) Thompson NFA-based engines.

All of these examples contrast worst-case and average-case scenarios.

Are there any examples that do not depend on the difference between the worst case scenario and the average case?




Also on this topic:

  • Raising "Worse is Better" (For the purposes of this question, the phrase "Worse is better" is used in a narrower (namely, algorithmic temporal) sense than in the article).

  • Python's design philosophy:

    The ABC group strived for perfection. For example, they used tree-based data structure algorithms that were proven to be optimal for asymptotically large collections (but were not so great for small collections).

    This example would be the answer if no computers capable of storing these large collections existed (in other words, "large" is not large enough in this case).

  • The Coppersmith-Winograd algorithm for square matrix multiplication is a good example (it is the asymptotically fastest known (as of 2008), yet it loses to asymptotically worse algorithms in practice). Any others? From the Wikipedia article: "It is not used in practice because it only provides an advantage for matrices so large that they cannot be processed by modern hardware (Robinson 2005)."

+47
algorithm time-complexity
Jan 23 '09 at 1:35
24 answers

The Coppersmith-Winograd algorithm for square matrix multiplication. Its time complexity is O(n^2.376), compared with O(n^3) for the naive multiplication algorithm or O(n^2.807) for the Strassen algorithm.

From the Wikipedia article:

However, unlike the Strassen algorithm, it is not used in practice because it only provides an advantage for matrices so large that they cannot be processed by modern hardware (Robinson 2005).
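For contrast, here is a minimal sketch of the naive O(n^3) multiplication that the asymptotically faster algorithms have to beat (plain Python, illustrative only; in practice you would call an optimized BLAS routine rather than either extreme):

    def naive_matmul(A, B):
        """Naive O(n^3) square-matrix multiplication."""
        n = len(A)
        C = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for k in range(n):              # loop order chosen for memory locality
                a_ik = A[i][k]
                for j in range(n):
                    C[i][j] += a_ik * B[k][j]
        return C

    # Coppersmith-Winograd's n^2.376 only pays off once n is so large that the
    # matrices no longer fit on real hardware; the hidden constant is enormous.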

+12
02 Feb '09 at 12:03

Quicksort has worst-case O(N^2) time complexity, but it is usually considered better than other sorting algorithms which have O(N log N) worst-case time complexity.

+35
Jan 23 '09 at 1:45

Simplex is an algorithm that has exponential time complexity in the worst case, but for practically any real case it is polynomial. Polynomial-time algorithms for linear programming do exist (e.g. interior-point methods), but they are very complicated and usually have large constant factors.
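A small illustration of how this looks in practice (assuming SciPy is installed; scipy.optimize.linprog is a real function, though which solver it uses by default is version-dependent): you simply hand the problem to an LP solver, and simplex-type methods remain a standard workhorse despite the exponential worst case.

    from scipy.optimize import linprog

    # Maximize x + 2y  subject to  x + y <= 4,  x + 3y <= 6,  x >= 0, y >= 0.
    # linprog minimizes, so the objective is negated.
    c = [-1.0, -2.0]
    A_ub = [[1.0, 1.0],
            [1.0, 3.0]]
    b_ub = [4.0, 6.0]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
    print(res.x, -res.fun)    # optimal point (3, 1) and maximized objective 5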

+29
Jan 23 '09 at 1:49

"Worse is better" can also be seen in languages, for example the ideas behind Perl, Python, Ruby, PHP, even C# or Java, or any language that is not assembler or C (C++ may or may not fit here).

In principle there is always a "perfect" solution, but many times it is better to use a "worse" tool/algorithm/language to get results faster and with less pain. That is why people use these higher-level languages, even though they are "worse" from the ideal computer-language point of view; they are oriented toward humans instead.

+15
Jan 23 '09 at 2:09

Monte Carlo integration is a probabilistic method of calculating definite integrals that has no guarantee of returning the correct answer. Yet, in real-world situations it returns an accurate answer far faster than provably correct methods.
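A minimal sketch of the idea in plain Python (the integrand and sample count are arbitrary choices for illustration):

    import math
    import random

    def mc_integrate(f, a, b, samples=100_000):
        """Estimate the definite integral of f over [a, b] by averaging f at
        uniformly random points -- no guarantee, just high probability."""
        total = sum(f(random.uniform(a, b)) for _ in range(samples))
        return (b - a) * total / samples

    # The integral of sin(x) over [0, pi] is exactly 2; the estimate is close,
    # with statistical error shrinking like 1/sqrt(samples).
    print(mc_integrate(math.sin, 0.0, math.pi))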

+14
Jan 23 '09 at 18:14

This statement can be applied to nearly any parallel algorithm. The reason they were not heavily researched in the early days of computing is that, for a single thread of execution (think uniprocessor), they are indeed slower than their well-known sequential counterparts in terms of asymptotic complexity, constant factors for small n, or both. However, in the context of current and future computing platforms, an algorithm that can make use of a few (think multicore), a few hundred (think GPU), or a few thousand (think supercomputer) processing elements will beat the pants off the sequential version in wall-clock time, even if the total time/energy spent across all processors is far greater for the parallel version.

Sorts, graph algorithms, and linear algebra techniques alike can be accelerated in terms of wall-clock time by paying the cost of a little extra bookkeeping, communication, and runtime overhead in order to parallelize.
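A toy sketch of the trade-off using only the standard library (the chunked version does strictly more total work -- splitting, inter-process communication, merging -- yet finishes sooner in wall-clock time once several cores are available and the chunks are large enough):

    from concurrent.futures import ProcessPoolExecutor

    def partial_sum(bounds):
        lo, hi = bounds
        return sum(range(lo, hi))          # each worker handles one chunk

    def parallel_sum(n, workers=4):
        step = n // workers
        chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
                  for i in range(workers)]
        with ProcessPoolExecutor(max_workers=workers) as pool:
            return sum(pool.map(partial_sum, chunks))   # extra bookkeeping + merge

    if __name__ == "__main__":
        print(parallel_sum(10_000_000))    # same result as sum(range(10_000_000))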

+10
Jan 23 '09 at 2:08

Often an algorithm (such as quicksort) that can be easily parallelized or randomized will be chosen over competing algorithms that lack those qualities. Furthermore, it is often the case that an approximate solution to a problem is acceptable when an exact algorithm would yield exponential runtime, as in the Travelling Salesman Problem.

+9
Jan 23 '09 at 1:50

This example would be the answer if no computers capable of storing these large collections existed.

Estimated collection size was 641K.




When working in the technical computing group at BAE SYSTEMS, which looked after the structural and aerodynamic code for various aircraft, we had a code base going back at least 25 years (and a third of the staff had been there that long).

Many of the algorithms had been optimized for performance on a 16-bit mainframe rather than for scalability. These optimizations were entirely appropriate for 1970s hardware, but performed poorly on the larger datasets on the 32-bit and 64-bit systems that replaced it. If you choose something with worse scalability that works better on the hardware you are currently working on, be aware that this is an optimization and may not apply in the future. At the time those 1970s routines were written, the data sizes we fed into them in the 2000s were simply not practical. Unfortunately, trying to extract a clean algorithm from that code which could then be reimplemented to suit modern hardware was not trivial.

Short of something like boiling the oceans, what counts as "all practical situations" is often a time-dependent variable.

+8
Jan 23 '09 at 13:38

Not quite on the mark, but backtracking-based regular expressions have an exponential worst case versus O(N) for DFA-based regular expressions, yet backtracking-based regular expressions are almost always used rather than DFA-based ones.

EDIT: (JFS)

Regular Expression Matching Can Be Simple And Fast (but is slow in Java, Perl, PHP, Python, Ruby, ...)

The power that backreferences add comes at great cost: in the worst case, the best known implementations require exponential search algorithms.

Regular expression engines:

This (DFA) approach is indeed more efficient, and can even be adapted to allow capturing and non-greedy matching, but it also has important drawbacks:

  • Lookarounds are not possible.
  • Backreferences are also not possible.
  • Regex precompilation takes longer and uses more memory.

On the bright side, as well as avoiding worst-case exponential running time, DFA approaches avoid worst-case stack usage that is linear in the size of the input data.
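A quick way to feel the exponential worst case with a backtracking engine, using Python's re module (which backtracks) and the classic pathological pattern (a+)+$ -- each additional 'a' roughly doubles the running time:

    import re
    import time

    pattern = re.compile(r'(a+)+$')        # nested quantifiers: catastrophic backtracking

    for n in range(18, 25):
        text = 'a' * n + 'b'               # the trailing 'b' forces the match to fail
        start = time.perf_counter()
        pattern.match(text)
        print(n, round(time.perf_counter() - start, 3), 'seconds')

    # A Thompson-NFA/DFA engine (RE2, grep's DFA, ...) answers the same question
    # in time linear in the length of the text.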


+5
Jan 29 '09 at 4:37

One example is from computational geometry. Polygon triangulation has a worst-case O(N) algorithm due to Chazelle, but it is almost never implemented in practice because of the difficulty of implementation and the huge constant.

+5
Nov 21 '10 at 20:59

There is a polynomial-time algorithm for determining primality (AKS), but in practice it is always faster to use an exponential-time algorithm, or to perform enough probabilistic calculations to have sufficient confidence.
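A minimal Miller-Rabin sketch, one of the probabilistic tests alluded to above (for illustration only, not hardened for production use):

    import random

    def is_probable_prime(n, rounds=20):
        """Miller-Rabin: each passing round leaves at most a 1/4 chance of error."""
        if n < 2:
            return False
        for p in (2, 3, 5, 7, 11, 13):
            if n % p == 0:
                return n == p
        d, r = n - 1, 0
        while d % 2 == 0:
            d //= 2
            r += 1                          # n - 1 = d * 2**r with d odd
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(r - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False                # found a witness: definitely composite
        return True                         # "probably prime"

    print(is_probable_prime(2**127 - 1))    # True -- a Mersenne prime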

+5
Feb 23 '11 at 16:28

Radix sort has O(n) time complexity for fixed-width inputs, but quicksort is used more often despite its worse asymptotic runtime, because the per-element overhead of radix sort is usually much higher.
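A minimal LSD radix sort sketch for non-negative fixed-width integers (one byte at a time). Note how much per-element work -- bucketing, copying, multiple passes -- it does compared with a comparison sort's inner loop:

    def radix_sort(values, width_bytes=4):
        """Sort non-negative ints that fit in width_bytes bytes: O(n * width_bytes)."""
        for shift in range(0, 8 * width_bytes, 8):
            buckets = [[] for _ in range(256)]
            for v in values:
                buckets[(v >> shift) & 0xFF].append(v)   # stable distribution by one byte
            values = [v for bucket in buckets for v in bucket]
        return values

    print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))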

+4
Jan 23 '09 at 19:25

Well, consider the travelling salesman problem. The ONLY perfect solution is to test all possible routes. However, that becomes impossible with our hardware and time limits as N increases. So we have thought of many heuristics.

Which brings us to the answer to your question: heuristics (worse) are better than brute force for NP-complete problems. This describes the situation in which "worse is better" is always true.
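A minimal sketch of one such heuristic, nearest neighbour: no optimality guarantee at all, but O(N^2) work instead of the O(N!) of brute force (the city list and helper name are made up for illustration):

    import math

    def nearest_neighbour_tour(points):
        """Greedy TSP heuristic: always visit the closest unvisited city next."""
        unvisited = list(range(1, len(points)))
        tour = [0]
        while unvisited:
            last = points[tour[-1]]
            nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
            unvisited.remove(nxt)
            tour.append(nxt)
        return tour                          # order of city indices, not guaranteed optimal

    cities = [(0, 0), (3, 1), (1, 4), (5, 5), (2, 2)]
    print(nearest_neighbour_tour(cities))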

+4
Jan 28 '09 at 3:07

When computing the median of a group of numbers, you can use an algorithm very similar to quicksort. You partition around a number, with all the larger ones going to one side and all the smaller ones to the other. Then you throw one side away and recursively compute the median of the bigger side. This takes O(n^2) in the worst case, but is quite fast (O(n) with a low constant) in the average case.

You can get guaranteed worst-case O(n) performance, with a constant of about 40, using the median-of-medians algorithm. In practice you would never use it.
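A quickselect-style sketch of the algorithm described above (randomized pivot, average O(n), worst case O(n^2)); the guaranteed-O(n) median-of-medians pivot rule is precisely the part you would, in practice, never bother adding:

    import random

    def quickselect(xs, k):
        """Return the k-th smallest element (0-based). Average O(n), worst case O(n^2)."""
        pivot = random.choice(xs)
        smaller = [x for x in xs if x < pivot]
        equal   = [x for x in xs if x == pivot]
        larger  = [x for x in xs if x > pivot]
        if k < len(smaller):
            return quickselect(smaller, k)
        if k < len(smaller) + len(equal):
            return pivot
        return quickselect(larger, k - len(smaller) - len(equal))

    data = [7, 1, 5, 3, 9, 8, 2]
    print(quickselect(data, len(data) // 2))   # the median: 5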

+4
Jan 29 '09 at 2:40

If I understand the question, you are asking for algorithms that are theoretically better but practically worse in all situations. Therefore, one would not expect them actually to be used except by mistake.

One possible example is universal memoization. Theoretically, all deterministic function calls could be memoized for all possible inputs, so that complex computations could be replaced by simple table lookups. For a wide range of problems, this technique productively trades time for storage space. But suppose there were a central repository of the results of all possible inputs for all possible functions used by all of humanity's computers. The first time anyone anywhere performed a calculation would be the last time; all subsequent attempts would result in a table lookup.

But there are several reasons I can think of for not doing this:

  • The amount of memory required to store all the results would likely be impossibly large. It seems likely that the number of bits needed would exceed the number of particles in the universe. (But even the task of estimating that number is daunting.)

  • It would be difficult to construct an efficient algorithm for memoizing such a huge problem space.

  • The cost of communicating with the central repository would likely exceed the benefit as the number of clients grows.

I am sure you can think of other issues.

In fact, this kind of time/space trade-off is incredibly common in practice. Ideally, all data would be stored in L1 cache, but because of size limitations you always need to put some data on disk or (horrors!) tape. Advancing technology reduces some of the pain of these trade-offs, but as I suggested above, there are limits.




In response to a comment by J. F. Sebastian:

Suppose that instead of a universal memoization repository, we consider a factorial repository. And it won't hold the results for all possible inputs; rather, it will be limited to the results from 1! to N!. Now it's easy to see that any computer that needed factorials would benefit from looking up the result rather than computing it. Even for calculating (N+1)! the lookup would be a huge win, since that calculation would reduce to N!(N+1).
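On a per-process scale the same trade-off is a one-liner in Python -- functools.lru_cache is a real standard-library decorator; it is the "central repository for all of humanity" part that does not scale:

    from functools import lru_cache

    @lru_cache(maxsize=None)             # the "repository": results stored, never recomputed
    def fact(n):
        return 1 if n < 2 else n * fact(n - 1)

    fact(500)                            # first call computes and caches 1! .. 500!
    fact(501)                            # now just one multiplication on top of a lookup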

Now, to make this “better” algorithm worse, we could either increase N or increase the number of computers using the repository.

But I am probably not grasping some subtlety of the question. The way I'm thinking about it, I keep coming up with examples that scale well until they don't.

+4
Jan 29 '09 at 19:47

Mergesort and Quicksort

Quicksort has average time complexity of O(n log n). It can sort arrays in place, i.e. with space complexity of O(1).

Merge sort also has average time complexity of O(n log n); however, its space complexity is much worse: Θ(n). (There is a special case for linked lists.)

Because the worst-case time complexity of quicksort is Θ(n^2) (i.e. all elements fall on the same side of every pivot), while mergesort's worst case is O(n log n), mergesort is the default choice for library implementers.

In this case, I think the predictability of mergesort's worst-case time complexity trumps quicksort's much lower memory requirements.

Given that it is possible to vastly reduce the likelihood of quicksort's worst-case time complexity (via random pivot selection, for example), I think one could argue that mergesort is the worse algorithm in all but the pathological case of quicksort.
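A sketch of that mitigation -- in-place quicksort with a random pivot (Lomuto partition), which keeps the O(1) extra space while making the Θ(n^2) case astronomically unlikely on any fixed input:

    import random

    def quicksort(a, lo=0, hi=None):
        """In-place quicksort with a random pivot."""
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return
        r = random.randint(lo, hi)
        a[r], a[hi] = a[hi], a[r]          # move a random pivot to the end
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            if a[j] <= pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        quicksort(a, lo, i - 1)
        quicksort(a, i + 1, hi)

    data = [5, 2, 9, 1, 5, 6]
    quicksort(data)
    print(data)                            # [1, 2, 5, 5, 6, 9]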

+3
Jan 28 '09 at 12:33

I've always understood the phrase "worse is better" to relate to problems whose correct solutions are very complex while an approximate (or good-enough) solution exists that is comparatively easier to comprehend.

This simplifies design, production and maintenance.

+2
Jan 23 '09 at 2:19

There is an O(n) algorithm for selecting the k-th largest element from an unsorted set, but it is rarely used instead of sorting, which is of course O(n log n).

+2
Jan 23 '09 at 8:59

Insertion sort, despite its O(n^2) complexity, is faster on small collections (n < 10) than any other sorting algorithm. That's because the nested loop is small and executes fast. Many libraries (including the STL) that implement a sort method actually use it on small subsets of the data to speed things up.
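A minimal sketch of the trick (the cutoff of 10 is just illustrative; real libraries tune the threshold empirically):

    def insertion_sort(a):
        """O(n^2), but the constant is tiny: for n < ~10 it beats O(n log n) sorts."""
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]             # shift larger elements right
                j -= 1
            a[j + 1] = key
        return a

    # Hybrid sorts dispatch on size, roughly:
    #     if len(segment) < 10: insertion_sort(segment) else: quicksort(segment)
    print(insertion_sort([5, 2, 4, 6, 1, 3]))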

+2
Jan 29 '09 at 2:31

Monte Carlo integration was already suggested, but a more specific example is Monte Carlo pricing in finance. Here the method is much easier to code and can handle more things than some alternatives, but it is much slower than, say, finite differences.

It is not practical to run a 20-dimensional finite-difference algorithm, but setting up a 20-dimensional Monte Carlo pricing run is easy.
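A minimal sketch of Monte Carlo pricing of a European call under the usual Black-Scholes assumptions (the parameter values are made up for illustration; a 1-D case like this is exactly where finite differences or the closed form win, but the same code generalizes to many underlyings almost unchanged):

    import math
    import random

    def mc_european_call(S0, K, r, sigma, T, paths=100_000):
        """Average the discounted payoff over simulated terminal prices."""
        payoff_sum = 0.0
        for _ in range(paths):
            z = random.gauss(0.0, 1.0)
            ST = S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
            payoff_sum += max(ST - K, 0.0)
        return math.exp(-r * T) * payoff_sum / paths

    print(mc_european_call(S0=100, K=100, r=0.05, sigma=0.2, T=1.0))   # ~10.45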

+1
Jan 29 '09 at 5:02

There are many examples.

For example:

  1. A y-fast trie gives O(log log u) time for successor/predecessor queries, but it has large constants, so a BST (which is O(log n)) is probably better.

  2. There is a worst-case O(1) algorithm for finding the longest common prefix of two words (machine integers), but it has a huge constant, so it is always better to use the trivial O(log u) algorithm (where u is the maximum value of a word). Even if u were the number of atoms in the observable universe, it would still probably be better to use the O(log u) solution.

  3. Same as 2, but for finding the MSB (most significant bit) of a word; a sketch of the trivial scan follows after this list.

  4. A fusion tree has query complexity O(log n / log log u), but with HUGE constants (much larger than in all the previous examples), while a BST can achieve the same in O(log n). So a BST is ALWAYS better (unless you have a nearly infinite amount of data, which is impossible).
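For item 3, a sketch of the trivial O(log u) scan that in practice beats the constant-time bit tricks from the literature (in real Python you would just call int.bit_length, shown alongside for comparison):

    def msb_index(x):
        """Index of the most significant set bit, by shifting: O(log u) iterations."""
        assert x > 0
        i = -1
        while x:
            x >>= 1
            i += 1
        return i

    print(msb_index(0b10110))              # 4
    print((0b10110).bit_length() - 1)      # same answer via the built-in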

+1
Sep 13 '19 at 20:20

Iterative Deepening

Compared to a trivial depth-first search augmented with alpha-beta pruning, an iterative deepening search used in conjunction with a poor (or nonexistent) branch-ordering heuristic would result in many more nodes being scanned. However, when a good branch-ordering heuristic is used, a significant portion of the tree is eliminated thanks to the enhanced effect of the alpha-beta pruning. A second advantage, unrelated to time or space complexity, is that a guess at the solution over the problem domain is established early, and that guess gets refined as the search progresses. It is this second advantage that makes iterative deepening so appealing in many problem domains.
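A game-independent sketch of plain iterative deepening DFS (the alpha-beta and move-ordering interplay is left out; the helper names and toy tree are made up). It redoes the shallow levels on every iteration -- the "worse" part -- yet the total work is dominated by the deepest level, and the shallowest solution is found first:

    def depth_limited_dfs(node, goal, limit, children):
        """Return a path to goal within the depth limit, or None."""
        if node == goal:
            return [node]
        if limit == 0:
            return None
        for child in children(node):
            path = depth_limited_dfs(child, goal, limit - 1, children)
            if path is not None:
                return [node] + path
        return None

    def iterative_deepening(start, goal, children, max_depth=50):
        for limit in range(max_depth + 1):       # redo shallow work each iteration...
            path = depth_limited_dfs(start, goal, limit, children)
            if path is not None:
                return path                      # ...but return the shallowest solution
        return None

    # Toy tree: node 7 sits two levels below the root.
    tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [8, 9]}
    print(iterative_deepening(1, 7, lambda n: tree.get(n, [])))   # [1, 3, 7]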

0
31 '10 4:16
Quicksort has worst-case time complexity of O(N^2)! Yet it is considered better than other sorting algorithms such as mergesort and heapsort, which have O(N log N) worst-case time complexity. The reasons may be: 1. in-place sorting, 2. stability, 3. the very small amount of code involved.
0
12 . '13 17:17


