Why not always use heapsort?

Heapsort seems to have a worst-case complexity of O(n log n) and uses O(1) extra space for the sort operation.

That seems better than most other sorting algorithms. So why not always use heapsort as the sorting algorithm (and why do people use algorithms like merge sort or quicksort instead)?

In addition, I have seen people use the term "instability" in connection with heapsort. What does it mean?

+48
sorting algorithm heapsort
Nov 29 '11 at 12:57
5 answers

Stable sorting maintains the relative order of elements that have the same key. For example, imagine that your data set contains records with an employee ID and name. Initial order:

1, Jim
2, George
3, Jim
4, Sally
5, George

You want to sort by name. Stable sorting will arrange items in the following order:

2, George
5, George
1, Jim
3, Jim
4, Sally

Note that the duplicate George records appear in the same relative order as in the original list, and likewise the two Jim records.

Unstable sorting can arrange items as follows:

5, George
2, George
1, Jim
3, Jim
4, Sally

Heapsort is unstable because heap operations can change the relative order of equal elements. Quicksort implementations are not necessarily stable either; it depends on how the partitioning is implemented.
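To make this concrete, here is a small Python sketch (my illustration, not part of the original answer). Python's built-in `sorted` is stable, while a textbook in-place heapsort that compares records only by name can reorder the equal-keyed entries:

```python
def heapsort(a, key=lambda x: x):
    """Textbook in-place max-heap sort; compares records by key only."""
    n = len(a)

    def sift_down(root, end):
        while 2 * root + 1 < end:
            child = 2 * root + 1
            if child + 1 < end and key(a[child]) < key(a[child + 1]):
                child += 1                     # pick the larger child
            if key(a[root]) < key(a[child]):
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    for start in range(n // 2 - 1, -1, -1):    # build the max-heap
        sift_down(start, n)
    for end in range(n - 1, 0, -1):            # repeatedly move the max to the end
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)

records = [(1, "Jim"), (2, "George"), (3, "Jim"), (4, "Sally"), (5, "George")]

stable = sorted(records, key=lambda r: r[1])   # Python's built-in sort is stable
print(stable)
# [(2, 'George'), (5, 'George'), (1, 'Jim'), (3, 'Jim'), (4, 'Sally')]

unstable = list(records)
heapsort(unstable, key=lambda r: r[1])
print(unstable)
# [(5, 'George'), (2, 'George'), (3, 'Jim'), (1, 'Jim'), (4, 'Sally')]
```

The exact unstable order depends on the heapsort implementation; the names still come out sorted, but the original relative order of equal keys is lost.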

Even though heapsort has a worst-case complexity of O(n log n), that does not tell the whole story. In real implementations there are constant factors that the theoretical analysis does not take into account. In the case of heapsort vs. quicksort, it turns out that there are ways (such as median-of-5 pivot selection) to make quicksort's worst cases very rare. In addition, maintaining the heap is not free.

Given an array of normally distributed data, quicksort and heapsort will both run in O(n log n), but quicksort will run faster because its constant factors are smaller. Simply put, partitioning is faster than maintaining the heap.
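A rough way to see those constant factors (my own sketch, not from the answer) is to race a heapq-based heapsort against the built-in sort. Note that CPython's built-in sort is Timsort rather than quicksort, but the constant-factor gap against heapsort shows up the same way:

```python
import heapq
import random
import timeit

def heap_sort(xs):
    """Heapsort via heapq: O(n) heapify, then n pops of O(log n) each."""
    h = list(xs)
    heapq.heapify(h)
    return [heapq.heappop(h) for _ in range(len(xs))]

data = [random.random() for _ in range(100_000)]
t_heap = timeit.timeit(lambda: heap_sort(data), number=3)
t_builtin = timeit.timeit(lambda: sorted(data), number=3)
print(f"heap_sort: {t_heap:.2f}s   built-in sort: {t_builtin:.2f}s")
```

On typical hardware the built-in sort wins by a clear constant factor, even though both are O(n log n).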

+88
Nov 29 '11 at 14:09

Heapsort has a worst-case complexity of O(n log n). Nevertheless, empirical studies show that quicksort (among other sorting algorithms) is usually much faster than heapsort in practice, even though its worst-case complexity is O(n²): http://www.cs.auckland.ac.nz/~jmor159/PLDS210/qsort3.html

Also, from the quicksort article on Wikipedia:

Quicksort's most direct competitor is heapsort. Heapsort's running time is O(n log n) even in the worst case, but heapsort is assumed to be somewhat slower on average than well-implemented quicksort. This is still debated in the research literature, with some publications indicating the opposite. [13] [14] Introsort is a variant of quicksort that switches to heapsort when a bad case is detected, in order to avoid quicksort's worst-case running time. If you know in advance that heapsort will be needed, using it directly will be faster than waiting for introsort to switch to it.
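The switching logic the quote describes can be sketched in Python (a rough illustration with made-up names, not any library's actual implementation): run quicksort while counting recursion depth, and finish a range with heapsort once a depth budget of about 2·log2(n) is exhausted:

```python
import heapq
import math
import random

def introsort(a):
    """Sort list a in place: quicksort that falls back to heapsort."""
    if len(a) > 1:
        _intro(a, 0, len(a), 2 * int(math.log2(len(a))))

def _intro(a, lo, hi, depth):
    while hi - lo > 16:
        if depth == 0:
            _heapsort_range(a, lo, hi)   # bad case detected: keep O(n log n)
            return
        depth -= 1
        p = _partition(a, lo, hi)
        _intro(a, lo, p + 1, depth)
        lo = p + 1                       # iterate on the right half
    _insertion_sort(a, lo, hi)           # small ranges: insertion sort

def _partition(a, lo, hi):
    # Hoare partition around the middle element.
    pivot = a[(lo + hi - 1) // 2]
    i, j = lo - 1, hi
    while True:
        i += 1
        while a[i] < pivot:
            i += 1
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]

def _heapsort_range(a, lo, hi):
    h = a[lo:hi]
    heapq.heapify(h)
    for k in range(lo, hi):
        a[k] = heapq.heappop(h)

def _insertion_sort(a, lo, hi):
    for i in range(lo + 1, hi):
        x, j = a[i], i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x

data = [random.randrange(1000) for _ in range(500)]
introsort(data)
print(data == sorted(data))  # True
```

Production introsorts (e.g. in C++ standard libraries) add further refinements, but the depth-limit-then-heapsort idea is the core of the guarantee.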

In any case, quicksort should never be used in applications that require a guaranteed response time!

Stack Overflow source: Quicksort vs heapsort

+8
Nov 29 '11 at 13:06

No silver bullet ...

Just to mention another argument I have not seen here:

If your data set is really huge and does not fit into memory, then merge sort works like a charm. It is often used in clusters where the dataset can span hundreds of machines.
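For intuition, here is a minimal single-machine sketch of external merge sort (my illustration; the function names are made up): sort chunks that fit in memory, spill each sorted run to a temp file, then stream a lazy k-way merge of the runs with `heapq.merge`:

```python
import heapq
import os
import random
import tempfile

def external_sort(items, chunk_size=1000):
    """Sort a stream of strings that may not fit in memory."""
    runs, chunk = [], []
    for item in items:                       # phase 1: write sorted runs to disk
        chunk.append(item)
        if len(chunk) >= chunk_size:
            runs.append(_spill(sorted(chunk)))
            chunk = []
    if chunk:
        runs.append(_spill(sorted(chunk)))
    files = [open(p) for p in runs]
    try:                                     # phase 2: lazy k-way merge of runs
        yield from heapq.merge(*(map(str.rstrip, f) for f in files))
    finally:
        for f in files:
            f.close()
        for p in runs:
            os.remove(p)

def _spill(sorted_chunk):
    fd, path = tempfile.mkstemp(text=True)
    with os.fdopen(fd, "w") as f:
        f.writelines(s + "\n" for s in sorted_chunk)
    return path

words = [f"{random.randrange(100_000):06d}" for _ in range(5000)]
print(list(external_sort(words, chunk_size=512)) == sorted(words))  # True
```

In a real cluster the runs live on different machines and the merge is itself distributed, but the sort-runs-then-merge structure is the same.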

+6
Nov 29 '11 at 13:25

Stable sorting algorithms maintain the relative order of records with equal keys.

Some applications need such stability; in most cases, though, nobody cares about it. Google is your friend if you want examples.

As for your remark that "people use sorting mechanisms like Merge or Quick sort": I would bet that most people use whatever is built into their language and don't think much about the underlying sorting algorithm. Those who roll their own have probably never heard of heapsort (the latter is personal experience).

The last and biggest reason is that not everyone wants a heap. Some people need a sorted list. If average programmer Joe's boss says "sort this list" and Joe answers "here's this heap data structure you've never heard of, boss!", Joe's next performance review will not be great.

0
Nov 29 '11 at 13:41

When I worked briefly on Tandem NonStop computers in the mid-80s, I noticed that the system sort routine in the kernel was heapsort, precisely because it gave guaranteed O(N log N) performance. I don't know of anyone who had a reason to use it, though, so I don't know how it worked out in practice. I like heapsort, but beyond the drawbacks noted above, I have heard it said that it makes poor use of modern memory hierarchies: it jumps around memory, whereas quicksort and even a small-radix sort end up mixing a relatively small number of sequential read and write streams, which use caches more effectively.

0
Nov 29 '11 at 19:02


