Expected runtime of hybrid sorting and sorting

I myself study the third edition of CLRS, and here is one of the most difficult questions that I have encountered along with my answer as a service for everyone.

7.4-5 We can improve the quick sorting run time in practice by taking advantage of the quick insertion sorting time when its input is β€œalmost” sorted. After calling quicksort on a subarray with less than k elements, let it just return without sorting the subarray. After calling the top level to return quicksort, start insertion sort on the entire array to complete the sorting process. It is claimed that this sorting algorithm works in O(nk+nlg(n/k)) expected time. How do we choose k , both in theory and in practice?

+6
source share
3 answers

If you equate the equation nlgn > nk + nlog(n/k) , you get log k > k . It's impossible. Therefore, in theory this is not possible. But in practice, there are constant factors associated with insertion sorting and quick sorting. Take a look at the solution discussed in this pdf . You might want to update your answer.,

+5
source

Actually, the answer is k = 8 .

The algorithm you get is the composition of two anonymous functions, one of which is O(nk) , and the other is O(n lg n/k) . These anonymous functions hide average constants. Insertion is sorted at n^2/4 times in the average case, and randomized performance is performed at 1.386 n lg n in the average case. We want to find a value of k that minimizes the value of nk/4 + 1.386( n lg n/k ) equal to nk/4 + 1.386 n lg n - 1.386 n lg k . This means finding the maximum of the function 1.386 lg k - k/4 . This maximum value has a value of k = 8 .

+2
source

A leaf has an equal probability of being 1 to k in size.
Thus, the expected sheet size is k/2 .
If the expected leaf size is k/2 , then we expect n/(k/2)=(2n)/k such leaves.
For simplicity, suppose we expect n/k such leaves and that the expected size of each leaf is k .
The expected INSERTION-SORT is O(n^2) .
We found that in exercise 5.2-5 and problem 2-4c.
Thus, the expected operating time of INSERTION-SORT is O((n/k)*(k^2))=O(nk) .
If we expect our partition groups to be of size k , then it is expected that the height of the recursion tree will be lgn-lgk=lg(n/k) , since we expect that we will stop at lgk .
At each level of the recursion tree, there are O(n) operations.
This brings us to O(nlg(n/k)) .
We conclude that the expected runtime is O(nk+nlg(n/k)) .

In theory, we should choose k=lgn , since in this way we get the best expected run time O(nlgn) .

In practice, we should choose k for one of the values ​​near lgn , which gives us better performance than just running RANDOMIZED-QUICKSORT .

The second part of the answer uses the note of large o-output rather freely, so for a more accurate choice of k , please follow the link provided in the second answer of Ankush.

0
source

Source: https://habr.com/ru/post/909982/


All Articles