Sorting and time complexity

I "invented" the "new" sorting algorithm. Well, I understand that I can’t come up with something good, so I tried to find it on Wikipedia, but all the sorting algorithms seem to be not mine. Therefore, I have three questions:

  • What is the name of this algorithm?
  • Why does it suck? (best, average and worst-case time complexity).
  • Can this idea be put to better use?

So, the idea of my algorithm is: if we have an array, we can count the number of elements that are already in sorted order, and if this number is less than half the length, we can reverse the array to make it more sorted. After that we sort the first half and the second half of the array. In the best case we only need O(n): when the array is already completely sorted in either ascending or descending order. I have trouble evaluating the average and worst-case time complexity.

C# code:

public static void Reverse(int[] array, int begin, int end)
{
    int length = end - begin;
    for (int i = 0; i < length / 2; i++)
        Algorithms.Swap(ref array[begin + i], ref array[begin + length - i - 1]);
}

public static bool ReverseIf(int[] array, int begin, int end)
{
    int countSorted = 1;
    for (int i = begin + 1; i < end; i++)
        if (array[i - 1] <= array[i])
            countSorted++;

    int length = end - begin;
    if (countSorted <= length / 2)
        Reverse(array, begin, end);

    if (countSorted == 1 || countSorted == (end - begin))
        return true;
    else
        return false;
}

public static void ReverseSort(int[] array, int begin, int end)
{
    if (begin == end || begin == end + 1)
        return;

    // If we used an if-statement here instead of a while-loop, the array {2, 3, 1}
    // would be transformed into {2, 1, 3} and the algorithm would stop.
    while (!ReverseIf(array, begin, end))
    {
        int pivot = begin + (end - begin) / 2;
        ReverseSort(array, begin, pivot + 1);
        ReverseSort(array, pivot, end);
    }
}

public static void ReverseSort(int[] array)
{
    ReverseSort(array, 0, array.Length);
}
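The question does not include Algorithms.Swap, so here is a minimal sketch of how the code above could be compiled and exercised. The Swap body below is only my assumption (a plain by-reference exchange), and the Demo class and Main are scaffolding introduced just for this example:

using System;

public static class Algorithms
{
    // Assumption: the question's Algorithms.Swap simply exchanges two ints by reference.
    public static void Swap(ref int a, ref int b)
    {
        int tmp = a;
        a = b;
        b = tmp;
    }
}

public static class Demo
{
    // Paste the question's Reverse, ReverseIf and ReverseSort methods into this class.

    public static void Main()
    {
        int[] data = { 2, 3, 1, 5, 4 };
        ReverseSort(data);
        Console.WriteLine(string.Join(", ", data)); // prints: 1, 2, 3, 4, 5
    }
}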

PS: Sorry for my English.

2 answers

The best case is Theta(n), for example for a sorted array. The worst case is Theta(n^2 log n).

Upper bound

The secondary subproblems have a sorted array preceded or followed by one arbitrary element. These are O(n log n). If the arbitrary element precedes the sorted part: do O(n) work, solve a secondary subproblem in the first half, then one in the second half, then do O(n) more work, for O(n log n) in total. If the arbitrary element follows the sorted part: do O(n) work, sort the already sorted first half (O(n)), solve a secondary subproblem in the second half, do O(n) work, solve a secondary subproblem in the first half, sort the already sorted second half (O(n)), and do O(n) more work, again O(n log n) in total.

Now, in the general case, we solve the two primary subproblems in the two halves, and then slowly exchange elements across the pivot using secondary invocations. O(n) such exchanges are needed, so a direct application of the Master theorem gives a bound of O(n^2 log n).
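To make the recurrences explicit (my own paraphrase of the argument above, written in LaTeX; T_sec and T_pri are names I am introducing here for the secondary and primary subproblems):

$$
\begin{aligned}
T_{\mathrm{sec}}(n) &= 2\,T_{\mathrm{sec}}(n/2) + O(n) = O(n \log n),\\
T_{\mathrm{pri}}(n) &= 2\,T_{\mathrm{pri}}(n/2) + O(n)\cdot O(n \log n) = O(n^2 \log n) \quad \text{(Master theorem, case 3)}.
\end{aligned}
$$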

Lower bound

For k >= 3, we construct an array A(k) of size 2^k recursively, using the above analysis as a guide. The bad cases are the arrays [2^k + 1] + A(k).

Let A(3) = [1, ..., 8]. This sorted base case does not cause Reverse to be called.

For k > 3, let A(k) = [2^(k-1) + A(k-1)[1], ..., 2^(k-1) + A(k-1)[2^(k-1)]] + A(k-1). Note that the primary subproblems of [2^k + 1] + A(k) are equivalent to [2^(k-1) + 1] + A(k-1).

After the initial recursive calls, the array is [2^(k-1) + 1, ..., 2^k, 1, ..., 2^(k-1), 2^k + 1]. There are Omega(2^k) elements that have to move Omega(2^k) positions, and each of the secondary invocations that moves an element that far has O(1) sorted subproblems and therefore costs Omega(n log n).
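For concreteness, here is a small sketch (my own illustration, not part of the original answer) that builds these bad-case inputs as described above; BuildA and BuildBadCase are hypothetical names introduced only for this example:

using System;
using System.Linq;

public static class BadCaseBuilder
{
    // A(3) = [1, ..., 8]; for k > 3, A(k) is A(k-1) with 2^(k-1) added to each element,
    // followed by A(k-1) itself.
    public static int[] BuildA(int k)
    {
        if (k < 3) throw new ArgumentOutOfRangeException(nameof(k));
        if (k == 3) return Enumerable.Range(1, 8).ToArray();

        int[] prev = BuildA(k - 1);
        int half = 1 << (k - 1);
        return prev.Select(x => x + half).Concat(prev).ToArray();
    }

    // The bad case from the lower-bound argument: the single element 2^k + 1 followed by A(k).
    public static int[] BuildBadCase(int k)
    {
        return new[] { (1 << k) + 1 }.Concat(BuildA(k)).ToArray();
    }
}

For example, BuildBadCase(4) yields [17, 9, 10, ..., 16, 1, 2, ..., 8], which can be fed to ReverseSort from the question to watch its behaviour on a bad case.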


Obviously more coffee is needed: the primary subproblems are irrelevant. That does not harm the analysis of the average case, which is Theta(n^2 log n).

With constant probability, the first half of the array contains at least half of the smallest quartile and at least half of the largest quartile. In that case, regardless of whether Reverse happens, there are Omega(n) elements that have to move Omega(n) positions via secondary invocations.


It seems that this algorithm, even though it performs terribly on “random” data (as Per shows in their answer), is reasonably efficient at “fixing up” arrays that are “almost sorted”. Thus, if you decide to develop this idea further (personally I would not, but you could treat it as an exercise), you should focus on this strength.

The Wikipedia article "Inversion" addresses this problem very well. Mahmoud’s book is quite insightful, noting that there are various ways to measure “sortedness”. For example, if we use the number of inversions to characterize an “almost sorted” array, then we can use insertion sort to sort it very quickly. However, if your arrays are "almost sorted" in a slightly different way (for example, a deck of cards that has been cut or loosely shuffled), then insertion sort will not be the best sort to "fix" the list.

  • Input: an array of size N that was already sorted, with approximately N/k inversions.

You could do something like the following for your algorithm (a rough C# sketch appears after the list):

  • Calculate the number of inversions (O(N lg(lg(N))), or assume it is small and skip this step).
  • If the number of inversions is < [threshold], sort the array using insertion sort (it will be fast).
  • Otherwise, the array is not close to being sorted; use your favorite comparison-based sorting algorithm (or better).
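Here is a minimal sketch of that recipe (my own illustration; the class name, the threshold, and the use of a standard O(N log N) merge-based inversion count instead of the O(N lg(lg(N))) approach mentioned above are all assumptions made just for this example):

using System;

public static class NearlySortedSort
{
    // Sort 'array': use insertion sort when the number of inversions is small,
    // otherwise fall back to the built-in comparison sort.
    public static void Sort(int[] array)
    {
        long threshold = 2L * array.Length;                // arbitrary cutoff for "almost sorted"
        long inversions = CountInversions((int[])array.Clone());
        if (inversions < threshold)
            InsertionSort(array);                          // fast on nearly sorted input
        else
            Array.Sort(array);                             // general-purpose fallback
    }

    // Standard insertion sort: O(N + number of inversions) operations.
    private static void InsertionSort(int[] a)
    {
        for (int i = 1; i < a.Length; i++)
        {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key)
            {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Merge-sort based inversion count, O(N log N); sorts its argument as a side effect,
    // which is why Sort() passes in a clone.
    private static long CountInversions(int[] a)
    {
        if (a.Length < 2) return 0;
        int mid = a.Length / 2;
        int[] left = a[..mid];
        int[] right = a[mid..];
        long count = CountInversions(left) + CountInversions(right);

        int li = 0, ri = 0, k = 0;
        while (li < left.Length && ri < right.Length)
        {
            if (left[li] <= right[ri])
                a[k++] = left[li++];
            else
            {
                a[k++] = right[ri++];
                count += left.Length - li;                 // each remaining left element forms an inversion
            }
        }
        while (li < left.Length) a[k++] = left[li++];
        while (ri < right.Length) a[k++] = right[ri++];
        return count;
    }
}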

There are better ways to do this; you can "fix" such an array in at least O(log(N) * (# new elements)) time, if you preprocess your array or use the right data structure, for example an array with linked-list-like properties, or something similar that supports binary search.

You can generalize this idea even further. How well "fixing" the array will work depends on the type of fix that is required. Thus, if you update those statistics whenever you add an item to the list or modify it, you can dispatch to a good "fix-up" algorithm.

But unfortunately, that would be a pain to code. You might get away with something simpler: what you probably need is a priority queue.


Source: https://habr.com/ru/post/1383292/

