Sort strings using merge sort

Question


What is the worst-case complexity of sorting n strings of n characters each? Will it just be n times the usual O(n log n) worst case, or something else?

+6

string sorting algorithm mergesort

Abhishek Feb 26 '12 at 0:29


3 answers

When you use big-O notation with two quantities of different sizes, you usually want to use different variables, such as M and N.

So, if your merge sort is O(N log N), where N is the number of strings, and comparing two strings is O(M), where M scales with the length of a string, then you are left with:

 O(N log N) * O(M)

or

 O(MN log N)

where M is the string length and N is the number of strings. You want to use different labels because they do not mean the same thing.

In the odd case where the average string length scales with the number of strings, for example if you have a matrix stored as rows, you can say that M = N, and then you have O(N^2 log N).
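To make the O(MN log N) bound concrete, here is a minimal sketch of a merge sort over strings that counts character comparisons. The names `merge_sort`, `less_or_equal`, and the `stats` dictionary are made up for this illustration and are not part of the original answer.

```python
def less_or_equal(a, b, stats):
    """Compare two strings character by character: O(M) in the worst case."""
    for ca, cb in zip(a, b):
        stats["char_cmps"] += 1
        if ca != cb:
            return ca < cb
    # One string is a prefix of the other; the shorter one sorts first.
    return len(a) <= len(b)


def merge_sort(strings, stats):
    """Return a sorted copy of `strings` using O(N log N) string comparisons."""
    if len(strings) <= 1:
        return list(strings)
    mid = len(strings) // 2
    left = merge_sort(strings[:mid], stats)
    right = merge_sort(strings[mid:], stats)

    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Each of these comparisons may cost up to M character comparisons.
        if less_or_equal(left[i], right[j], stats):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


stats = {"char_cmps": 0}
# Long shared prefixes force every comparison to scan almost the whole string.
words = ["a" * 8 + c for c in "dcba"]
print(merge_sort(words, stats), stats["char_cmps"])
```

Strings that share long prefixes are the bad case here, since each comparison has to scan nearly M characters before finding a difference.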

+6

Donald miner Feb 26 '12 at 0:34


Sorting N items with merge sort requires O(N log N) comparisons. If comparing two elements takes O(1) time, the total running time is O(N log N). However, comparing two strings of length N takes O(N) time, so a naive implementation may take O(N*N log N) time.

This seems wasteful because we are not exploiting the fact that there are only N strings to compare. We could preprocess the strings so that comparisons take less time on average.

Here is an idea. Build a trie and insert the N strings into it. The trie will have O(N*N) nodes and takes O(N*N) time to build. Traverse the trie and store a "rank" in each node; if R(N1) < R(N2), then the string associated with node N1 precedes the string associated with node N2 in dictionary order.

Now proceed with merge sort, doing each comparison in O(1) time by looking up the ranks in the trie. The total running time will be O(N*N + N*logN) = O(N*N).

Edit: My answer is very similar to @amit's. However, I continue with merge sort, whereas he continues with radix sort after the trie-building phase.
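The idea above could be sketched roughly as follows. This is only an illustration under some assumptions: the `Node` class and the helper names are hypothetical, ranks are assigned by a pre-order traversal that visits children in character order, and the alphabet is treated as constant-sized so that sorting the children of a node counts as O(1).

```python
class Node:
    __slots__ = ("children", "terminal", "rank")

    def __init__(self):
        self.children = {}     # character -> child Node
        self.terminal = False  # True if some input string ends here
        self.rank = -1         # dictionary-order rank, set during traversal


def build_ranked_trie(strings):
    """Insert all strings, then rank the terminal nodes in dictionary order."""
    root = Node()
    for s in strings:
        node = root
        for ch in s:
            nxt = node.children.get(ch)
            if nxt is None:
                nxt = node.children[ch] = Node()
            node = nxt
        node.terminal = True

    # Iterative pre-order traversal (avoids recursion limits on long strings).
    rank = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node.terminal:
            node.rank = rank
            rank += 1
        # Push children in reverse so the smallest character is popped first.
        for ch in sorted(node.children, reverse=True):
            stack.append(node.children[ch])
    return root


def rank_of(root, s):
    """Walk the trie once to find the precomputed rank of s: O(len(s))."""
    node = root
    for ch in s:
        node = node.children[ch]
    return node.rank


def merge_sort_by_rank(items):
    """Merge sort over (rank, string) pairs; each comparison is O(1)."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort_by_rank(items[:mid])
    right = merge_sort_by_rank(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] <= right[j][0]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def sort_with_trie_ranks(strings):
    root = build_ranked_trie(strings)
    keyed = [(rank_of(root, s), s) for s in strings]
    return [s for _, s in merge_sort_by_rank(keyed)]


print(sort_with_trie_ranks(["banana", "apple", "app", "cherry"]))
```

Looking up each string's rank once up front costs O(N) per string, so it fits within the O(N*N) budget of the trie construction.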

0

Ali Ferhat Feb 26 '12 at 9:11


amit · Accepted Answer · 2012-02-26T06:55:58+0000

As @orangeoctopus said, using a standard comparison-based sorting algorithm on a collection of n strings of length n takes O(n^2 * log n) time.

However, note that you can do it in O(n^2) with variations of radix sort.

The simplest way to do it [in my opinion] is:

Build a trie and populate it with all your strings. Inserting each string is O(n) and you do it n times, for a total of O(n^2).
Do a DFS on the trie, visiting children in character order; each time you encounter an end-of-string mark, add that string to the sorted collection. Strings are added in lexicographic order this way, so your list will be sorted lexicographically when you are done.

It is easy to see that you cannot do better than O(n^2), since just reading the data is O(n^2), so this solution is optimal in terms of big-O time complexity.
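The two steps above might look like the following sketch. The function name `trie_sort` and the dictionary-based node layout are assumptions made for this example. Each terminal node keeps references to the strings that end there, so duplicates are preserved and no prefixes need to be rebuilt during the traversal.

```python
def trie_sort(strings):
    """Sort strings by inserting them into a trie and reading it back with a DFS."""
    root = {"words": [], "children": {}}

    # Step 1: insert every string; O(len(s)) per string.
    for s in strings:
        node = root
        for ch in s:
            child = node["children"].get(ch)
            if child is None:
                child = node["children"][ch] = {"words": [], "children": {}}
            node = child
        node["words"].append(s)  # end-of-string mark (keeps duplicates)

    # Step 2: iterative pre-order DFS, children visited in character order.
    result = []
    stack = [root]
    while stack:
        node = stack.pop()
        result.extend(node["words"])
        # Push in reverse order so the smallest character is processed first.
        for ch in sorted(node["children"], reverse=True):
            stack.append(node["children"][ch])
    return result


print(trie_sort(["pear", "apple", "fig", "app", "apple"]))
```

Sorting the children at each node adds a factor that depends on the alphabet size, which this sketch assumes to be a constant.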
