How to use sort sorting (radix sorting, etc.) to sort strings?

Question

How to use sort sorting (radix sorting, etc.) to sort strings?

I know how to use radix sort to sort integers.

But how to use it to sort strings? or floating point numbers?

+4

sorting algorithm radix-sort

Jackson tale Mar 09 '12 at 17:44

source share

2 answers

Yes it is possible.

See Radix Sort, Sorting float data for floats. He uses the fact that floats cast for integer types are compared correctly (after eliminating negative values). See this article for more details.

For strings, you can solve the variable-length problem by sorting by the MSD method and make sure you stop going down when you encounter Nulls. See Radix Sort implemented in C ++ for a string .

+3

Ashelly Mar 09 '12 at 20:59

source share

Evgeny Kluev · Accepted Answer · 2012-03-09T19:58:01+0000

Radix sorting or any other distribution sorting can be used to sort floating point numbers if you ignore some of their features, such as infinity, non-numeric values and two different representations of zero. IEEE 754-2008 floating-point numbers have binary representations that are compatible in sort order with integers. Thus, if you exclude non-digits and rethink float or double as int32 or int64 , you can directly apply any sort to them. Edit: Negative floating point numbers need special handling (as indicated by AShelly) because their sort order is the opposite of the sort order of integers.

Using strings is harder due to their variable lengths. Another kind of distribution sorting (bucket sorting) can be used and is often used for strings. A few initial line characters are used to index the bucket, then any comparative sorting is used to sort the lines inside the buckets.

If all lines are almost equal in length and / or some method is used to enhance the differences between the lines (as described in Chapter 6, “FAST: Fast Architecture Sensitive Tree Search on Modern Processors and GPUs” ), then method sorting can also be used radix: divide the string into groups of characters (or better, groups of bits) of equal length, re-interpret these groups as integers and continue as if it were a radix sort for integers.

Edit:. All kinds of sorting distributions are guaranteed to work only for ASCII strings. Other string encodings may require a different sort order or may depend on the "collate" parameter for the locale.

How to use sort sorting (radix sorting, etc.) to sort strings?

More articles: