I am trying to compute a double-precision floating-point score from a UTF-8 encoded string in Python. The idea is to take the first 8 bytes of the string and build a float from them, so that strings ordered by their score are sorted lexicographically by their first 8 bytes (or possibly by their first 63 bits, after forcing every value to be positive to avoid sign errors).
For instance:
get_score(u'aaaaaaa') < get_score(u'aaaaaaab') < get_score(u'zzzzzzzz')
I tried computing the score as an integer using left shifts and XOR, but I am not sure how to convert that value into a float. I am also not sure whether there is a better way to do this.
How can I calculate the score for a string so that the condition above is met?
Edit: The string is UTF-8 encoded (per @Bakuriu's comment).
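To make the 63-bit idea concrete, here is a minimal sketch of one possible approach, not a definitive answer. It assumes Python 3 (for `int.from_bytes`), pads strings shorter than 8 bytes with zero bytes on the right, and reinterprets the resulting bit pattern as an IEEE 754 double with `struct`. This relies on the fact that non-negative, non-NaN doubles compare in the same order as their bit patterns read as unsigned integers. The function body is my own illustration.

```python
import struct

def get_score(text):
    # First 8 bytes of the UTF-8 encoding, zero-padded on the right
    # so that shorter strings sort before their extensions.
    prefix = text.encode('utf-8')[:8].ljust(8, b'\x00')
    # Big-endian: the first byte becomes the most significant one.
    as_int = int.from_bytes(prefix, 'big')
    # Drop the lowest bit so the sign bit is always 0 (63 bits remain).
    as_int >>= 1
    # Reinterpret the 64-bit pattern as a double (no numeric conversion).
    return struct.unpack('>d', struct.pack('>Q', as_int))[0]

print(get_score(u'aaaaaaa') < get_score(u'aaaaaaab') < get_score(u'zzzzzzzz'))
```

Two caveats with this sketch: strings that differ only in the lowest bit of their 8th byte get the same score, and the all-ones exponent (infinity or NaN) is avoided only because 0xFF never occurs as a byte in valid UTF-8, so arbitrary byte strings would need an extra check.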