I'm trying to sort strings of letters and numbers in an "intuitive" (natural) alphanumeric order with the Unix sort
command, but I can't get it to sort correctly. I have this file:
$ cat ~/headers
@42EBKAAXX090828:6:100:1699:328/2
@42EBKAAXX090828:6:10:1077:1883/2
@42EBKAAXX090828:6:102:785:808/2
I would like to sort it in natural alphanumeric order, so that the first line is @42EBKAAXX090828:6:10:... (since 10 is less than 100 and 102), the second is @42EBKAAXX090828:6:100... and the third is @42EBKAAXX090828:6:102:204:1871/2.
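To make the ordering I have in mind concrete, here is a minimal Python sketch. The helper name natural_key is just something I made up for illustration: it splits each line into digit and non-digit runs and compares the digit runs as integers. It sorts everything in memory, so I assume it would not scale as-is to the multi-GB files mentioned below.

```python
import re

def natural_key(line):
    # re.split with a capturing group returns alternating pieces:
    # even indices are non-digit text, odd indices are digit runs.
    parts = re.split(r"(\d+)", line)
    # Compare digit runs numerically and everything else as plain text.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

headers = [
    "@42EBKAAXX090828:6:100:1699:328/2",
    "@42EBKAAXX090828:6:10:1077:1883/2",
    "@42EBKAAXX090828:6:102:785:808/2",
]

sorted_headers = sorted(headers, key=natural_key)
for header in sorted_headers:
    print(header)
```

For these three lines, this puts the :10: line first, then :100:, then :102:, which is the order I am after.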
I know I could sort on a particular position within the line, but the position of the :
can change, so that would not be a general, workable solution here.
I tried:
sort --stable -k1,1 ~/headers > foo
with various combinations of the -n and -u options, but none of them gives the correct order.
How can this be done efficiently, either from bash using sort or from Python? I would like to apply this to 4-5 GB files containing millions of lines.
Thanks!