I have a very large file, more than 100 GB (many billions of lines), and I would like to perform a two-level sort on it as quickly as possible on a Unix system with limited memory. This will be one step in a large Perl script, so I would like to use Perl if possible.
So how can I do this? My data is as follows:
A 129
B 192
A 388
D 148
D 911
A 117
... but for billions of lines. I need to sort by letter first and then by number. Would it be easier to use Unix sort, for example:
sort -k1,2 myfile
Or can I do it all in Perl? My system will have approximately 16 GB of memory, but the file is about 100 GB.
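To make the intended ordering concrete, here is a minimal sketch of how the two keys could be spelled out for GNU sort. Note that `-k1,2` compares fields 1 through 2 as one piece of text, so a line like `A 1000` would sort before `A 129`; separate keys with a numeric second key avoid that. The file names `sample.txt` and `sorted.txt`, the `8G` buffer size, and the `/tmp` scratch directory are placeholders for illustration, not tested settings for the real 100 GB file.

```shell
# Small sample standing in for the real file (hypothetical name: sample.txt).
printf '%s\n' 'A 129' 'B 192' 'A 388' 'D 148' 'D 911' 'A 117' > sample.txt

# -k1,1    : first key is the letter column only
# -k2,2n   : second key is the number column, compared numerically
# -S 8G    : upper bound on the in-memory buffer (assumed value, leaves headroom on 16 GB)
# -T /tmp  : directory for temporary merge files (placeholder; needs roughly input-sized free space)
# LC_ALL=C : plain byte-order comparison instead of locale-aware collation
LC_ALL=C sort -k1,1 -k2,2n -S 8G -T /tmp sample.txt > sorted.txt

cat sorted.txt
```

GNU sort already does an external merge sort, spilling sorted runs to the `-T` directory when the input does not fit in the buffer, so the input does not have to fit in RAM. From a Perl script, a command like this could be run with `system`.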
Thanks for any suggestions!