Python sort and empty strings

Hi, I am using the sorted() function in Python to order a two-dimensional array (I want to sort by column, the same way a classic spreadsheet does).

In the example below, I use itemgetter(0) to sort the grid by the contents of the first column.

But sorted() returns rows whose first cell is an empty string before the non-empty ones.

    >>> import operator
    >>> res = [['charly','male','london'],
    ...        ['bob','male','paris'],
    ...        ['alice','female','rome'],
    ...        ['','unknown','somewhere']]
    >>> sorted(res,key=operator.itemgetter(0))
    [['', 'unknown', 'somewhere'], ['alice', 'female', 'rome'], ['bob', 'male', 'paris'], ['charly', 'male', 'london']]

whereas I need it to return this:

 [['alice', 'female', 'rome'], ['bob', 'male', 'paris'], ['charly', 'male', 'london'], ['', 'unknown', 'somewhere']] 

Is there an easy way to do this?

+4
4 answers

Use a different key function. This one will work:

 sorted(res, key=lambda x: (x[0] == "", x[0].lower())) 

The key is a tuple with False (0) or True (1) in the first position, where True indicates that the first element of the row is empty. The second position holds the name field from your original row. Python then sorts first into the groups of non-empty and empty names, and then by name within the non-empty group. (Python also sorts by name within the empty-name group, but since every name there is the empty string, that changes nothing.)

I also took the liberty of making the sort case-insensitive by converting the names to lowercase in the key.
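To make the tuple key concrete, here is a minimal sketch assuming Python 3 and the `res` grid from the question. The helper name `empty_last_key` and its `col` parameter are made up for illustration; they are not part of the original answer.

```python
res = [
    ['charly', 'male', 'london'],
    ['bob', 'male', 'paris'],
    ['alice', 'female', 'rome'],
    ['', 'unknown', 'somewhere'],
]


def empty_last_key(row, col=0):
    """Hypothetical helper: sort key that places rows with an empty cell last."""
    cell = row[col]
    # False sorts before True, so non-empty cells come first;
    # ties within a group are broken by the lowercased cell text.
    return (cell == "", cell.lower())


# The keys Python actually compares, one per row, in input order.
keys = [empty_last_key(row) for row in res]

# Rows ordered by first column, with the empty name at the end.
ordered = sorted(res, key=empty_last_key)
print(ordered)
```

Passing a different `col` (for example through `functools.partial`) would sort by another column in the same spreadsheet-like way.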

Just replacing empty names with "ZZZZZZ" or something else that sorts last alphabetically is tempting, but it fails the first time some joker enters his name as "ZZZZZZZZ" as a test. I think something like '\xff' * 100 might work, but it still feels like a hack (and has potential Unicode pitfalls).
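A small sketch of that failure mode, assuming Python 3. The sample `names` list is invented for the demonstration, and both keys lowercase the name so that only the placeholder differs between them.

```python
names = ['bob', 'ZZZZZZZZ', '', 'alice']

# Placeholder approach: pretend an empty name is 'zzzzzz'.
# A real name that sorts after the placeholder ends up behind the empty one.
by_placeholder = sorted(names, key=lambda s: s.lower() if s else 'zzzzzz')

# Tuple approach: the emptiness flag is compared first,
# so no real name can ever sort after the empty one.
by_tuple = sorted(names, key=lambda s: (s == '', s.lower()))

print(by_placeholder)  # ['alice', 'bob', '', 'ZZZZZZZZ']
print(by_tuple)        # ['alice', 'bob', 'ZZZZZZZZ', '']
```

Making the placeholder longer only moves the problem; the flag in the tuple removes it.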

+18

You can pass a key function that returns the actual value, or 'z' repeated 100 times if the first element is empty (empty strings evaluate to False).

 sorted(res, key=lambda x: x[0] if x[0] else 'z' * 100)
+1

This works, though it is verbose:

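    # Python 2 only: cmp() and the cmp= argument of sorted() were removed in Python 3.
    def cmp_str_emptylast(s1, s2):
        # If either string is empty, order the empty one last.
        if not s1 or not s2:
            return bool(s2) - bool(s1)
        return cmp(s1, s2)

    sorted(res, key=operator.itemgetter(0), cmp=cmp_str_emptylast)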
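For readers on Python 3, a rough equivalent can be sketched with `functools.cmp_to_key`, which wraps a comparison function into a key. This is an adaptation rather than the answer's original code: the built-in `cmp()` is replaced by the `(a > b) - (a < b)` idiom, and the name `first_col` is introduced only for this example.

```python
import functools
import operator

res = [
    ['charly', 'male', 'london'],
    ['bob', 'male', 'paris'],
    ['alice', 'female', 'rome'],
    ['', 'unknown', 'somewhere'],
]


def cmp_str_emptylast(s1, s2):
    # If either string is empty, the empty one compares as "greater",
    # which places it last; two empty strings compare as equal.
    if not s1 or not s2:
        return bool(s2) - bool(s1)
    # Stand-in for Python 2's cmp(): returns -1, 0 or 1.
    return (s1 > s2) - (s1 < s2)


first_col = operator.itemgetter(0)

# Compare whole rows by applying the string comparison to their first cells.
row_key = functools.cmp_to_key(
    lambda a, b: cmp_str_emptylast(first_col(a), first_col(b))
)

ordered = sorted(res, key=row_key)
print(ordered)
```

A comparison function is called many more times than a plain key function, so the tuple key is usually the simpler choice; this version is mainly useful when porting existing `cmp`-style code.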
0
 key=lambda x: x[0] if x[0] else '\xff\xff\xff\xff\xff\xff\xff\xff\xff' 
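One caveat with this placeholder, sketched here under the assumption of Python 3, where `'\xff'` is the character U+00FF: any name starting with a character above U+00FF still sorts after it. The sample names are invented for the demonstration.

```python
names = ['bob', '', 'Łukasz']  # 'Ł' is U+0141, which is above U+00FF

# The '\xff' placeholder from the snippet above.
by_placeholder = sorted(names, key=lambda s: s if s else '\xff' * 9)

# Tuple key for comparison: the emptiness flag decides first.
by_tuple = sorted(names, key=lambda s: (s == '', s))

print(by_placeholder)  # ['bob', '', 'Łukasz']
print(by_tuple)        # ['bob', 'Łukasz', '']
```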
-2

Source: https://habr.com/ru/post/1397686/

