Schwartz sorting example in the section "Text Processing in Python"

I looked at "Text Processing in Python" and tried an example on Schwartzian sorting.

I used the following structure for sample data, which also contains empty rows. I sorted this data by the fifth column:
383230 -49 -78 1 100034 '06 text '9562' text '720' text '867
335067 -152 -18 3 100030 ' text '2400' text '2342' text '696
136592 21 230 3 100035 '03. text '10368' text '1838' text '977

Code used for Schwartz sorting:

for n in range(len(lines)):       # Create the transform
    lst = string.split(lines[n])
    if len(lst) >= 4:             # Tuple w/ sort info first
        lines[n] = (lst[4], lines[n])
    else:                         # Short lines to end
        lines[n] = (['\377'], lines[n])

lines.sort()    # Native sort

for n in range(len(lines)):       # Restore original lines
    lines[n] = lines[n][1]

open('tmp.schwartzian','w').writelines(lines)

, , , . if-else, . , , ( ), .

, ? , , ?

EDIT: '\ 377'. sort(), .

else:                         # Short lines to end
    lines[n] = (['\377'], lines[n])
print type(lines[n][0])
>>> (type 'list')

nosklo '\ 377' . !

, 2- , 0,95 0,09 , . !

0
5

, , .

, 4- . sort() . .

, 5 , else :

if len(lst) >= 4:             # Tuple w/ sort info first
    lines[n] = (lst[4], lines[n])
else:                         # Short lines to end
    lines[n] = (['\377'], lines[n])

['\377'] . , "\ 377" ( char ascii) , , 5- . .

, . , , , .

, :

sort_by_field(list_of_str, field_number, separator=' ', defaultvalue='\xFF')
    # decorates each value:
    for i, line in enumerate(list_of_str)):
        fields = line.split(separator)
        try:
             # places original line as second item:
            list_of_str[i] = (fields[field_number], line)
        except IndexError:
            list_of_str[i] = (defaultvalue, line)
    list_of_str.sort() # sorts list, in place
    # undecorates values:
    for i, group in enumerate(list_of_str))
        list_of_str[i] = group[1] # the second item is original line

, , .

+1

, , python ( 2,3 2.4, ) key sort() sorted(). :

def key_func(line):
    lst = string.split(line)
    if len(lst) >= 4:             
        return lst[4]
    else:                        
        return '\377'

lines.sort(key=key_func)
+2

if len(lst) >= 4:

['\ 377'] , , lst[4] (lst[0] ).

0

, , .

, "", (-). Nosklo wbg , , , , schwartzian , :

, .

, .

0

Although the Schwartz transform used is pretty outdated for Python, it's worth mentioning that you could write code in such a way as to avoid the possibility of a line with a line [4], starting with \377, sorting in the wrong place

for n in range(len(lines)):
    lst = lines[n].split()
    if len(lst)>4:
        lines[n] = ((0, lst[4]), lines[n])
    else:
        lines[n] = ((1,), lines[n])

Because tuples are compared differently, tuples starting with 1will be alwayssorted below.

Also note that the test should be len(list)>4instead>=

The same logic applies when using the modern equivalent AKA function key=

def key_func(line):
        lst = line.split()
        if len(lst)>4:
            return 0, lst[4]
        else:
            return 1,

lines.sort(key=key_func)
0
source

Source: https://habr.com/ru/post/1779391/


All Articles