Python: optimizing a nested for loop that appends to a list

I have two nested for loops that will mostly run over large data sets. I want to optimize them for maximum speed.

source = [['row1', 'row2', 'row3'], ['Product', 'Cost', 'Quantity'], ['Test17', '3216', '17'], ['Test18', '3217', '18'], ['Test19', '3218', '19'], ['Test20', '3219', '20']]

Creating the generator (this is the body of a generator function):

it = iter(source)
variables = ['row2', 'row3']
variables_indices = [1, 2]
getkey = rowgetter(*key_indices)
for row in it:
    k = getkey(row)
    for v, i in zip(variables, variables_indices):
        try:
            o = list(k)  # populate with key values initially
            o.append(v)  # add variable
            o.append(row[i]) # add value
            yield tuple(o)
        except IndexError:
            pass

import operator

def rowgetter(*indices):
    if len(indices) == 0:
        return lambda row: tuple()
    elif len(indices) == 1:
        # With only one index we cannot use itemgetter, because we want a
        # singleton sequence to be returned, but itemgetter with a single
        # argument returns the value itself, so we define a function instead.
        index = indices[0]
        return lambda row: (row[index],)
    else:
        return operator.itemgetter(*indices)

This yields the tuples I expect, but it is very slow: on average about 100 seconds for 100,000 rows (in the example, the source has 5 rows). Can someone help me reduce this time, please?

Note: I also tried inlining the loops and using a list comprehension that builds the whole result instead of yielding on every iteration.

+4
2 answers

Some improvements are noted below, but they do not alter the algorithmic complexity:

zipped = list(zip(variables, variables_indices))  # create once and reuse

for row in it:
    for v, i in zipped:  # unpack both the variable name and its column index
        try:
            yield (*getkey(row), v, row[i])  # avoid building list and tuple conversion
        except IndexError:
            pass
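
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea wrapped in a generator function. The wrapper name `melt_rows`, the choice of `[0]` as key indices, and the shortened sample data are my assumptions; the question does not define `key_indices`. The sketch also computes the key once per row rather than once per variable.

```python
import operator


def rowgetter(*indices):
    # Same helper as in the question: always returns a tuple of the selected fields.
    if len(indices) == 0:
        return lambda row: tuple()
    if len(indices) == 1:
        index = indices[0]
        return lambda row: (row[index],)
    return operator.itemgetter(*indices)


def melt_rows(source, key_indices, variables, variables_indices):
    # Hypothetical wrapper name; it is not part of the question's code.
    getkey = rowgetter(*key_indices)
    zipped = list(zip(variables, variables_indices))  # built once, reused for every row
    for row in source:
        k = getkey(row)  # key computed once per row
        for v, i in zipped:
            try:
                yield (*k, v, row[i])  # key fields, variable name, value
            except IndexError:
                pass  # rows that are too short are skipped, as in the question


# Shortened sample data (assumption); the last row is deliberately too short.
source = [
    ['row1', 'row2', 'row3'],
    ['Test17', '3216', '17'],
    ['Test18', '3217'],
]
result = list(melt_rows(source, [0], ['row2', 'row3'], [1, 2]))
```

As with the snippet above, this only trims constant overhead per row; the amount of work still grows with rows times variables.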
+2

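You are building a list out of k, appending two items to it, and then converting it to a tuple.

Instead, you can yield the elements of k followed by the two extra values, and build the tuple directly, for example: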

k = [1,2,3,4]

def make_tuple(k,a,b):
    def gen(k,a,b):
        yield from k
        yield a
        yield b
    return tuple(gen(k,a,b))

result = make_tuple(k,12,14)

Result:

(1, 2, 3, 4, 12, 14)
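
Another option, since `getkey` in the question already returns a tuple, is plain tuple concatenation. This is only a sketch: the name `make_tuple_concat` is hypothetical, and I have not timed it against your data, so measure it with `timeit` before relying on it.

```python
def make_tuple_concat(k, a, b):
    # Hypothetical alternative to make_tuple: convert k to a tuple
    # (cheap when k is already a tuple) and concatenate the two extra values.
    return tuple(k) + (a, b)


k = [1, 2, 3, 4]
result = make_tuple_concat(k, 12, 14)
print(result)  # (1, 2, 3, 4, 12, 14)
```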
+1

Source: https://habr.com/ru/post/1692200/

