Temporary variable in list comprehension

I often end up with a piece of code that looks like this:

raw_data = [(s.split(',')[0], s.split(',')[1]) for s in all_lines if s.split(',')[1] != '"NaN"'] 

Basically, I would like to know whether there is a way to create a temporary variable, say splitted_s, to avoid repeating the same operation on the loop item (in this case, s is split three times).

4 answers

If you need to reuse the result of an operation, you can nest another list comprehension:

 raw_data = [(lhs, rhs) for lhs, rhs in [s.split(',')[:2] for s in all_lines] if rhs != '"NaN"'] 

You can use a generator expression for the inner part instead (it also gives a slight performance gain):

 raw_data = [(lhs, rhs) for lhs, rhs in (s.split(',')[:2] for s in all_lines) if rhs != '"NaN"']

This will be even faster than your implementation:

 import timeit

 # string.ascii_letters is used instead of string.letters so the setup also runs on Python 3
 setup = '''import random, string; all_lines = [','.join((random.choice(string.ascii_letters), str(random.random() if random.random() > 0.3 else '"NaN"'))) for i in range(10000)]'''
 oneloop = '''[(s.split(',')[0], s.split(',')[1]) for s in all_lines if s.split(',')[1] != '"NaN"']'''
 twoloops = '''raw_data = [(lhs, rhs) for lhs, rhs in [s.split(',') for s in all_lines] if rhs != '"NaN"']'''

 timeit.timeit(oneloop, setup, number=1000)   # 7.77 secs
 timeit.timeit(twoloops, setup, number=1000)  # 4.68 secs
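
As a quick sanity check that the nested form returns the same list as the original, here is a minimal sketch. The sample lines are made up for illustration; the real all_lines would come from the asker's data.

```python
# Hypothetical sample input, not taken from the question.
all_lines = ['a,1.5', 'b,"NaN"', 'c,0.25']

# Original form: each line is split up to three times.
repeated = [(s.split(',')[0], s.split(',')[1])
            for s in all_lines if s.split(',')[1] != '"NaN"']

# Nested form: the inner generator splits each line exactly once.
nested = [(lhs, rhs)
          for lhs, rhs in (s.split(',')[:2] for s in all_lines)
          if rhs != '"NaN"']

print(nested)  # [('a', '1.5'), ('c', '0.25')]
```

Note that unpacking into lhs, rhs assumes every line has at least two comma-separated fields.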

You cannot.

A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it.

(From the Python tutorial's section on list comprehensions.)

Assignment in Python is not an expression.

As Padraic Cunningham writes, if you need to split the line several times, don't do it in a list comprehension.
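
A plain loop makes the temporary variable explicit. The following is only a sketch of that suggestion; the function name parse_lines and the sample input are hypothetical, not from the question.

```python
def parse_lines(all_lines):
    """Return (first, second) field pairs, skipping rows whose second field is '"NaN"'."""
    raw_data = []
    for s in all_lines:
        parts = s.split(',')  # split once and reuse the result below
        if parts[1] != '"NaN"':
            raw_data.append((parts[0], parts[1]))
    return raw_data


print(parse_lines(['a,1.5', 'b,"NaN"', 'c,0.25']))  # [('a', '1.5'), ('c', '0.25')]
```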


Starting with Python 3.8, which introduced assignment expressions (PEP 572, the := operator), you can use a local variable within a list comprehension to avoid evaluating the same expression twice.

In our case, we can bind the result of line.split(',') to a variable parts while using it to filter the list (parts[1] != 'NaN'), and then reuse parts to build the mapped value:

 # lines = ['1,2,3,4', '5,NaN,7,8']
 [(parts[0], parts[1]) for line in lines if (parts := line.split(','))[1] != 'NaN']
 # [('1', '2')]
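
Applied to the data format from the question, this might look like the sketch below. It requires Python 3.8 or later, and the sample lines are made up for illustration.

```python
# Requires Python 3.8+. Hypothetical sample input.
all_lines = ['a,1.5', 'b,"NaN"', 'c,0.25']

# The if clause runs before the element expression, so parts is
# already bound when the tuple is built.
raw_data = [(parts[0], parts[1])
            for s in all_lines
            if (parts := s.split(','))[1] != '"NaN"']

print(raw_data)  # [('a', '1.5'), ('c', '0.25')]
```

One thing to keep in mind: a name assigned with := inside a comprehension is bound in the enclosing scope, so parts remains visible after the comprehension finishes.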

Late edit

Now, one year wiser and actually understanding the question, I would suggest a simple

 raw_data = [tuple(ssp[:2]) for s in all_lines for ssp in [s.split(',')] if ssp[1]!='"NaN"'] 

which works correctly because [s.split(',')] is a list whose only element is the list returned by s.split(','), so the rightmost (inner) loop, for ssp in [s.split(',')], is roughly equivalent to a temporary assignment, ssp = s.split(',').
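
To make the equivalence concrete, here is a small sketch that writes the same comprehension out as explicit nested loops. The sample lines are hypothetical.

```python
all_lines = ['a,1.5', 'b,"NaN"', 'c,0.25']  # made-up sample input

# Comprehension with a one-element inner loop acting as an assignment.
raw_data = [tuple(ssp[:2]) for s in all_lines for ssp in [s.split(',')] if ssp[1] != '"NaN"']

# The same logic written out as nested loops.
expanded = []
for s in all_lines:
    for ssp in [s.split(',')]:  # runs exactly once per line
        if ssp[1] != '"NaN"':
            expanded.append(tuple(ssp[:2]))

print(raw_data == expanded)  # True
```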


My initial answer

I have some trouble understanding the question, but if you want to use a temporary variable in a list comprehension, put the value (or expression) that you need in a list, all alone, and iterate over it in an additional for clause:

 In [1]: [a*b for b in [10] for a in [1,2,3,4,5]]
 Out[1]: [10, 20, 30, 40, 50]

The rightmost for clause is the inner loop, so suppose you have a function that takes a long time to compute a temporary value needed in the list comprehension, e.g. the following ;-)

 In [2]: def long_computation(x): print(1); return x

Then the next two constructs return exactly the same list, but the first calls long_computation once while the second calls it once per element:

 In [3]: [a*b for b in [long_computation(10)] for a in [1,2,3,4,5]]
 1
 Out[3]: [10, 20, 30, 40, 50]

 In [4]: [a*b for a in [1,2,3,4,5] for b in [long_computation(10)]]
 1
 1
 1
 1
 1
 Out[4]: [10, 20, 30, 40, 50]
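
The difference in the number of calls can also be checked without reading printed output. This is a minimal sketch; the calls list is a helper introduced here for illustration and is not part of the original answer.

```python
calls = []  # hypothetical helper: records every call to long_computation


def long_computation(x):
    calls.append(x)  # record the call instead of printing
    return x


# The expensive call sits in the outer (leftmost) for clause: evaluated once.
outer = [a * b for b in [long_computation(10)] for a in [1, 2, 3, 4, 5]]
calls_outer = len(calls)

calls.clear()

# The expensive call sits in the inner (rightmost) for clause: evaluated per element.
inner = [a * b for a in [1, 2, 3, 4, 5] for b in [long_computation(10)]]
calls_inner = len(calls)

print(calls_outer, calls_inner)  # 1 5
```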

Source: https://habr.com/ru/post/985767/

