Understanding the Python List - Want to Avoid Re-Evaluating

I have a list that comes close to:

[f(x) for x in l if f(x)] 

Where l is a list and f (x) is an expensive function that returns a list.

I want to avoid evaluating f (x) twice for each non-empty filling f (x). Is there a way to keep my result in a list comprehension?

I could remove the final condition, generate the entire list and then crop it, but that seems wasteful.

Edit

Two main approaches have been proposed:

Internal understanding of the generator:

 [y for y in (f(x) for x in l) if y] 

or memoization.

I think the internal understanding of the generator is elegant for the problem, as indicated. Actually I simplified the question to make it clear I really want:

 [g(x, f(x)) for x in l if f(x)] 

For this more complex situation, I think memoization gives a cleaner end result.

+52
python list-comprehension code-readability
Apr 04 '13 at 13:32
source share
12 answers

The solution (the best if you have a repeating value of x) is to memoize the function f, that is, create a wrapper function that saves the argument by which the function is called, and saves it than returning it if the same value is given.

A really simple implementation is as follows:

 storage = {} def memoized(value): if value not in storage: storage[value] = f(value) return storage[value] [memoized(x) for x in l if memoized(x)] 

and then use this function in the list comprehension. This approach is valid under two conditions: one theoretical and one practical. The first is that the function f must be deterministic, i.e. It returns the same results with the same input, and the other - in that the object x can be used as a dictionary of keys. If the first of them is invalid, you must recalculate f each time by definition, and if the second fails, you can use slightly more reliable approaches.

You can find many memoization implementations on the net, and I think there is something in the new versions of python that is included in them.

On a side note, never use small L as a variable name, this is a bad habit, as it can be confused with i or 1 on some terminals.

EDIT:

as commented, a possible solution using generators (to avoid creating useless duplicate times) would be this expression:

 [g(x, fx) for x, fx in ((x,f(x)) for x in l) if fx] 

You need to weigh your choice, given the computational cost f, the number of duplicates in the original list, and the memory at your location. Remembering makes a compromise between cosmic speeds, which means that it keeps track of every result, keeping it, so if you have huge lists, it can become costly on the front of the memory.

+11
Apr 04 '13 at 13:40
source share
 [y for y in (f(x) for x in l) if y] 

I will do it.

+41
Apr 04 '13 at
source share

You must use the memoize decorator. Here is an interesting link .




Using memoization from a link and your 'code':

 def memoize(f): """ Memoization decorator for functions taking one or more arguments. """ class memodict(dict): def __init__(self, f): self.f = f def __call__(self, *args): return self[args] def __missing__(self, key): ret = self[key] = self.f(*key) return ret return memodict(f) @memoize def f(x): # your code [f(x) for x in l if f(x)] 
+11
Apr 04 '13 at
source share
 [y for y in [f(x) for x in l] if y] 

For your updated issue, this might be useful:

 [g(x,y) for x in l for y in [f(x)] if y] 
+9
Apr 04 '13 at 13:34 on
source share

Nope. There is no (pure) way to do this. There is nothing wrong with a good old-fashioned loop:

 output = [] for x in l: result = f(x) if result: output.append(result) 

If you find it difficult to read, you can always wrap it in a function.

+8
Apr 04 '13 at 13:34 on
source share

As the previous answers showed, you can use double understanding or use memoization. For issues with reasonable size, this is a matter of taste (and I agree that memoization looks cleaner since it hides optimization). But if you study a very large list, there is a huge difference: Memoization will store every calculated value and can quickly deflate your memory. A double understanding with a generator (round parsers, not square brackets) preserves only what you want to keep.

To get to your actual problem:

 [g(x, f(x)) for x in series if f(x)] 

To calculate the final value, you need both x and f(x) . No problem, pass them like this:

 [g(x, y) for (x, y) in ( (x, f(x)) for x in series ) if y ] 

Again: use a generator (round parens) rather than a list (square brackets). Otherwise, you will create an entire list before filtering the results. This is the list comprehension version:

 [g(x, y) for (x, y) in [ (x, f(x)) for x in series ] if y ] # DO NOT USE THIS 
+8
Apr 04
source share

You can use memoization . This is a method that is used to avoid performing the same calculation twice, saving somewhere the result for each calculated value. I saw that there is already an answer that uses memoization, but I would like to suggest a general implementation using python decorators:

 def memoize(func): def wrapper(*args): if args in wrapper.d: return wrapper.d[args] ret_val = func(*args) wrapper.d[args] = ret_val return ret_val wrapper.d = {} return wrapper @memoize def f(x): ... 

Now f is a memoized version of itself. With this implementation, you can memoize any function using the @memoize decorator.

+3
Apr 04 '13 at 13:44 on
source share

There were many answers to memoizing. The Python 3 standard library now has lru_cache , which is the last cache used. So you can:

 from functools import lru_cache @lru_cache() def f(x): # function body here 

Thus, your function will be called only once. You can also specify the size of lru_cache , the default is 128. The problem with the memoize decorators shown above is that the size of the lists can increase significantly.

+3
Sep 10 '14 at 3:56
source share

Use map() !!

 comp = [x for x in map(f, l) if x] 

f is a function f(X) , l is a list

map() will return the result f(X) for each x in the list.

+3
Sep 15 '17 at 19:26
source share

Here is my solution:

 filter(None, [f(x) for x in l]) 
+1
Apr 04 '13 at 14:17
source share

Starting with Python 3.8 and introducing assignment expressions (PEP 572) ( := operator), you can use a local variable within the list comprehension to avoid calling the same function twice:

In our case, we can name the estimate f(x) as the variable y , using the result of the expression to filter the list, but also as the displayed value:

 [y for x in l if (y := f(x))] 
0
Apr 27 '19 at 15:26
source share

How about determining:

 def truths(L): """Return the elements of L that test true""" return [x for x in L if x] 

For example,

 > [wife.children for wife in henry8.wives] [[Mary1], [Elizabeth1], [Edward6], [], [], []] > truths(wife.children for wife in henry8.wives) [[Mary1], [Elizabeth1], [Edward6]] 
-2
Apr 04 '13 at 19:22
source share



All Articles