Python lazy variables? or, deferred costly computing

Question

Python lazy variables? or, deferred costly computing

I have a set of arrays that are very large and expensive to compute, and not all of my code will necessarily need it at any start. I would like to make their declaration optional, but ideally without having to rewrite all the code.

An example of how this is done:

x = function_that_generates_huge_array_slowly(0) y = function_that_generates_huge_array_slowly(1)

An example of what I would like to do:

 x = lambda: function_that_generates_huge_array_slowly(0) y = lambda: function_that_generates_huge_array_slowly(1) z = x * 5 # this doesn't work because lambda is a function # is there something that would make this line behave like # z = x() * 5? g = x * 6

When using lambda, as indicated above, one of the desired effects is achieved - the calculation of the array is delayed until it is needed. If you use the variable "x" more than once, it needs to be calculated every time. I would like to calculate it only once.

EDIT: After some additional searching, it seems like you can do what I want (approximately) with the “lazy” attributes in the class (like http://code.activestate.com/recipes/131495-lazy-attributes/ ). I don’t think there is a way to do something like this without making a separate class?

EDIT2: I'm trying to implement some of the solutions, but I ran into a problem because I don't understand the difference between:

 class sample(object): def __init__(self): class one(object): def __get__(self, obj, type=None): print "computing ..." obj.one = 1 return 1 self.one = one()

and

 class sample(object): class one(object): def __get__(self, obj, type=None): print "computing ... " obj.one = 1 return 1 one = one()

I think that some changes in them are what I am looking for, since expensive variables are meant to be part of the class.

+6

python numpy

keflavich Aug 22 '11 at 18:32

source share

3 answers

Writing a class is more reliable, but optimizing for simplicity (which I think you're asking for), I came up with the following solution:

 cache = {} def expensive_calc(factor): print 'calculating...' return [1, 2, 3] * factor def lookup(name): return ( cache[name] if name in cache else cache.setdefault(name, expensive_calc(2)) ) print 'run one' print lookup('x') * 2 print 'run two' print lookup('x') * 2

+6

Gringo suave Aug 22 '11 at 19:16

source share

You cannot make a simple name like x to evaluate lazily. A name is just an entry in a hash table (for example, that returns locals() or globals() ). If you do not apply access methods to these system tables, you cannot attach the execution of your code to a simple name resolution.

But you can wrap functions in caching shells in different ways. This is the OO way:

 class CachedSlowCalculation(object): cache = {} # our results def __init__(self, func): self.func = func def __call__(self, param): already_known = self.cache.get(param, None) if already_known: return already_known value = self.func(param) self.cache[param] = value return value calc = CachedSlowCalculation(function_that_generates_huge_array_slowly) z = calc(1) + calc(1)**2 # only calculates things once

This is a classless way:

 def cached(func): func.__cache = {} # we can attach attrs to objects, functions are objects def wrapped(param): cache = func.__cache already_known = cache.get(param, None) if already_known: return already_known value = func(param) cache[param] = value return value return wrapped @cached def f(x): print "I'm being called with %r" % x return x + 1 z = f(9) + f(9)**2 # see f called only once

In the real world, you will add some logic to keep a cache of a reasonable size, possibly using the LRU algorithm.

+1

9000 Aug 22 '11 at 20:31

source share

agf · Accepted Answer · 2011-08-22T18:47:13+0000

The first half of your problem (reusing a value) is easily solved:

 class LazyWrapper(object): def __init__(self, func): self.func = func self.value = None def __call__(self): if self.value is None: self.value = self.func() return self.value lazy_wrapper = LazyWrapper(lambda: function_that_generates_huge_array_slowly(0))

but you still have to use it as lazy_wrapper() not lasy_wrapper .

If you intend to access multiple variables multiple times, this might be faster to use:

 class LazyWrapper(object): def __init__(self, func): self.func = func def __call__(self): try: return self.value except AttributeError: self.value = self.func() return self.value

which will make the first call slower and subsequent use faster.

Edit: I see that you have found a similar solution that requires the use of attributes in a class. In any case, you need to rewrite each access to lazy variables, so just choose what you like.

Edit 2: You can also do:

 class YourClass(object) def __init__(self, func): self.func = func @property def x(self): try: return self.value except AttributeError: self.value = self.func() return self.value

if you want to access x as an instance attribute. No extra class required. If you do not want to change the class signature (this requires func , you can hard-code the function call in the property.

Python lazy variables? or, deferred costly computing

More articles: