Can I restore a function whose closure contains cycles in Python?

I am trying to serialize Python functions (code + closure) and restore them later. I am using the code at the bottom of this post.

This code is quite flexible. It can serialize and deserialize inner functions, as well as functions that are closures and need to be restored together with their context:

    def f1(arg):
        def f2():
            print arg

        def f3():
            print arg
            f2()

        return f3

    x = SerialiseFunction(f1(stuff))  # a string
    save(x)  # save it somewhere

    # later, possibly in a different process

    x = load()  # get it from somewhere
    newf2 = DeserialiseFunction(x)
    newf2()  # prints value of "stuff" twice

These calls work even if there are functions in your function's closure, functions in their closures, and so on (we have a graph of closures, where closures contain functions whose closures contain more functions, etc.).

However, it turns out that these graphs may contain cycles:

    def g1():
        def g2():
            g2()
        return g2

    g = g1()

If I look at g2's closure (via g), I can see g2 itself in it:

    >>> g
    <function g2 at 0x952033c>
    >>> g.func_closure[0].cell_contents
    <function g2 at 0x952033c>
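The same check can be written as a small script. This is only a sketch: it uses `__closure__` and `__code__`, which are the spellings of `func_closure` and `func_code` that work on both Python 2.6+ and Python 3, and the names `is_cycle` and `free_names` are made up for the illustration.

```python
def g1():
    def g2():
        g2()  # never actually called here; only the self-reference matters
    return g2

g = g1()

# g2 refers to itself by name, so the name "g2" is a free variable of g2
# and the function object ends up inside its own closure cell.
cell = g.__closure__[0]
is_cycle = cell.cell_contents is g
free_names = g.__code__.co_freevars
```

Here `is_cycle` is true, and `free_names` lists the single free variable that creates the cycle.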

This causes a serious problem when I try to deserialize the function, because everything involved is immutable. What I need to do is create the function newg2:

 newg2 = types.FunctionType(g2code, globals, closure=newg2closure) 

where newg2closure is created as follows:

 newg2closure = (make_cell(newg2),) 

which, of course, is impossible; each line of code depends on the other. Cells are immutable, tuples are immutable, function types are immutable.
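For comparison, here is a sketch of the non-cyclic case, where the two steps do not depend on each other and the construction goes through. It reuses the `make_cell` trick from the code at the bottom of this post (written with `__closure__` so it also runs on Python 3); `make_adder`, `original` and `rebuilt` are names invented for the example.

```python
import types

def make_cell(value):
    # Create a closure cell holding `value` by capturing it in a throwaway lambda.
    return (lambda x: lambda: x)(value).__closure__[0]

def make_adder(n):
    def add(x):
        return x + n
    return add

original = make_adder(5)

# The closed-over value (5) exists before the function is built, so the
# cell can be created first and handed to FunctionType afterwards.
rebuilt = types.FunctionType(
    original.__code__,
    globals(),
    closure=(make_cell(5),),
)

result = rebuilt(10)
```

With a cycle, the value to put in the cell would be `rebuilt` itself, which does not exist yet at the point where the cell has to be created.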

So I'm trying to figure out whether there is a way to create newg2 as above. Is there any way to create a function object that is referenced in its own closure graph?

I am using Python 2.7 (I'm on App Engine, so I can't upgrade to Python 3).


For reference, my serialization functions:

    import logging
    import marshal
    import sys
    import types

    def SerialiseFunction(aFunction):
        if not aFunction or not isinstance(aFunction, types.FunctionType):
            raise Exception("First argument required, must be a function")

        def MarshalClosureValues(aClosure):
            # Recursively marshal the contents of each closure cell.
            logging.debug(repr(aClosure))
            lmarshalledClosureValues = []
            if aClosure:
                lclosureValues = [lcell.cell_contents for lcell in aClosure]
                lmarshalledClosureValues = [
                    # Functions become [code, closure]; plain values become [value].
                    [marshal.dumps(litem.func_code), MarshalClosureValues(litem.func_closure)]
                    if hasattr(litem, "func_code")
                    else [marshal.dumps(litem)]
                    for litem in lclosureValues
                ]
            return lmarshalledClosureValues

        lmarshalledFunc = marshal.dumps(aFunction.func_code)
        lmarshalledClosureValues = MarshalClosureValues(aFunction.func_closure)
        lmoduleName = aFunction.__module__

        lcombined = (lmarshalledFunc, lmarshalledClosureValues, lmoduleName)
        retval = marshal.dumps(lcombined)
        return retval

    def DeserialiseFunction(aSerialisedFunction):
        lmarshalledFunc, lmarshalledClosureValues, lmoduleName = marshal.loads(aSerialisedFunction)
        lglobals = sys.modules[lmoduleName].__dict__

        def make_cell(value):
            # Create a closure cell holding value.
            return (lambda x: lambda: x)(value).func_closure[0]

        def UnmarshalClosureValues(aMarshalledClosureValues):
            lclosure = None
            if aMarshalledClosureValues:
                lclosureValues = [
                    # Length 1 is a plain value; length 2 is [code, closure].
                    marshal.loads(item[0]) if len(item) == 1
                    else types.FunctionType(marshal.loads(item[0]), lglobals, closure=UnmarshalClosureValues(item[1]))
                    for item in aMarshalledClosureValues
                    if len(item) >= 1 and len(item) <= 2
                ]
                lclosure = tuple([make_cell(lvalue) for lvalue in lclosureValues])
            return lclosure

        lfunctionCode = marshal.loads(lmarshalledFunc)
        lclosure = UnmarshalClosureValues(lmarshalledClosureValues)
        lfunction = types.FunctionType(lfunctionCode, lglobals, closure=lclosure)
        return lfunction
1 answer

Here is the method I worked out.

You cannot modify these immutable objects, but what you can do is put proxy functions in place of the circular references and have them look up the real function in a global dictionary.

1: When serializing, keep track of all the functions you have seen. If you see the same one again, do not serialize it again; serialize a sentinel value instead.

I used a set:

 lfunctionHashes = set() 

and for each item being serialized, check whether it is already in the set. If so, emit the sentinel; otherwise add it to the set and marshal it properly:

    lhash = hash(litem)
    if lhash in lfunctionHashes:
        lmarshalledClosureValues.append([lhash, None])
    else:
        lfunctionHashes.add(lhash)
        lmarshalledClosureValues.append([lhash, marshal.dumps(litem.func_code), MarshalClosureValues(litem.func_closure, lfullIndex), litem.__module__])
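To see what step 1 produces on the cyclic example from the question, here is a rough standalone sketch. It is not the serializer itself: it keys on `id()` instead of `hash()`, emits plain tuples instead of marshalled data, and `describe_closure` plus the `"func"`, `"ref"` and `"value"` tags are names invented for the illustration.

```python
import types

def describe_closure(func, seen=None):
    """Walk a closure graph, emitting a ("ref", key) sentinel for repeats."""
    if seen is None:
        seen = set()
    entries = []
    for cell in (func.__closure__ or ()):
        item = cell.cell_contents
        if isinstance(item, types.FunctionType):
            key = id(item)
            if key in seen:
                # Already described once: emit a sentinel instead of recursing.
                entries.append(("ref", key))
            else:
                seen.add(key)
                entries.append(("func", key, item.__name__, describe_closure(item, seen)))
        else:
            entries.append(("value", item))
    return entries

def g1():
    def g2():
        g2()
    return g2

g = g1()
walk = describe_closure(g)
```

The walk terminates because the second time g2 is encountered (inside its own closure) only the sentinel is recorded.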

2: When deserializing, keep a global dict mapping function hash to function:

 gfunctions = {} 

During deserialization, every time you deserialize a function, add it to gfunctions. Here, item is (hash, code, closure values, module name):

    lfunction = types.FunctionType(marshal.loads(item[1]), globals, closure=UnmarshalClosureValues(item[2]))
    gfunctions[item[0]] = lfunction

And when you come across the sentinel value for a function, use a proxy instead, passing it the hash of the function:

 lfunction = make_proxy(item[0]) 

Here is the proxy. It looks up the real function by its hash:

    def make_proxy(f_hash):
        def f_proxy(*args, **kwargs):
            global gfunctions
            # Look up the real function at call time, not at creation time.
            f = gfunctions[f_hash]
            return f(*args, **kwargs)
        return f_proxy
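A minimal end-to-end sketch of the proxy idea, assuming a recursive inner function like the one in the question. It is written with `__code__` and `__closure__` so it also runs on Python 3; the `restored` dict, the `"fact"` key and `make_factorial` are invented for the example and stand in for the hash-keyed dictionary described above.

```python
import marshal
import types

restored = {}  # hypothetical registry, standing in for the global dict

def make_cell(value):
    # Create a closure cell holding `value`.
    return (lambda x: lambda: x)(value).__closure__[0]

def make_proxy(key):
    def proxy(*args, **kwargs):
        # The lookup happens at call time, after the real function is registered.
        return restored[key](*args, **kwargs)
    return proxy

def make_factorial():
    def fact(n):
        return 1 if n <= 1 else n * fact(n - 1)  # fact is in its own closure
    return fact

fact = make_factorial()
payload = marshal.dumps(fact.__code__)  # only the code object is serialized

# Rebuild: the cell that used to hold fact itself now holds a proxy.
new_fact = types.FunctionType(
    marshal.loads(payload),
    globals(),
    closure=(make_cell(make_proxy("fact")),),
)
restored["fact"] = new_fact

result = new_fact(5)
```

The cycle is broken at construction time and re-established at call time: each recursive call goes through the proxy, which finds `new_fact` in the registry.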

I also had to make a few more changes:

  • I used pickle instead of marshal in some places; I may look into that further.
  • I include the module name in the serialization, along with the code and closure, so that I can look up the correct globals for the function when deserializing.
  • In deserialization, the length of the item tells you what you are deserializing: 1 for a simple value, 2 for a function that needs proxying, 4 for a fully serialized function.

Here is the complete new code.

    import marshal
    import pickle
    import sys
    import types

    lfunctions = {}

    def DeserialiseFunction(aSerialisedFunction):
        lmarshalledFunc, lmarshalledClosureValues, lmoduleName = pickle.loads(aSerialisedFunction)
        lglobals = sys.modules[lmoduleName].__dict__
        lglobals["lfunctions"] = lfunctions

        def make_proxy(f_hash):
            def f_proxy(*args, **kwargs):
                global lfunctions
                # Look up the real function at call time.
                f = lfunctions[f_hash]
                return f(*args, **kwargs)
            return f_proxy

        def make_cell(value):
            return (lambda x: lambda: x)(value).func_closure[0]

        def UnmarshalClosureValues(aMarshalledClosureValues):
            global lfunctions
            lclosure = None
            if aMarshalledClosureValues:
                lclosureValues = []
                for item in aMarshalledClosureValues:
                    ltype = len(item)
                    if ltype == 1:
                        # Plain pickled value.
                        lclosureValues.append(pickle.loads(item[0]))
                    elif ltype == 2:
                        # Sentinel: a function already seen, so use a proxy.
                        lfunction = make_proxy(item[0])
                        lclosureValues.append(lfunction)
                    elif ltype == 4:
                        # Fully serialized function: (hash, code, closure, module).
                        lfuncglobals = sys.modules[item[3]].__dict__
                        lfuncglobals["lfunctions"] = lfunctions
                        lfunction = types.FunctionType(marshal.loads(item[1]), lfuncglobals, closure=UnmarshalClosureValues(item[2]))
                        lfunctions[item[0]] = lfunction
                        lclosureValues.append(lfunction)
                lclosure = tuple([make_cell(lvalue) for lvalue in lclosureValues])
            return lclosure

        lfunctionCode = marshal.loads(lmarshalledFunc)
        lclosure = UnmarshalClosureValues(lmarshalledClosureValues)
        lfunction = types.FunctionType(lfunctionCode, lglobals, closure=lclosure)
        return lfunction

    def SerialiseFunction(aFunction):
        if not aFunction or not hasattr(aFunction, "func_code"):
            raise Exception("First argument required, must be a function")

        lfunctionHashes = set()

        def MarshalClosureValues(aClosure, aParentIndices = []):
            lmarshalledClosureValues = []
            if aClosure:
                lclosureValues = [lcell.cell_contents for lcell in aClosure]
                lmarshalledClosureValues = []
                for index, litem in enumerate(lclosureValues):
                    lfullIndex = list(aParentIndices)
                    lfullIndex.append(index)

                    if isinstance(litem, types.FunctionType):
                        lhash = hash(litem)
                        if lhash in lfunctionHashes:
                            # Seen before: emit the sentinel instead of recursing.
                            lmarshalledClosureValues.append([lhash, None])
                        else:
                            lfunctionHashes.add(lhash)
                            lmarshalledClosureValues.append([lhash, marshal.dumps(litem.func_code), MarshalClosureValues(litem.func_closure, lfullIndex), litem.__module__])
                    else:
                        lmarshalledClosureValues.append([pickle.dumps(litem)])
            return lmarshalledClosureValues

        lmarshalledFunc = marshal.dumps(aFunction.func_code)
        lmarshalledClosureValues = MarshalClosureValues(aFunction.func_closure)
        lmoduleName = aFunction.__module__

        lcombined = (lmarshalledFunc, lmarshalledClosureValues, lmoduleName)
        retval = pickle.dumps(lcombined)
        return retval

Source: https://habr.com/ru/post/976402/

