Closures can be implemented in several ways. One of them is to actually capture environments ... in other words, consider an example:
    def foo(x):
        y = 1
        z = 2
        def bar(a):
            return (x, y, a)
        return bar
The env-capture solution works as follows: when foo is called, a local frame is created containing the names x, y, z and bar. The name x is bound to the parameter, the names y and z to 1 and 2, and the name bar to the closure. The closure assigned to bar captures the entire parent frame, so when it is called it can look up the name a in its own local frame, and x and y in the captured parent frame.
With this approach (which is not the approach used by Python), the variable z remains alive as long as the closure does, even though the closure never refers to it.
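A minimal sketch of this strategy, modeled in Python with a plain dict standing in for the parent frame (the names EnvClosure, parent_env and bar_body are mine, purely for illustration):

    # Hypothetical model of env-capture: the closure keeps a reference to
    # the WHOLE parent frame (a dict here), so every name it does not
    # define locally is looked up in that captured frame.

    class EnvClosure:
        def __init__(self, parent_env, params, body):
            self.parent_env = parent_env   # entire frame captured, z included
            self.params = params
            self.body = body

        def __call__(self, *args):
            local_env = dict(zip(self.params, args))
            return self.body(local_env, self.parent_env)

    def foo(x):
        env = {'x': x, 'y': 1, 'z': 2}     # the whole local frame of foo

        def bar_body(local_env, parent_env):
            # a is found locally; x and y in the captured parent frame
            return (parent_env['x'], parent_env['y'], local_env['a'])

        env['bar'] = EnvClosure(env, ['a'], bar_body)
        return env['bar']

    print(foo(10)(3))    # (10, 1, 3); env, and therefore z, stays alive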
Another option, a little more difficult to implement, is instead:
- during compilation the code is parsed, and the closure assigned to bar is detected to read the names x and y from the enclosing scope
- these two variables are therefore classified as “cells”, and they are allocated separately from the local frame
- the closure stores the addresses of these cells, and each access to them goes through a double indirection (the cell is a pointer to where the value is actually stored).
With this approach you pay a little extra time when the closure is created, because each captured cell must be copied individually into the closure object (instead of simply copying a pointer to the parent frame), but it has the advantage of not capturing the entire frame: for example, z will not survive the return of foo; only x and y will.
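A minimal sketch of the cell strategy, again as illustrative Python (the Cell and BarClosure classes and the explicit wrapping are mine; CPython does all of this internally):

    # Hypothetical model of the cell strategy: only the variables the
    # inner function actually reads are boxed in one-slot "cells"; the
    # closure copies references to those cells, never the whole frame.

    class Cell:
        def __init__(self, value):
            self.value = value       # the extra level of indirection

    class BarClosure:
        def __init__(self, x_cell, y_cell):
            self.x_cell = x_cell     # cell references copied one by one
            self.y_cell = y_cell
        def __call__(self, a):
            # every access pays the double indirection through the cell
            return (self.x_cell.value, self.y_cell.value, a)

    def foo(x):
        x_cell = Cell(x)             # parameter wrapped in a cell up front
        y_cell = Cell(1)             # y is captured, so it lives in a cell
        z = 2                        # not captured: plain local, dies with foo
        return BarClosure(x_cell, y_cell)

    print(foo(10)(3))                # (10, 1, 3); z was never captured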
This is what Python does ... basically at compile time, when a nested function (a named def or a lambda) is detected, a sub-compilation is performed, and whenever a name lookup resolves to a variable of a parent function, that variable is marked as a cell.
One small annoyance is that when the captured variable is a parameter (as x is in the foo example), the function prologue needs an extra copy operation to wrap the passed value in a cell. In Python this copy does not show up in the bytecode; it is performed directly by the function-call machinery.
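You can observe this double bookkeeping on the foo defined above: x occupies a normal argument slot but is also listed among the cells that the prologue fills in:

    >>> foo.__code__.co_varnames    # x arrives as a plain argument...
    ('x', 'z', 'bar')
    >>> foo.__code__.co_cellvars    # ...but is also a cell, filled in the prologue
    ('x', 'y')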
Another annoyance is that every access to a captured variable goes through the double indirection, even in the parent function.
The advantage is that closures capture only the variables they actually reference, and a nested function that captures nothing compiles to code as efficient as that of a regular function.
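For instance, a nested function that captures nothing is plain LOAD_FAST/LOAD_CONST code with no cells involved (baz and qux are my own trivial example; the exact dump varies between CPython versions):

    >>> def baz():
    ...     w = 5
    ...     def qux(a):
    ...         return a + 1
    ...     return qux
    ...
    >>> dis.dis(baz())
      4           0 LOAD_FAST                0 (a)
                  3 LOAD_CONST               1 (1)
                  6 BINARY_ADD
                  7 RETURN_VALUE
    >>> baz().__closure__ is None    # no cells were copied
    True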
To find out how this works in Python, you can use the dis module to check the generated bytecode:
    >>> dis.dis(foo)
      2           0 LOAD_CONST               1 (1)
                  3 STORE_DEREF              1 (y)

      3           6 LOAD_CONST               2 (2)
                  9 STORE_FAST               1 (z)

      4          12 LOAD_CLOSURE             0 (x)
                 15 LOAD_CLOSURE             1 (y)
                 18 BUILD_TUPLE              2
                 21 LOAD_CONST               3 (<code object bar at 0x7f6ff6582270, file "<stdin>", line 4>)
                 24 LOAD_CONST               4 ('foo.<locals>.bar')
                 27 MAKE_CLOSURE             0
                 30 STORE_FAST               2 (bar)

      6          33 LOAD_FAST                2 (bar)
                 36 RETURN_VALUE
    >>>
As you can see, the generated code stores 1 into y using STORE_DEREF (an operation that writes through the cell, with the double indirection) while it stores 2 into z using STORE_FAST (z is not captured and is just a plain local of the current frame). By the time the body of foo starts executing, x has already been wrapped in a cell by the call machinery.
bar itself is just a local variable, so STORE_FAST is used to write to it; but to create the closure, the cells for x and y must be copied individually (they are collected into a tuple before the MAKE_CLOSURE operation).
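That tuple of cells ends up attached to the returned function object, where it can be inspected directly:

    >>> bar = foo(12)
    >>> bar.__code__.co_freevars                     # the names bar captures
    ('x', 'y')
    >>> [c.cell_contents for c in bar.__closure__]   # the cells copied by MAKE_CLOSURE
    [12, 1]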
The code of the closure itself can be seen with:
    >>> dis.dis(foo(12))
      5           0 LOAD_DEREF               0 (x)
                  3 LOAD_DEREF               1 (y)
                  6 LOAD_FAST                0 (a)
                  9 BUILD_TUPLE              3
                 12 RETURN_VALUE
and you can see that inside the returned closure, x and y are accessed with LOAD_DEREF. No matter how many levels “up” in the hierarchy of nested functions the variable is defined, access is still just a single double indirection, because the price is paid once, when the closure is constructed. Captured variables are only slightly slower to access (by a constant factor) than locals ... no “scope chain” has to be traversed at run time.
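A quick way to observe that constant factor (a rough sketch; the absolute numbers depend entirely on your machine and Python version):

    import timeit

    def make_local():
        def f():
            v = 1
            return v                 # v accessed with LOAD_FAST
        return f

    def make_deref():
        v = 1
        def f():
            return v                 # v accessed with LOAD_DEREF
        return f

    print(timeit.timeit(make_local()))   # plain local access
    print(timeit.timeit(make_deref()))   # cell access: slightly slower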
Even more sophisticated compilers, such as SBCL (an optimizing Common Lisp compiler that generates native code), also perform “escape analysis” to determine whether the closure can actually outlive the enclosing function. When it cannot (that is, when bar is only used inside foo and is never stored or returned), the cells can be allocated on the stack instead of the heap, reducing consing (the allocation of heap objects that will later require garbage collection).
This distinction is known in the literature as the “downward / upward funarg” problem; i.e., whether the captured variables are visible only at lower levels (in the closure itself, or in deeper closures created inside it) or also at upper levels (i.e., whether my caller can access my captured locals).
Solving the upward funarg problem in general requires a garbage collector, and that is why C++ closures do not provide this capability.