Edit
See my complete answer at the bottom of this question.
tl;dr answer: Python has statically nested scopes. The static aspect can interact with implicit variable declarations, yielding non-obvious results.
(This can be especially surprising because the language is generally dynamic in nature.)
I thought I had a good handle on Python's scoping rules, but this problem has me thoroughly stumped, and my google-fu has failed me (not that I'm surprised: look at the question title ;)
I'm going to start with a few examples that work as expected, but feel free to skip to Example 4 for the juicy part.
Example 1
    >>> x = 3
    >>> class MyClass(object):
    ...     x = x
    ...
    >>> MyClass.x
    3
Simple enough: during class definition, we can access variables defined in the outer (in this case global) scope.
Example 2
    >>> def mymethod(self):
    ...     return self.x
    ...
    >>> x = 3
    >>> class MyClass(object):
    ...     x = x
    ...     mymethod = mymethod
    ...
    >>> MyClass().mymethod()
    3
Again (ignoring for the moment why one might want to do this), there is nothing unexpected here: we can access functions in the outer scope.
Note: as Frederick noted below, this function does not seem to work. See Example 5 (and beyond) instead.
Example 3
    >>> def myfunc():
    ...     x = 3
    ...     class MyClass(object):
    ...         x = x
    ...     return MyClass
    ...
    >>> myfunc().x
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in myfunc
      File "<stdin>", line 4, in MyClass
    NameError: name 'x' is not defined
This is essentially the same as Example 1: we access the outer scope from within the class definition, only this time the scope is not global, thanks to myfunc().
Edit 5: As @user3022222 pointed out below, I botched this example in my original post. I believe this fails because only functions (and not other code blocks, such as this class definition) can access variables in the enclosing scope. For non-function code blocks, only local, global, and built-in variables are accessible. A more detailed explanation is available in this question.
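The lookup order described in Edit 5 can be sketched in Python 3 (the examples in this post are Python 2 sessions). This is only a minimal illustration; the function names `rebinds_in_class` and `reads_in_class` are made up for this sketch. It suggests that the deciding factor is whether the class body itself binds the name: when it does, the lookup goes to the class namespace, then globals, then built-ins, and the enclosing function's variable is skipped.

```python
x = "global x"  # module-level binding, assumed here so the first lookup succeeds

def rebinds_in_class():
    x = "enclosing x"
    class MyClass(object):
        # x is bound in the class body, so the right-hand side is looked up
        # in the class namespace, then globals, then builtins.
        x = x
    return MyClass

def reads_in_class():
    x = "enclosing x"
    class MyClass(object):
        # x is not bound in the class body, so the enclosing function's x is used.
        y = x
    return MyClass

print(rebinds_in_class().x)  # global x
print(reads_in_class().y)    # enclosing x
```

With no module-level `x` at all, the first variant raises the same NameError as Example 3.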
Another:
Example 4
    >>> def my_defining_func():
    ...     def mymethod(self):
    ...         return self.y
    ...     class MyClass(object):
    ...         mymethod = mymethod
    ...         y = 3
    ...     return MyClass
    ...
    >>> my_defining_func()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 4, in my_defining_func
      File "<stdin>", line 5, in MyClass
    NameError: name 'mymethod' is not defined
Um... excuse me?
What makes this any different from Example 2?
I am completely befuddled. Please sort me out. Thanks!
P.S. In case this isn't just a problem with my understanding, I've tried this on Python 2.5.2 and Python 2.6.2. Unfortunately those are all I have access to at the moment, but they both exhibit the same behavior.
Edit
According to http://docs.python.org/tutorial/classes.html#python-scopes-and-namespaces: at any time during execution, there are at least three nested scopes whose namespaces are directly accessible:
- the innermost scope, which is searched first, contains the local names
- the scopes of any enclosing functions, which are searched starting with the nearest enclosing scope, contain non-local, but also non-global names
- the next-to-last scope contains the current module's global names
- the outermost scope (searched last) is the namespace containing built-in names
Example #4 seems to be a counter-example to the second of these.
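The search order in that list can be sketched with a small Python 3 script. This is an illustration only; the function names are invented for this sketch, and each function finds its name in a different one of the four scopes.

```python
import builtins

x = "global"

def outer():
    x = "enclosing"

    def reads_local():
        x = "local"
        return x  # found in the innermost (local) scope

    def reads_enclosing():
        return x  # no local binding, so the nearest enclosing function's x is used

    return reads_local(), reads_enclosing()

def reads_global():
    return x  # no local or enclosing binding, so the module's x is used

def reads_builtin():
    return len  # not bound in this module, so it is found last, in builtins

print(outer())                           # ('local', 'enclosing')
print(reads_global())                    # global
print(reads_builtin() is builtins.len)   # True
```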
Edit 2
Example 5
    >>> def fun1():
    ...     x = 3
    ...     def fun2():
    ...         print x
    ...     return fun2
    ...
    >>> fun1()()
    3
Edit 3
As Frederick pointed out, assigning to a variable with the same name as one in the outer scope seems to "mask" the outer variable, preventing the assignment from working.
So this modified version of Example 4 works:
    def my_defining_func():
        def mymethod_outer(self):
            return self.y
        class MyClass(object):
            mymethod = mymethod_outer
            y = 3
        return MyClass

    my_defining_func()
However, this does not:
    def my_defining_func():
        def mymethod(self):
            return self.y
        class MyClass(object):
            mymethod_temp = mymethod
            mymethod = mymethod_temp
            y = 3
        return MyClass

    my_defining_func()
I still don't fully understand why this masking occurs: shouldn't the name binding happen when the assignment is executed?
This example at least provides some hint (and a more useful error message):
    >>> def my_defining_func():
    ...     x = 3
    ...     def my_inner_func():
    ...         x = x
    ...         return x
    ...     return my_inner_func
    ...
    >>> my_defining_func()()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 4, in my_inner_func
    UnboundLocalError: local variable 'x' referenced before assignment
    >>> my_defining_func()
    <function my_inner_func at 0xb755e6f4>
So it appears that the local variable is defined when the function is created (which succeeds). As a result, the local name is "reserved" and thus masks the outer-scope name when the function is called.
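One way to check this "reservation" is to inspect the compiled function's code object. The following Python 3 sketch reuses the function from the session above; `co_varnames` and `co_freevars` are standard code-object attributes, and the variable names `inner` and `raised` are just for illustration.

```python
def my_defining_func():
    x = 3
    def my_inner_func():
        x = x  # the assignment makes x local to my_inner_func at compile time
        return x
    return my_inner_func

inner = my_defining_func()  # creating the function succeeds

# x is recorded as a local name, not as a free variable from the enclosing scope.
print(inner.__code__.co_varnames)  # ('x',)
print(inner.__code__.co_freevars)  # ()

raised = False
try:
    inner()
except UnboundLocalError as exc:
    raised = True
    print("UnboundLocalError:", exc)
```

The name is classified before any code in the function runs, which is why the error only shows up at call time.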
Interesting.
Thanks to Frederick for the answer(s)!
For reference, from the Python docs:
It is important to realize that scopes are determined textually: the global scope of a function defined in a module is that module's namespace, no matter from where or by what alias the function is called. On the other hand, the actual search for names is done dynamically, at run time - however, the language definition is evolving towards static name resolution, at "compile" time, so don't rely on dynamic name resolution! (In fact, local variables are already determined statically.)
Edit 4
Real answer
This seemingly confusing behavior is caused by Python's statically nested scopes, as defined in PEP 227. It actually has nothing to do with PEP 3104.
From PEP 227:
The name resolution rules are typical for statically scoped languages [...] [except] variables are not declared. If a name binding operation occurs anywhere in a function, then that name is treated as local to the function and all references refer to the local binding. If a reference occurs before the name is bound, a NameError is raised.
[...]
An example from Tim Peters demonstrates the potential pitfalls of nested scopes in the absence of declarations:
    i = 6
    def f(x):
        def g():
            print i
The call to g() will refer to the variable i bound in f() by the for loop. If g() is called before the loop is executed, a NameError will be raised.
Let's run two simpler versions of Tim's example:
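The behavior the PEP describes can be sketched in Python 3 as follows. This is a reconstruction based on the description above, not the PEP's own listing; the `results` list and the try/except are additions so that both cases can be observed in a single call.

```python
i = 6  # module-level binding that g() never sees

def f(x):
    def g():
        return i  # resolves to f's i, because f binds i in the loop below

    results = []
    try:
        g()  # called before the loop: f's i exists statically but is unbound
    except NameError as exc:
        results.append(type(exc).__name__)

    for i in x:  # this binding is what makes i local to f
        pass

    results.append(g())  # called after the loop: i is bound to the last item
    return results

print(f([1, 2, 3]))  # ['NameError', 3]
```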
    >>> i = 6
    >>> def f(x):
    ...     def g():
    ...         print i
    ...
When g() doesn't find i in its inner scope, it dynamically searches outwards, finding the i in f's scope, which has been bound to 3 through the i = x assignment.
But changing the order of the last two statements in f causes an error:
    >>> i = 6
    >>> def f(x):
    ...     def g():
    ...         print i
    ...
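Both orderings can be sketched in Python 3. This is a reconstruction from the surrounding description (the assignment i = x, a call to g(), and the argument 3), not the original listing; the names `f_assign_first` and `f_call_first` are made up so the two variants can sit side by side.

```python
i = 6  # global binding; hidden inside both functions below

def f_assign_first(x):
    def g():
        return i
    i = x        # bind the enclosing i first...
    return g()   # ...then call g(), which finds it

def f_call_first(x):
    def g():
        return i
    result = g()  # the enclosing i is not bound yet
    i = x
    return result

print(f_assign_first(3))  # 3

try:
    f_call_first(3)
except NameError as exc:
    print("NameError:", exc)

# In both variants i is classified statically as a variable of the enclosing
# function that a nested function uses, regardless of statement order.
print(f_call_first.__code__.co_cellvars)  # ('i',)
```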
Remembering that PEP 227 said "The name resolution rules are typical for statically scoped languages," let's look at the (semi-)equivalent C version:
// nested.c
Compile and run:
    $ gcc nested.c -o nested
    $ ./nested
    134520820
    3
So while C will happily use an unbound variable (using whatever happened to be stored there before: 134520820, in this case), Python (thankfully) refuses.
As an interesting side note, statically nested scopes enable what Alex Martelli has called "the single most important optimization performed by the Python compiler: a function's local variables are not stored in a dict, they are in a dense vector of values, and every local variable access uses an index in this vector, not a name lookup."
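The difference between index-based and name-based access can be glimpsed with the standard `dis` module. This is a rough Python 3 sketch; the function names are hypothetical, and the exact opcode names vary between CPython versions, so no specific output is shown.

```python
import dis

counter = 0  # module-level name, looked up by name at run time

def read_local(value):
    return value    # local variable: compiled to an index-based "fast" access

def read_global():
    return counter  # global variable: compiled to a name-based lookup

local_ops = [ins.opname for ins in dis.get_instructions(read_local)]
global_ops = [ins.opname for ins in dis.get_instructions(read_global)]

print(local_ops)    # expect an opcode containing FAST
print(global_ops)   # expect an opcode starting with LOAD_GLOBAL
print(read_local.__code__.co_varnames)  # the names behind the local slots
```

On CPython, the local read typically compiles to a `LOAD_FAST`-style instruction carrying a slot index, while the global read compiles to `LOAD_GLOBAL`.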