Understanding a problem with namedtuple typename and pickle in Python

Earlier today I had trouble pickling a namedtuple instance. As a sanity check, I tried running some code that was posted in another answer. Here it is, simplified a little more:

    from collections import namedtuple
    import pickle

    P = namedtuple("P", "one two three four")

    def pickle_test():
        abe = P("abraham", "lincoln", "vampire", "hunter")
        f = open('abe.pickle', 'w')
        pickle.dump(abe, f)
        f.close()

    pickle_test()

Then I changed two lines of this to use my named tuple:

    from collections import namedtuple
    import pickle

    P = namedtuple("my_typename", "A B C")

    def pickle_test():
        abe = P("ONE", "TWO", "THREE")
        f = open('abe.pickle', 'w')
        pickle.dump(abe, f)
        f.close()

    pickle_test()

However, this gave me an error:

      File "/path/to/anaconda/lib/python2.7/pickle.py", line 748, in save_global
        (obj, module, name))
    pickle.PicklingError: Can't pickle <class '__main__.my_typename'>: it's not found as __main__.my_typename

That is, the pickle module is looking for my_typename. I changed the line P = namedtuple("my_typename", "A B C") to P = namedtuple("P", "A B C"), and it worked.

I looked at the source of namedtuple.py, and at the end there is something that looks relevant, but I do not quite understand what is happening:

    # For pickling to work, the __module__ variable needs to be set to the frame
    # where the named tuple is created.  Bypass this step in enviroments where
    # sys._getframe is not defined (Jython for example) or sys._getframe is not
    # defined for arguments greater than 0 (IronPython).
    try:
        result.__module__ = _sys._getframe(1).f_globals.get('__name__', '__main__')
    except (AttributeError, ValueError):
        pass

    return result

So my question is: what exactly is going on here? Why does the typename argument need to match the name that the factory's result is assigned to for this to work?

1 answer

In the section titled "What can be pickled and unpickled?", the Python documentation indicates that only “classes that are defined at the top level of a module” can be pickled. namedtuple() is a factory function that effectively defines a class (my_typename(tuple) in your second example), but it does not assign the manufactured type to a variable named my_typename at the top level of the module.

This matters because pickle saves only the “fully qualified” name of such things, not their code, and they must be importable under that name from the module they are in so that they can be unpickled later (hence the requirement that the module contain the named object at the top level).
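The point about names can be checked directly. The following is a minimal sketch, assuming Python 3.6 or later (it relies on the `module=` keyword of `namedtuple`, which does not exist in the Python 2.7 used in the question). The module name `demo_mod` is a hypothetical stand-in, registered by hand so the example does not depend on how the script is run.

```python
import pickle
import sys
import types
from collections import namedtuple

# Hypothetical stand-in module (an assumption for this sketch), registered
# manually so that pickle can look names up in it.
demo = types.ModuleType("demo_mod")
sys.modules["demo_mod"] = demo

# The class is named "my_typename" and claims to live in "demo_mod",
# but demo_mod has no attribute called my_typename yet.
P = namedtuple("my_typename", "A B C", module="demo_mod")
abe = P("ONE", "TWO", "THREE")

try:
    pickle.dumps(abe)
    first_attempt = "pickled"
except pickle.PicklingError:
    first_attempt = "PicklingError"

# Bind the class under the name it reports, then try again.
demo.my_typename = P
payload = pickle.dumps(abe)
restored = pickle.loads(payload)

print(first_attempt)             # PicklingError
print(b"demo_mod" in payload)    # True: the module name is stored
print(b"my_typename" in payload) # True: the class name is stored
print(restored == abe)           # True
```

The payload holds the two name strings and the field values; the class definition itself is looked up again by name when loading.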

This can be illustrated by one way around the problem, which is to change one line of code so that a type named my_typename is defined at the top level:

    P = my_typename = namedtuple("my_typename", "A B C")

Alternatively, you can simply give the namedtuple the name "P" instead of "my_typename":

    P = namedtuple("P", "A B C")
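Both workarounds satisfy the same condition: the class must be reachable in its module under its own `__name__`. A rough way to check this is sketched below, again assuming Python 3.6 or later for the `module=` keyword. The helper `reachable_by_name` and the module `demo_mod` are hypothetical names introduced for illustration, and the check is only an approximation of the lookup pickle performs.

```python
import sys
import types
from collections import namedtuple


def reachable_by_name(cls):
    """Return True if cls is bound as cls.__name__ in the module it claims to live in."""
    module = sys.modules.get(cls.__module__)
    return module is not None and getattr(module, cls.__name__, None) is cls


# Hypothetical stand-in module (an assumption for this sketch).
demo = types.ModuleType("demo_mod")
sys.modules["demo_mod"] = demo

# Mismatch: bound as P, but the class calls itself my_typename.
demo.P = namedtuple("my_typename", "A B C", module="demo_mod")
mismatch = reachable_by_name(demo.P)

# First workaround: also bind the class under its typename.
demo.my_typename = demo.P
with_alias = reachable_by_name(demo.P)

# Second workaround: make the typename match the variable name.
demo.P = namedtuple("P", "A B C", module="demo_mod")
with_matching_name = reachable_by_name(demo.P)

print(mismatch, with_alias, with_matching_name)  # False True True
```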

As for the namedtuple.py source code you were looking at: it tries to determine the name of the module the caller (the creator of the namedtuple) is in, because the author knows that pickle may try to use it to import the definition needed for unpickling, and that people usually assign the result to a variable with the same name they passed to the factory function (which you did not do in the second example).
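The two pieces of information described above can be inspected on the class itself. This is a small sketch, not taken from the original answer; the value of `__module__` depends on where the code is run, so it is printed rather than assumed.

```python
from collections import namedtuple

P = namedtuple("my_typename", "A B C")

# The variable is called P, but the class only records the typename it was
# given and the module that namedtuple() was called from.
class_name = P.__name__     # "my_typename"
module_name = P.__module__  # the caller's module, e.g. "__main__" in a script

# These two strings are what pickle uses to find the class again;
# the variable name "P" is not stored anywhere on the class.
print(module_name, class_name)
```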


Source: https://habr.com/ru/post/958635/

