All are good answers, but they ignore the question:
Also, how do classes derived from dict pickle?
They are pickled by reference, like any other class. If you look at the pickle, you will see what python does.
>>> class MyDict(dict):
...   def __repr__(self):
...     return "MyDict({})".format(dict(i for i in self.items()))
... 
>>> m = MyDict(a=1,b=2)
>>> m
MyDict({'a': 1, 'b': 2})
>>> import pickle
>>>
Or we could do the same thing, but look at the disassembled pickle. There you can see exactly which instructions are stored.
>>> import pickletools
>>> pickletools.dis(pickle.dumps(m))
    0: c    GLOBAL     'copy_reg _reconstructor'
   25: p    PUT        0
   28: (    MARK
   29: c        GLOBAL     '__main__ MyDict'
   46: p        PUT        1
   49: c        GLOBAL     '__builtin__ dict'
   67: p        PUT        2
   70: (        MARK
   71: d            DICT       (MARK at 70)
   72: p        PUT        3
   75: t        TUPLE      (MARK at 28)
   76: p    PUT        4
   79: R    REDUCE
   80: p    PUT        5
   83: .    STOP
highest protocol among opcodes = 0
>>> pickletools.dis(pickle.dumps(MyDict))
    0: c    GLOBAL     '__main__ MyDict'
   17: p    PUT        0
   20: .    STOP
highest protocol among opcodes = 0
The class is definitely stored by reference, even though it derives from dict instead of object. The reference is by name, which means that once the __main__ session is closed, the class definition is lost, and any pickle that depends on MyDict will not load.
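To make the consequence of storing by name concrete, here is a minimal sketch, assuming Python 3 (the sessions in this answer are Python 2.7). It imitates a session with a throwaway module; the module name fake_session is hypothetical and exists only for this illustration.

```python
import pickle
import sys
import types

# Stand-in for an interactive session: a throwaway module registered under a
# hypothetical name, so that pickle can look the class up by name.
session = types.ModuleType("fake_session")
sys.modules["fake_session"] = session
exec("class MyDict(dict):\n    pass\n", session.__dict__)

m = session.MyDict(a=1, b=2)
data = pickle.dumps(m)

# While the definition exists, the pickle loads fine.
restored = pickle.loads(data)

# Simulate the session ending: the class definition disappears.
del session.MyDict
try:
    pickle.loads(data)
    load_failed = False
except (AttributeError, pickle.UnpicklingError):
    load_failed = True

# Clean up the throwaway module.
del sys.modules["fake_session"]

print("module and class names are in the pickle:",
      b"fake_session" in data and b"MyDict" in data)
print("load failed after the definition was lost:", load_failed)
```

The pickle carries only the module and class names, so loading depends on the same definition being available under the same name.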
Now let's look at the dict. A dict is pickled by first relying on python knowing how to serialize fundamental objects like dict (as mentioned in other answers), and then serializing the contents. You can see that there are two strings, which python also inherently knows how to serialize.
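As a rough illustration of that point, the sketch below (assuming Python 3 and protocol 2, so the exact opcodes differ from a Python 2.7 protocol 0 pickle) lists the opcode names for a plain dict and checks that none of them is a lookup by name.

```python
import pickle
import pickletools

d = {"a": 1, "b": 2}
data = pickle.dumps(d, protocol=2)

# genops yields (opcode, argument, position) for each instruction in the pickle.
names = [opcode.name for opcode, arg, pos in pickletools.genops(data)]
print(names)

# A plain dict needs no GLOBAL opcode: nothing is looked up by name to rebuild it.
uses_global = any(name in ("GLOBAL", "STACK_GLOBAL") for name in names)
print("needs a name lookup:", uses_global)

# The round trip gives back an equal dict.
round_trip_ok = pickle.loads(data) == d
```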
This means that if a dict contains unserializable objects, pickling it will fail.
>>> d['c'] = MyDict.__repr__
>>> d
{'a': 1, 'c': <unbound method MyDict.__repr__>, 'b': 2}
>>> pickle.dumps(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 663, in _batch_setitems
    save(v)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 306, in save
    rv = reduce(self.proto)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle instance method objects
We can do better, by the way, if we use a better serializer. Using dill instead of pickle allows most objects to be serialized. The pickle of the dict is much more complicated, as you can see below.
>>> import dill
>>> dill.dumps(d)
'\x80\x02}q\x00(U\x01aq\x01K\x01U\x01cq\x02cdill.dill\n_load_type\nq\x03U\nMethodTypeq\x04\x85q\x05Rq\x06cdill.dill\n_create_function\nq\x07(cdill.dill\n_unmarshal\nq\x08T$\x01\x00\x00c\x01\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x00C\x00\x00\x00s#\x00\x00\x00d\x01\x00j\x00\x00t\x01\x00d\x02\x00\x84\x00\x00|\x00\x00j\x02\x00\x83\x00\x00D\x83\x01\x00\x83\x01\x00\x83\x01\x00S(\x03\x00\x00\x00Ns\n\x00\x00\x00MyDict({})c\x01\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00s\x00\x00\x00s\x15\x00\x00\x00|\x00\x00]\x0b\x00}\x01\x00|\x01\x00V\x01q\x03\x00d\x00\x00S(\x01\x00\x00\x00N(\x00\x00\x00\x00(\x02\x00\x00\x00t\x02\x00\x00\x00.0t\x01\x00\x00\x00i(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>s\t\x00\x00\x00<genexpr>\x03\x00\x00\x00s\x02\x00\x00\x00\x06\x00(\x03\x00\x00\x00t\x06\x00\x00\x00formatt\x04\x00\x00\x00dictt\x05\x00\x00\x00items(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__repr__\x02\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\t\x85q\nRq\x0bc__builtin__\n__main__\nU\x08__repr__q\x0cNN}q\rtq\x0eRq\x0fNcdill.dill\n_create_type\nq\x10(h\x03U\x08TypeTypeq\x11\x85q\x12Rq\x13U\x06MyDictq\x14h\x03U\x08DictTypeq\x15\x85q\x16Rq\x17\x85q\x18}q\x19(U\n__module__q\x1aU\x08__main__q\x1bh\x0ch\x0fU\x07__doc__q\x1cNutq\x1dRq\x1e\x87q\x1fRq U\x01bq!K\x02u.'
>>> pickletools.dis(dill.dumps(d))
    0: \x80 PROTO      2
    2: }    EMPTY_DICT
    3: q    BINPUT     0
    5: (    MARK
    6: U        SHORT_BINSTRING 'a'
    9: q        BINPUT     1
   11: K        BININT1    1
   13: U        SHORT_BINSTRING 'c'
   16: q        BINPUT     2
   18: c        GLOBAL     'dill.dill _load_type'
   40: q        BINPUT     3
   42: U        SHORT_BINSTRING 'MethodType'
   54: q        BINPUT     4
   56: \x85     TUPLE1
   57: q        BINPUT     5
   59: R        REDUCE
   60: q        BINPUT     6
   62: c        GLOBAL     'dill.dill _create_function'
   90: q        BINPUT     7
   92: (        MARK
   93: c            GLOBAL     'dill.dill _unmarshal'
  115: q            BINPUT     8
  117: T            BINSTRING  'c\x01\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x00C\x00\x00\x00s#\x00\x00\x00d\x01\x00j\x00\x00t\x01\x00d\x02\x00\x84\x00\x00|\x00\x00j\x02\x00\x83\x00\x00D\x83\x01\x00\x83\x01\x00\x83\x01\x00S(\x03\x00\x00\x00Ns\n\x00\x00\x00MyDict({})c\x01\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00s\x00\x00\x00s\x15\x00\x00\x00|\x00\x00]\x0b\x00}\x01\x00|\x01\x00V\x01q\x03\x00d\x00\x00S(\x01\x00\x00\x00N(\x00\x00\x00\x00(\x02\x00\x00\x00t\x02\x00\x00\x00.0t\x01\x00\x00\x00i(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>s\t\x00\x00\x00<genexpr>\x03\x00\x00\x00s\x02\x00\x00\x00\x06\x00(\x03\x00\x00\x00t\x06\x00\x00\x00formatt\x04\x00\x00\x00dictt\x05\x00\x00\x00items(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__repr__\x02\x00\x00\x00s\x02\x00\x00\x00\x00\x01'
  414: q            BINPUT     9
  416: \x85         TUPLE1
  417: q            BINPUT     10
  419: R            REDUCE
  420: q            BINPUT     11
  422: c            GLOBAL     '__builtin__ __main__'
  444: U            SHORT_BINSTRING '__repr__'
  454: q            BINPUT     12
  456: N            NONE
  457: N            NONE
  458: }            EMPTY_DICT
  459: q            BINPUT     13
  461: t            TUPLE      (MARK at 92)
  462: q        BINPUT     14
  464: R        REDUCE
  465: q        BINPUT     15
  467: N        NONE
  468: c        GLOBAL     'dill.dill _create_type'
  492: q        BINPUT     16
  494: (        MARK
  495: h            BINGET     3
  497: U            SHORT_BINSTRING 'TypeType'
  507: q            BINPUT     17
  509: \x85         TUPLE1
  510: q            BINPUT     18
  512: R            REDUCE
  513: q            BINPUT     19
  515: U            SHORT_BINSTRING 'MyDict'
  523: q            BINPUT     20
  525: h            BINGET     3
  527: U            SHORT_BINSTRING 'DictType'
  537: q            BINPUT     21
  539: \x85         TUPLE1
  540: q            BINPUT     22
  542: R            REDUCE
  543: q            BINPUT     23
  545: \x85         TUPLE1
  546: q            BINPUT     24
  548: }            EMPTY_DICT
  549: q            BINPUT     25
  551: (            MARK
  552: U                SHORT_BINSTRING '__module__'
  564: q                BINPUT     26
  566: U                SHORT_BINSTRING '__main__'
  576: q                BINPUT     27
  578: h                BINGET     12
  580: h                BINGET     15
  582: U                SHORT_BINSTRING '__doc__'
  591: q                BINPUT     28
  593: N                NONE
  594: u                SETITEMS   (MARK at 551)
  595: t            TUPLE      (MARK at 494)
  596: q        BINPUT     29
  598: R        REDUCE
  599: q        BINPUT     30
  601: \x87     TUPLE3
  602: q        BINPUT     31
  604: R        REDUCE
  605: q        BINPUT     32
  607: U        SHORT_BINSTRING 'b'
  610: q        BINPUT     33
  612: K        BININT1    2
  614: u        SETITEMS   (MARK at 5)
  615: .    STOP
highest protocol among opcodes = 2
dill can serialize the class's method because additional functions have been registered with dill that know how to pickle and unpickle a wider range of objects; you can see them in the disassembled code (they start with dill.dill). This is a much larger pickle, but it generally works for whatever contents you put in a dict.
>>> from numpy import *
>>> everything = dill.dumps(globals())
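The idea of registering extra functions that teach a pickler about new types can be sketched with the standard library alone. This is not dill's implementation; it is a minimal Python 3 sketch of the same kind of mechanism, using a `pickle.Pickler` subclass with its own `dispatch_table`. The names `reduce_code`, `CodePickler` and `dumps_with_code` are hypothetical, and `marshal` is used here only because code objects are one type that plain pickle rejects.

```python
import copyreg
import io
import marshal
import pickle
import types


def reduce_code(code):
    # Recipe for the pickler: rebuild the object by calling marshal.loads(<bytes>).
    return marshal.loads, (marshal.dumps(code),)


class CodePickler(pickle.Pickler):
    # Private copy of the reducer registry, extended with one extra type.
    dispatch_table = copyreg.dispatch_table.copy()
    dispatch_table[types.CodeType] = reduce_code


def dumps_with_code(obj):
    buf = io.BytesIO()
    CodePickler(buf, protocol=2).dump(obj)
    return buf.getvalue()


expr = compile("x + 1", "<sketch>", "eval")
d = {"a": 1, "expr": expr}

# Plain pickle has no reducer for code objects.
try:
    pickle.dumps(d)
    plain_pickle_ok = True
except TypeError:
    plain_pickle_ok = False

# With the extra reducer registered, the same dict round-trips.
restored = pickle.loads(dumps_with_code(d))
result = eval(restored["expr"], {"x": 41})
print(plain_pickle_ok, result)
```

The resulting pickle refers to `marshal.loads` by name, in the same way the disassembly above refers to helpers under dill.dill, so the loading side needs those helpers to be importable.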
For classes derived from dict, you don't need to worry about unusual objects inside the class methods, because pickle stores the class by reference. However, the contents of the custom dict are still serialized along with the class instance, so you do need to worry about the instance containing unserializable objects.
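One way to see this split between the class and its contents is to look at the recipe pickle follows for an instance. The sketch below assumes Python 3 and protocol 2, and inspects the tuple returned by `__reduce_ex__`; the variable names are only for illustration.

```python
class MyDict(dict):
    pass


m = MyDict(a=1)

# __reduce_ex__ returns the recipe pickle follows for this instance.
reduced = m.__reduce_ex__(2)
constructor, args = reduced[0], reduced[1]

# The fifth element is an iterator over the dict's items: the contents
# travel with the instance and must themselves be picklable.
dict_items = list(reduced[4])

print(constructor.__name__)
print(args)        # the class object itself, which pickle then stores by name
print(dict_items)
```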
Python 2.7.9 (default, Dec 11 2014, 01:21:43) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> import pickle
>>> class MyDict(dict):
...   def __repr__(self):
...     return "MyDict({})".format(dict(i for i in self.items()))
... 
>>> m = MyDict(a = lambda x:x)
>>> m
MyDict({'a': <function <lambda> at 0x10892b230>})
>>> pickle.dumps(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
>>> pickle.dumps(m)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 401, in save_reduce
    save(args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 562, in save_tuple
    save(element)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 663, in _batch_setitems
    save(v)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 748, in save_global
    (obj, module, name))
pickle.PicklingError: Can't pickle <function <lambda> at 0x10892b230>: it's not found as __main__.<lambda>
A lambda cannot be serialized by pickle because it has no name that pickle can refer to. Going back to dill, we see that this works.
>>> import dill
>>> dill.dumps(m)
'\x80\x02cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01U\x08TypeTypeq\x02\x85q\x03Rq\x04U\x06MyDictq\x05h\x01U\x08DictTypeq\x06\x85q\x07Rq\x08\x85q\t}q\n(U\r__slotnames__q\x0b]q\x0cU\n__module__q\rU\x08__main__q\x0eU\x08__repr__q\x0fcdill.dill\n_create_function\nq\x10(cdill.dill\n_unmarshal\nq\x11T$\x01\x00\x00c\x01\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x00C\x00\x00\x00s#\x00\x00\x00d\x01\x00j\x00\x00t\x01\x00d\x02\x00\x84\x00\x00|\x00\x00j\x02\x00\x83\x00\x00D\x83\x01\x00\x83\x01\x00\x83\x01\x00S(\x03\x00\x00\x00Ns\n\x00\x00\x00MyDict({})c\x01\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00s\x00\x00\x00s\x15\x00\x00\x00|\x00\x00]\x0b\x00}\x01\x00|\x01\x00V\x01q\x03\x00d\x00\x00S(\x01\x00\x00\x00N(\x00\x00\x00\x00(\x02\x00\x00\x00t\x02\x00\x00\x00.0t\x01\x00\x00\x00i(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>s\t\x00\x00\x00<genexpr>\x03\x00\x00\x00s\x02\x00\x00\x00\x06\x00(\x03\x00\x00\x00t\x06\x00\x00\x00formatt\x04\x00\x00\x00dictt\x05\x00\x00\x00items(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__repr__\x02\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x12\x85q\x13Rq\x14c__builtin__\n__main__\nh\x0fNN}q\x15tq\x16Rq\x17U\x07__doc__q\x18Nutq\x19Rq\x1a)\x81q\x1bU\x01aq\x1ch\x10(h\x11U\\c\x01\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00C\x00\x00\x00s\x04\x00\x00\x00|\x00\x00S(\x01\x00\x00\x00N(\x00\x00\x00\x00(\x01\x00\x00\x00t\x01\x00\x00\x00x(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00<lambda>\x01\x00\x00\x00s\x00\x00\x00\x00q\x1d\x85q\x1eRq\x1fc__builtin__\n__main__\nU\x08<lambda>q NN}q!tq"Rq#s}q$b.'
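The point about names can also be checked without dill. The following is a small sketch, assuming Python 3 and only the standard library: a dict holding a lambda is rejected by pickle, while a dict holding a function that pickle can find by name (here `math.sqrt`, chosen only as an example) goes through. The helper `can_pickle` is hypothetical.

```python
import math
import pickle


def can_pickle(obj):
    # True if plain pickle accepts the object, False if it refuses.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


anonymous = {"a": lambda x: x}   # the lambda has no importable name
named = {"a": math.sqrt}         # stored as a reference to math.sqrt

print(can_pickle(anonymous))
print(can_pickle(named))

# The named function comes back as the very same object, found by name.
restored = pickle.loads(pickle.dumps(named))
```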