How are dict objects pickled?

After reading the pickle documentation, I got the impression that a class must implement either __reduce__ or __getstate__ in order to be pickled properly. But how does pickling of dictionaries work, then? They do not have either of these attributes:

    > dict(a=1).__reduce__()
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    /home/daniyar/work/Apr24/<ipython-input-30-bc1cbd43305b> in <module>()
    ----> 1 dict(a=1).__reduce__()

    /usr/lib/python2.6/copy_reg.pyc in _reduce_ex(self, proto)
         68     else:
         69         if base is self.__class__:
    ---> 70             raise TypeError, "can't pickle %s objects" % base.__name__
         71         state = base(self)
         72         args = (self.__class__, base, state)

    TypeError: can't pickle dict objects

    > dict(a=1).__getstate__()
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    /home/daniyar/work/Apr24/<ipython-input-31-00932fb40067> in <module>()
    ----> 1 dict(a=1).__getstate__()

    AttributeError: 'dict' object has no attribute '__getstate__'

Also, how do classes derived from dict pickle?

+6
5 answers

The pickle module handles a number of types "natively". Types it does not handle natively have to implement the "pickle protocol". Dicts and simple subclasses of them are handled natively.
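To make that concrete, here is a minimal sketch, assuming Python 3 (the thread itself uses Python 2). The `MyDict` subclass is just an illustration and adds no pickling hooks of its own; both objects should still round-trip.

```python
import pickle


class MyDict(dict):
    """A simple dict subclass that adds no pickling hooks of its own."""


plain = {"a": 1, "b": [2, 3]}
custom = MyDict(a=1, b=[2, 3])

# Neither class implements custom pickling logic here;
# pickle handles both without any help.
plain_copy = pickle.loads(pickle.dumps(plain))
custom_copy = pickle.loads(pickle.dumps(custom))

print(plain_copy == plain, type(plain_copy).__name__)
print(custom_copy == custom, type(custom_copy).__name__)
```

Note that the subclass instance comes back as a `MyDict`, not as a plain `dict`.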

+8

The __reduce__ and __getstate__ methods are meant to be the lowest-level pickling hooks, which you implement on your own classes when they need special treatment from the interpreter.

For example, if an instance of an extension class sits inside a dictionary you are trying to pickle, the entire dictionary becomes unpicklable unless your class implements these methods to say how it should be pickled.

The interpreter already knows how to pickle the built-in types, and you pickle them by calling pickle.dump or pickle.dumps, not by calling __reduce__ or __getstate__ directly.
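A rough sketch of that situation, assuming Python 3. The two counter classes and the `can_pickle` helper are made up for illustration; a `threading.Lock` stands in for any attribute pickle cannot serialize on its own.

```python
import pickle
import threading


class PlainCounter:
    """Holds a lock, which pickle cannot serialize."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()


class ReducibleCounter(PlainCounter):
    """Same data, but tells pickle how to rebuild the instance."""

    def __reduce__(self):
        # (callable, args): pickle stores a reference to the class plus the
        # argument tuple, and calls ReducibleCounter(value) when loading.
        return (ReducibleCounter, (self.value,))


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


# One unpicklable value is enough to make the whole dict unpicklable.
print(can_pickle({"counter": PlainCounter(3)}))
print(can_pickle({"counter": ReducibleCounter(3)}))

restored = pickle.loads(pickle.dumps({"counter": ReducibleCounter(3)}))
print(restored["counter"].value)
```

The dict itself needs no hooks in either case; only the object inside it does.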

+3

Pickling does not require either __reduce__ or __getstate__. These are methods you can use to control pickling, but pickle works on built-in types just fine without them.

+3

Here is a helpful answer I found elsewhere:

This is what should go inside __getstate__ and __setstate__. Even if for some reason the default behavior does not work for you as it should, you can write them from scratch, for example:

    def __getstate__(self):
        result = self.__dict__.copy()
        return result

    def __setstate__(self, state):
        # 'state' is whatever __getstate__ returned
        self.__dict__ = state
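As a slightly fuller sketch of the same pattern, assuming Python 3: the `Report` class and its `cache` attribute are hypothetical, chosen only to show state being dropped on dump and rebuilt on load.

```python
import pickle


class Report:
    def __init__(self, title):
        self.title = title
        # Derived data that is cheap to rebuild, so not worth storing.
        self.cache = {"rendered": title.upper()}

    def __getstate__(self):
        # Start from a copy of the instance dict and drop the cache.
        state = self.__dict__.copy()
        del state["cache"]
        return state

    def __setstate__(self, state):
        # Restore the stored attributes, then rebuild the cache.
        self.__dict__.update(state)
        self.cache = {"rendered": self.title.upper()}


original = Report("quarterly")
payload = pickle.dumps(original)
clone = pickle.loads(payload)

print(clone.title, clone.cache)
```

Copying `__dict__` before modifying it matters: deleting the key from the live instance dict would strip the cache from the original object as well.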
+1

These are all good answers, but they ignore this part of the question:

Also, how do classes derived from dict pickle?

The class is pickled by reference, like any other class. If you look at the pickle itself, you can see what Python does.

    >>> class MyDict(dict):
    ...   def __repr__(self):
    ...     return "MyDict({})".format(dict(i for i in self.items()))
    ...
    >>> m = MyDict(a=1,b=2)
    >>> m
    MyDict({'a': 1, 'b': 2})
    >>> import pickle
    >>> # reconstructor called on class MyDict that lives in __main__
    >>> # and contains a __builtin__ dict with contents ('a' and 'b')
    >>> pickle.dumps(m)
    "ccopy_reg\n_reconstructor\np0\n(c__main__\nMyDict\np1\nc__builtin__\ndict\np2\n(dp3\nS'a'\np4\nI1\nsS'b'\np5\nI2\nstp6\nRp7\n."
    >>> m.clear()
    >>> # removing the contents, to show how that affects the pickle
    >>> pickle.dumps(m)
    'ccopy_reg\n_reconstructor\np0\n(c__main__\nMyDict\np1\nc__builtin__\ndict\np2\n(dp3\ntp4\nRp5\n.'
    >>> # now, just looking at the class itself, you can see it by reference
    >>> pickle.dumps(MyDict)
    'c__main__\nMyDict\np0\n.'

Or we could do the same thing but look at the disassembled pickles. You can see exactly which instructions are stored.

    >>> import pickletools
    >>> pickletools.dis(pickle.dumps(m))
        0: c    GLOBAL     'copy_reg _reconstructor'
       25: p    PUT        0
       28: (    MARK
       29: c    GLOBAL     '__main__ MyDict'
       46: p    PUT        1
       49: c    GLOBAL     '__builtin__ dict'
       67: p    PUT        2
       70: (    MARK
       71: d    DICT       (MARK at 70)
       72: p    PUT        3
       75: t    TUPLE      (MARK at 28)
       76: p    PUT        4
       79: R    REDUCE
       80: p    PUT        5
       83: .    STOP
    highest protocol among opcodes = 0
    >>> pickletools.dis(pickle.dumps(MyDict))
        0: c    GLOBAL     '__main__ MyDict'
       17: p    PUT        0
       20: .    STOP
    highest protocol among opcodes = 0

The class is definitely stored by reference, even though it derives from dict instead of object. The reference is by name, which means that once the __main__ session is closed, the class definition is lost, and any pickle that depends on MyDict will fail to load.
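The by-reference behaviour can also be checked without reading opcodes. This is a small sketch, assuming Python 3 and the default pickle protocol; it only inspects the raw bytes for names.

```python
import pickle


class MyDict(dict):
    def __repr__(self):
        return "MyDict({})".format(dict(self.items()))


class_payload = pickle.dumps(MyDict)
instance_payload = pickle.dumps(MyDict(a=1))

# The class pickle carries the class name but none of the class body.
print(b"MyDict" in class_payload)
print(b"__repr__" in class_payload)

# The instance pickle refers to the class by name as well.
print(b"MyDict" in instance_payload)

# Loading looks the name up again, so we get the very same class object.
print(pickle.loads(class_payload) is MyDict)
```

Because only the name is stored, loading such a pickle in a process where that name cannot be found is expected to fail.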

Now let's look at the dict. A dict is pickled by first relying on Python knowing how to serialize fundamental objects like dict (as mentioned in other answers), and then by going on to serialize the contents. You can see that it contains two strings, which Python also inherently knows how to serialize.

This means that if a dict contains unserializable objects, pickling it will fail.

    >>> d['c'] = MyDict.__repr__
    >>> d
    {'a': 1, 'c': <unbound method MyDict.__repr__>, 'b': 2}
    >>> pickle.dumps(d)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
        Pickler(file, protocol).dump(obj)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
        self.save(obj)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
        f(self, obj) # Call unbound method with explicit self
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict
        self._batch_setitems(obj.iteritems())
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 663, in _batch_setitems
        save(v)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 306, in save
        rv = reduce(self.proto)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
        raise TypeError, "can't pickle %s objects" % base.__name__
    TypeError: can't pickle instance method objects
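For readers on Python 3, a comparable sketch follows. It is an adaptation, not the session above: in Python 3, `MyDict.__repr__` is a plain function that pickle can usually reference by qualified name, so a lambda is used here as the unpicklable value instead.

```python
import pickle
import pickletools

d = {"a": 1, "b": 2}

# List the opcode names pickle emits for a plain dict: the dict opcode
# comes first, then the keys and values, then the opcode that stores them.
opcodes = [op.name for op, arg, pos in
           pickletools.genops(pickle.dumps(d, protocol=2))]
print(opcodes)

# Add a value pickle cannot serialize: a lambda has no importable name.
d["c"] = lambda x: x
try:
    pickle.dumps(d)
    failed = False
except (pickle.PicklingError, AttributeError) as exc:
    failed = True
    print("pickling failed:", exc)
```

The failure comes from the value, not from the dict: pickle only gives up when it reaches the lambda while walking the contents.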

We can do better, by the way, with a better serializer. Using dill instead of pickle allows most objects to be serialized. The pickle of the dict becomes much more complicated, as you can see below.

    >>> import dill
    >>> dill.dumps(d)
    '\x80\x02}q\x00(U\x01aq\x01K\x01U\x01cq\x02cdill.dill\n_load_type\nq\x03U\nMethodTypeq\x04\x85q\x05Rq\x06cdill.dill\n_create_function\nq\x07(cdill.dill\n_unmarshal\nq\x08T$\x01\x00\x00c\x01\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x00C\x00\x00\x00s#\x00\x00\x00d\x01\x00j\x00\x00t\x01\x00d\x02\x00\x84\x00\x00|\x00\x00j\x02\x00\x83\x00\x00D\x83\x01\x00\x83\x01\x00\x83\x01\x00S(\x03\x00\x00\x00Ns\n\x00\x00\x00MyDict({})c\x01\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00s\x00\x00\x00s\x15\x00\x00\x00|\x00\x00]\x0b\x00}\x01\x00|\x01\x00V\x01q\x03\x00d\x00\x00S(\x01\x00\x00\x00N(\x00\x00\x00\x00(\x02\x00\x00\x00t\x02\x00\x00\x00.0t\x01\x00\x00\x00i(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>s\t\x00\x00\x00<genexpr>\x03\x00\x00\x00s\x02\x00\x00\x00\x06\x00(\x03\x00\x00\x00t\x06\x00\x00\x00formatt\x04\x00\x00\x00dictt\x05\x00\x00\x00items(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__repr__\x02\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\t\x85q\nRq\x0bc__builtin__\n__main__\nU\x08__repr__q\x0cNN}q\rtq\x0eRq\x0fNcdill.dill\n_create_type\nq\x10(h\x03U\x08TypeTypeq\x11\x85q\x12Rq\x13U\x06MyDictq\x14h\x03U\x08DictTypeq\x15\x85q\x16Rq\x17\x85q\x18}q\x19(U\n__module__q\x1aU\x08__main__q\x1bh\x0ch\x0fU\x07__doc__q\x1cNutq\x1dRq\x1e\x87q\x1fRq U\x01bq!K\x02u.'
    >>> pickletools.dis(dill.dumps(d))
        0: \x80 PROTO      2
        2: }    EMPTY_DICT
        3: q    BINPUT     0
        5: (    MARK
        6: U    SHORT_BINSTRING 'a'
        9: q    BINPUT     1
       11: K    BININT1    1
       13: U    SHORT_BINSTRING 'c'
       16: q    BINPUT     2
       18: c    GLOBAL     'dill.dill _load_type'
       40: q    BINPUT     3
       42: U    SHORT_BINSTRING 'MethodType'
       54: q    BINPUT     4
       56: \x85 TUPLE1
       57: q    BINPUT     5
       59: R    REDUCE
       60: q    BINPUT     6
       62: c    GLOBAL     'dill.dill _create_function'
       90: q    BINPUT     7
       92: (    MARK
       93: c    GLOBAL     'dill.dill _unmarshal'
      115: q    BINPUT     8
      117: T    BINSTRING  'c\x01\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x00C\x00\x00\x00s#\x00\x00\x00d\x01\x00j\x00\x00t\x01\x00d\x02\x00\x84\x00\x00|\x00\x00j\x02\x00\x83\x00\x00D\x83\x01\x00\x83\x01\x00\x83\x01\x00S(\x03\x00\x00\x00Ns\n\x00\x00\x00MyDict({})c\x01\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00s\x00\x00\x00s\x15\x00\x00\x00|\x00\x00]\x0b\x00}\x01\x00|\x01\x00V\x01q\x03\x00d\x00\x00S(\x01\x00\x00\x00N(\x00\x00\x00\x00(\x02\x00\x00\x00t\x02\x00\x00\x00.0t\x01\x00\x00\x00i(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>s\t\x00\x00\x00<genexpr>\x03\x00\x00\x00s\x02\x00\x00\x00\x06\x00(\x03\x00\x00\x00t\x06\x00\x00\x00formatt\x04\x00\x00\x00dictt\x05\x00\x00\x00items(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__repr__\x02\x00\x00\x00s\x02\x00\x00\x00\x00\x01'
      414: q    BINPUT     9
      416: \x85 TUPLE1
      417: q    BINPUT     10
      419: R    REDUCE
      420: q    BINPUT     11
      422: c    GLOBAL     '__builtin__ __main__'
      444: U    SHORT_BINSTRING '__repr__'
      454: q    BINPUT     12
      456: N    NONE
      457: N    NONE
      458: }    EMPTY_DICT
      459: q    BINPUT     13
      461: t    TUPLE      (MARK at 92)
      462: q    BINPUT     14
      464: R    REDUCE
      465: q    BINPUT     15
      467: N    NONE
      468: c    GLOBAL     'dill.dill _create_type'
      492: q    BINPUT     16
      494: (    MARK
      495: h    BINGET     3
      497: U    SHORT_BINSTRING 'TypeType'
      507: q    BINPUT     17
      509: \x85 TUPLE1
      510: q    BINPUT     18
      512: R    REDUCE
      513: q    BINPUT     19
      515: U    SHORT_BINSTRING 'MyDict'
      523: q    BINPUT     20
      525: h    BINGET     3
      527: U    SHORT_BINSTRING 'DictType'
      537: q    BINPUT     21
      539: \x85 TUPLE1
      540: q    BINPUT     22
      542: R    REDUCE
      543: q    BINPUT     23
      545: \x85 TUPLE1
      546: q    BINPUT     24
      548: }    EMPTY_DICT
      549: q    BINPUT     25
      551: (    MARK
      552: U    SHORT_BINSTRING '__module__'
      564: q    BINPUT     26
      566: U    SHORT_BINSTRING '__main__'
      576: q    BINPUT     27
      578: h    BINGET     12
      580: h    BINGET     15
      582: U    SHORT_BINSTRING '__doc__'
      591: q    BINPUT     28
      593: N    NONE
      594: u    SETITEMS   (MARK at 551)
      595: t    TUPLE      (MARK at 494)
      596: q    BINPUT     29
      598: R    REDUCE
      599: q    BINPUT     30
      601: \x87 TUPLE3
      602: q    BINPUT     31
      604: R    REDUCE
      605: q    BINPUT     32
      607: U    SHORT_BINSTRING 'b'
      610: q    BINPUT     33
      612: K    BININT1    2
      614: u    SETITEMS   (MARK at 5)
      615: .    STOP
    highest protocol among opcodes = 2

dill can serialize the method because it registers additional functions that know how to pickle and unpickle a wider range of objects; you can see them in the disassembled code (they start with dill.dill). This is a much larger pickle, but it generally works for whatever content you put in a dict.

    >>> from numpy import *
    >>> everything = dill.dumps(globals())

For classes derived from dict, you don't need to worry about unusual objects inside the class's methods, since the class is stored by reference. However, the contents of the custom dict are still serialized along with the class instance, so you do need to worry about your instances containing unserializable objects.

    Python 2.7.9 (default, Dec 11 2014, 01:21:43)
    [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
    >>> import pickle
    >>> class MyDict(dict):
    ...   def __repr__(self):
    ...     return "MyDict({})".format(dict(i for i in self.items()))
    ...
    >>> m = MyDict(a = lambda x:x)
    >>> m
    MyDict({'a': <function <lambda> at 0x10892b230>})
    >>> pickle.dumps(a)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'a' is not defined
    >>> pickle.dumps(m)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
        Pickler(file, protocol).dump(obj)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
        self.save(obj)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 331, in save
        self.save_reduce(obj=obj, *rv)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 401, in save_reduce
        save(args)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
        f(self, obj) # Call unbound method with explicit self
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 562, in save_tuple
        save(element)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
        f(self, obj) # Call unbound method with explicit self
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict
        self._batch_setitems(obj.iteritems())
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 663, in _batch_setitems
        save(v)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
        f(self, obj) # Call unbound method with explicit self
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 748, in save_global
        (obj, module, name))
    pickle.PicklingError: Can't pickle <function <lambda> at 0x10892b230>: it not found as __main__.<lambda>

A lambda cannot be pickled because it has no name that pickle can refer to. Going back to dill, we see that this works.

    >>> import dill
    >>> dill.dumps(m)
    '\x80\x02cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01U\x08TypeTypeq\x02\x85q\x03Rq\x04U\x06MyDictq\x05h\x01U\x08DictTypeq\x06\x85q\x07Rq\x08\x85q\t}q\n(U\r__slotnames__q\x0b]q\x0cU\n__module__q\rU\x08__main__q\x0eU\x08__repr__q\x0fcdill.dill\n_create_function\nq\x10(cdill.dill\n_unmarshal\nq\x11T$\x01\x00\x00c\x01\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x00C\x00\x00\x00s#\x00\x00\x00d\x01\x00j\x00\x00t\x01\x00d\x02\x00\x84\x00\x00|\x00\x00j\x02\x00\x83\x00\x00D\x83\x01\x00\x83\x01\x00\x83\x01\x00S(\x03\x00\x00\x00Ns\n\x00\x00\x00MyDict({})c\x01\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00s\x00\x00\x00s\x15\x00\x00\x00|\x00\x00]\x0b\x00}\x01\x00|\x01\x00V\x01q\x03\x00d\x00\x00S(\x01\x00\x00\x00N(\x00\x00\x00\x00(\x02\x00\x00\x00t\x02\x00\x00\x00.0t\x01\x00\x00\x00i(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>s\t\x00\x00\x00<genexpr>\x03\x00\x00\x00s\x02\x00\x00\x00\x06\x00(\x03\x00\x00\x00t\x06\x00\x00\x00formatt\x04\x00\x00\x00dictt\x05\x00\x00\x00items(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__repr__\x02\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x12\x85q\x13Rq\x14c__builtin__\n__main__\nh\x0fNN}q\x15tq\x16Rq\x17U\x07__doc__q\x18Nutq\x19Rq\x1a)\x81q\x1bU\x01aq\x1ch\x10(h\x11U\\c\x01\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00C\x00\x00\x00s\x04\x00\x00\x00|\x00\x00S(\x01\x00\x00\x00N(\x00\x00\x00\x00(\x01\x00\x00\x00t\x01\x00\x00\x00x(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00<lambda>\x01\x00\x00\x00s\x00\x00\x00\x00q\x1d\x85q\x1eRq\x1fc__builtin__\n__main__\nU\x08<lambda>q NN}q!tq"Rq#s}q$b.'
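If dill is not an option, one standard-library workaround is to give the function a name pickle can refer to. This is only a sketch, assuming Python 3; `identity` is a made-up module-level function standing in for the lambda.

```python
import pickle


class MyDict(dict):
    pass


def identity(x):
    """Module-level function: pickle can refer to it by name."""
    return x


m = MyDict(a=identity)

# The function is stored by reference, just like the class, so the
# instance pickles with plain pickle and no extra hooks.
restored = pickle.loads(pickle.dumps(m))

print(type(restored).__name__)
print(restored["a"] is identity)
print(restored["a"](42))
```

The trade-off is the same as for classes: the pickle holds only the name, so `identity` has to be importable wherever the pickle is loaded, whereas dill stores the function's code itself.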
+1

Source: https://habr.com/ru/post/914418/

