Python dictionary that saves only changes

I wrote code that optimizes a design using the Inspyred library and its implementation of genetic algorithms. In essence, the optimization process creates a large number of variations of a single data structure, which in my case is a nested dictionary.

To reduce the memory used in the process, I am trying to create a kind of differential dictionary that stores only the items that differ from a base dictionary. The reason is that in a typical case 95% of the data in the data structure is not modified in any of the variations, but any part of the data structure may contain changes. So, for flexibility, I would like a data type that behaves more or less like a dictionary but stores only the changes.

This is the result of my attempt to create this:

    #!/usr/bin/python
    import unittest
    import copy

    global_base={}

    class DifferentialDict(object):
        """
        dictionary with differential storage of changes
        all DifferentialDict objects have the same base dictionary
        """

        def __init__(self,base=None):
            global global_base
            self.changes={}
            if not base==None:
                self.set_base(base)

        def set_base(self,base):
            global global_base
            global_base=copy.deepcopy(base)

        def __copy__(self):
            return self

        def __deepcopy__(self):
            new=DifferentialDict()
            new.changes=copy.deepcopy(self.changes)
            return new

        def get(self):
            global global_base
            outdict=copy.deepcopy(global_base)
            for key in self.changes:
                outdict[key]=self.changes[key]
            return outdict

        def __setitem__(self,key,value):
            self.changes[key]=value

        def __getitem__(self,key):
            global global_base
            if key in self.changes:
                return self.changes[key]
            else:
                return global_base[key]

    class TestDifferentialDict(unittest.TestCase):

        def test1(self):
            ldict={'a':{1:2,3:4},'b':{'c':[1,2,3],'d':'abc'}}
            ddict=DifferentialDict(base=ldict)
            self.assertEqual(ddict['a'],{1:2,3:4})
            ddict['a']=5
            self.assertEqual(ddict['a'],5)

        def test2(self):
            ldict={'a':{1:2,3:4},'b':{'c':[1,2,3],'d':'abc'}}
            ddict1=DifferentialDict(base=ldict)
            ddict2=DifferentialDict(base=ldict)
            ddict1['a'][3]=5
            ddict2['a'][3]=7
            self.assertEqual(ddict1['a'][3],5)
            self.assertEqual(ddict2['a'][3],7)

        def test3(self):
            ldict={'a':{1:2,3:4},'b':{'c':[1,2,3],'d':'abc'}}
            ddict1=DifferentialDict(base=ldict)
            ddict2=ddict1.__deepcopy__()
            ddict1['a'][3]=5
            ddict2['a'][3]=7
            self.assertEqual(ddict1['a'][3],5)
            self.assertEqual(ddict2['a'][3],7)

    if __name__ == "__main__":
        unittest.main()

It works fine for a simple dictionary, but breaks down when further dictionaries are nested inside the main dictionary. I understand that this happens because these second-level dictionaries are real Python dictionaries, not instances of my DifferentialDict, so assignments overwrite entries in global_base instead of being recorded in self.changes. However, they have to be plain dictionaries because of the premise that all DifferentialDict instances share the same base dictionary. I could add an "entry level" key to each DifferentialDict instance, but I feel there is a more elegant solution that eludes me.
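
The aliasing described above can be reproduced without the class at all. This is only a minimal sketch; the names `base`, `changes` and `lookup` are made up for illustration and stand in for global_base, self.changes and __getitem__:

```python
# Minimal reproduction of the aliasing problem (all names are hypothetical).
base = {'a': {1: 2, 3: 4}}
changes = {}          # stands in for self.changes


def lookup(key):
    """Mimics __getitem__: prefer a recorded change, else fall back to the base."""
    return changes[key] if key in changes else base[key]


inner = lookup('a')   # the very same dict object that lives inside base
inner[3] = 5          # plain dict assignment; the wrapper's __setitem__ is never involved

same_object = inner is base['a']    # the lookup returned a reference, not a copy
base_value = base['a'][3]           # the shared base has been modified
recorded_changes = len(changes)     # nothing was recorded as a change
```

Only the first index operation goes through the wrapper; the second one acts directly on the nested dict from the base.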

I would really appreciate any suggestions on how to make my differential dictionary work with nested dictionaries. Thanks in advance!

1 answer

I don’t have time to try this right now (maybe a little later), but here are two observations:

combined indexes

If you used tuples as indexes, for example dict[(5,3,2)], you would not have this problem. If you base either your base dict or your differential dicts on this, you can work around the issue.

Perhaps you can even write some classes that rewrite dict[a][b][c] into dict[(a,b,c)] to make this internal change transparent.
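
A rough sketch of the tuple-index idea, under the assumption that only nested dicts need to be flattened. The helper `flatten` and the class `FlatDiff` are hypothetical names, not part of the question's code:

```python
def flatten(nested, prefix=()):
    """Turn {'a': {1: 2}} into {('a', 1): 2}; only non-empty dicts are descended into."""
    flat = {}
    for key, value in nested.items():
        path = prefix + (key,)
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


class FlatDiff:
    """Hypothetical wrapper: one shared flat base plus per-instance changes."""

    def __init__(self, flat_base):
        self._base = flat_base      # shared reference, never written to
        self.changes = {}           # only the overridden paths

    def __getitem__(self, path):
        if path in self.changes:
            return self.changes[path]
        return self._base[path]

    def __setitem__(self, path, value):
        self.changes[path] = value


base = flatten({'a': {1: 2, 3: 4}, 'b': {'c': [1, 2, 3], 'd': 'abc'}})
d1 = FlatDiff(base)
d2 = FlatDiff(base)
d1['a', 3] = 5      # d1['a', 3] is the same as d1[('a', 3)]
d2['a', 3] = 7
```

Because every write goes through a single `__setitem__` call, the change lands in the instance and the base stays untouched. Mutable leaves such as the list under ('b', 'c') are still shared, so modifying them in place would affect the base.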

global base

I do not understand why you are using a global base. From my point of view, this makes the code more complex without adding anything. Why don't you just save the base, as in:

    class MyDict:
        def __init__(self, base):
            self._base = base  # keep a reference to the base, not a copy

        def __getitem__(self, key):
            return self._base[key]

    my_global_base = dict()
    d = MyDict(my_global_base)
    my_global_base[2] = 'abc'
    # modifies self._base inside of the instance too, because it is the
    # same object, so d[2] now returns 'abc'

If you want to replace the entire content of the base, just remove all its elements in place (with popitem() in a loop, or simply clear()) and then add the new ones using update(). That way your code is more flexible and has no surprising behavior caused by global variables.
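
To illustrate why the base should be modified in place rather than replaced, here is a small sketch with made-up names; `held_by_instance` stands in for the self._base reference stored in an instance:

```python
shared_base = {'a': 1, 'b': 2}
held_by_instance = shared_base        # stands in for self._base (hypothetical name)

# In-place replacement: every holder of the reference sees the new content.
shared_base.clear()                   # same end state as calling popitem() until empty
shared_base.update({'a': 10, 'c': 30})
seen_after_update = dict(held_by_instance)

# Rebinding the name creates a new object; the holder keeps the old one.
shared_base = {'z': 0}
seen_after_rebind = dict(held_by_instance)
```

After the rebinding, the two names refer to different objects, which is exactly the situation the in-place update avoids.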

abstract base classes

When reimplementing container types such as sequences or dictionaries, it can be useful to build on the abstract base classes that Python provides in collections.abc; they do some of the implementation work for you.
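
As an untested-in-production sketch of how this could look, the class below derives from collections.abc.MutableMapping, keeps a reference to its base, and lazily wraps nested dicts so that writes below the top level are recorded as changes too. The names `DiffDict`, `_DELETED` and `to_dict` are my own assumptions, not something from the question:

```python
from collections.abc import MutableMapping

_DELETED = object()   # hypothetical sentinel: key removed relative to the base


class DiffDict(MutableMapping):
    """Sketch of a dict-like view: a shared base plus per-instance changes."""

    def __init__(self, base):
        self._base = base        # shared reference, never written to
        self._changes = {}       # overrides, nested wrappers and deletion marks

    def __getitem__(self, key):
        if key in self._changes:
            value = self._changes[key]
            if value is _DELETED:
                raise KeyError(key)
            return value
        value = self._base[key]
        if isinstance(value, dict):
            # Wrap nested dicts on first access so deeper writes stay local.
            value = self._changes[key] = DiffDict(value)
        return value

    def __setitem__(self, key, value):
        self._changes[key] = value

    def __delitem__(self, key):
        if key not in self:
            raise KeyError(key)
        self._changes[key] = _DELETED

    def __iter__(self):
        for key in self._base:
            if self._changes.get(key) is not _DELETED:
                yield key
        for key, value in list(self._changes.items()):
            if key not in self._base and value is not _DELETED:
                yield key

    def __len__(self):
        return sum(1 for _ in self)

    def to_dict(self):
        """Materialize a plain nested dict (leaf values are not copied)."""
        return {k: v.to_dict() if isinstance(v, DiffDict) else v
                for k, v in self.items()}


base = {'a': {1: 2, 3: 4}, 'b': {'c': [1, 2, 3], 'd': 'abc'}}
d1 = DiffDict(base)
d2 = DiffDict(base)
d1['a'][3] = 5
d2['a'][3] = 7
```

MutableMapping supplies get(), items(), update(), pop(), `in` and == on top of the five methods defined here. Two caveats of this sketch: reading a nested dict already stores a small wrapper in the changes, and copy.deepcopy would copy the base as well unless __deepcopy__ is overridden as in the question.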


Source: https://habr.com/ru/post/1244208/

