The data is a Python dict representing the state of something that changes slowly over time. Values change frequently, but usually only one or two items at a time. Keys can also change, but this is rare. After each change, a new snapshot of the dict is recorded for later analysis.
The result is a long sequence with increasing timestamps. In this very simple example, 'b' turns on, then off, then on again:
(timestamp1, {'a':False, 'b':False, 'c':False}),
(timestamp2, {'a':False, 'b':True, 'c':False}),
(timestamp3, {'a':False, 'b':False, 'c':False}),
(timestamp4, {'a':False, 'b':True, 'c':False}),
This sequence is very convenient to work with but obviously quite inefficient: almost the same data is copied over and over. A real dict has about 100 items. This is why I am looking for another way to store the data history, both in memory and on disk.
I am sure this problem has been solved many times before. Is there a standard or recommended way to handle it? The solution does not have to be perfect, just good enough.
This is what I would do if nobody shows me a better approach. Saving only the incremental changes is space-efficient:
(timestamp1, FULL, {'a':False, 'b':False, 'c':False}),
(timestamp2, INCREMENTAL, {'b':True}),
(timestamp3, INCREMENTAL, {'b':False}),
(timestamp4, INCREMENTAL, {'b':True}),
However, accessing the data is not as easy, because a state has to be rebuilt by replaying every INCREMENTAL entry since the last FULL one. To limit this drawback, every Nth entry would be saved as FULL and all the others as INCREMENTAL.
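To make the idea concrete, here is a minimal sketch of how the FULL / INCREMENTAL scheme could be encoded and replayed. The names `encode`, `restore`, `full_every` and the `DELETED` marker are my own assumptions for illustration; the marker is one possible way to handle the rare case of a key disappearing, and an on-disk format would need a serializable equivalent.

```python
FULL, INCREMENTAL = "FULL", "INCREMENTAL"
DELETED = object()  # hypothetical in-memory marker for a key that was removed


def encode(snapshots, full_every=100):
    """Turn (timestamp, dict) snapshots into FULL / INCREMENTAL records."""
    records = []
    prev = None
    for i, (ts, state) in enumerate(snapshots):
        if prev is None or i % full_every == 0:
            # Every full_every-th entry is stored whole to bound replay cost.
            records.append((ts, FULL, dict(state)))
        else:
            # Keep only the keys that are new or whose value changed.
            delta = {k: v for k, v in state.items()
                     if k not in prev or prev[k] != v}
            # Record keys that disappeared since the previous snapshot.
            delta.update({k: DELETED for k in prev if k not in state})
            records.append((ts, INCREMENTAL, delta))
        prev = dict(state)
    return records


def restore(records, index):
    """Rebuild the full dict at records[index] by replaying from the last FULL."""
    start = index
    while records[start][1] != FULL:
        start -= 1
    state = dict(records[start][2])
    for _, _, delta in records[start + 1:index + 1]:
        for k, v in delta.items():
            if v is DELETED:
                state.pop(k, None)
            else:
                state[k] = v
    return state
```

With this layout, restoring any entry replays fewer than `full_every` deltas, so N trades storage size against access time.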
I would also add one small improvement: when a state has already been recorded, store a link to that earlier entry instead of duplicating it:
(timestamp1, FULL, {'a':False, 'b':False, 'c':False}),
(timestamp2, INCREMENTAL, {'b':True}),
(timestamp3, SAME_AS, timestamp1),
(timestamp4, SAME_AS, timestamp2),
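The SAME_AS idea could be sketched roughly as follows. This is a simplified illustration, not a complete design: the names `dedupe`, `resolve` and `state_key` are hypothetical, every new state is stored as FULL for brevity (rather than INCREMENTAL as in the listing above), and the values are assumed to be hashable so the dict can be fingerprinted with a `frozenset`.

```python
FULL, SAME_AS = "FULL", "SAME_AS"


def state_key(state):
    """Hashable fingerprint of a dict (assumes all values are hashable)."""
    return frozenset(state.items())


def dedupe(snapshots):
    """Replace repeated states with a SAME_AS link to their first timestamp."""
    first_seen = {}  # fingerprint -> timestamp of first occurrence
    records = []
    for ts, state in snapshots:
        key = state_key(state)
        if key in first_seen:
            records.append((ts, SAME_AS, first_seen[key]))
        else:
            first_seen[key] = ts
            records.append((ts, FULL, dict(state)))
    return records


def resolve(records, ts):
    """Return the full dict recorded at timestamp ts, following SAME_AS links."""
    by_ts = {t: (kind, payload) for t, kind, payload in records}
    kind, payload = by_ts[ts]
    if kind == SAME_AS:
        # A link always points at the first occurrence, which is stored in full.
        return by_ts[payload][1]
    return payload
```

The `first_seen` table grows with the number of distinct states, so this only pays off when the same states recur often, as in the on/off example.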