Subclassing list

I want to create a DataSet class, which is basically a list of samples. But I need to override each insert operation in the DataSet.

Is there an easy way to do this without writing my own append, extend, iadd, etc.?

UPDATE: I want to add a back pointer to each sample, holding the sample's index in the DataSet. This is necessary for the processing algorithm that I use. I have a solution, but it seems clumsy: a renumber() function that ensures the back pointers are valid.

1 answer

I don't know of any way to do what you ask, that is, to override the mutators without overriding them. With a class decorator, however, you can "automate" the overriding versions (provided each of them can be obtained by wrapping the corresponding method of the base class), so it's not too bad...

Suppose, for example, that you want to add a "dirty" flag that is true if the data may have been changed since the last call to .save (a method that saves the data and sets self.dirty to False).

Then...

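    def wrapMethod(cls, n):
        f = getattr(cls, n)
        def wrap(self, *a, **k):
            # mark the container as changed, then delegate to the list method;
            # **k is passed through so that e.g. sort(key=...) keeps working
            self.dirty = True
            return f(self, *a, **k)
        return wrap

    def wrapListMutators(cls):
        # replace each mutating method of the class with its wrapped version
        for n in '''__setitem__ __delitem__ __iadd__ __imul__ append extend
                    insert pop remove reverse sort'''.split():
            f = wrapMethod(cls, n)
            setattr(cls, n, f)
        return cls

    @wrapListMutators
    class DataSet(list):
        dirty = False
        def save(self):
            self.dirty = False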

This syntax requires Python 2.6 or later. In earlier versions of Python (those that support decorators only on def statements, not on class statements, or very old ones that do not support decorators at all), you just need to change the very last part (the class statement) to:

    class DataSet(list):
        dirty = False
        def save(self):
            self.dirty = False
    DataSet = wrapListMutators(DataSet)

IOW, the neat decorator syntax is just a small amount of syntactic sugar on top of a regular function call that takes the class as its argument and rebinds the class name to the result.

Edit: now that you have edited your question to clarify your exact requirements (maintain a bp field on each item so that, for all i, theset[i].bp == i), it is easier to weigh the pros and cons of the different approaches.

You can adapt the approach that I sketched, but instead of assigning self.dirty before calling the wrapped method, you should call self.renumber() after it, that is:

    def wrapMethod(cls, n):
        f = getattr(cls, n)
        def wrap(self, *a, **k):
            temp = f(self, *a, **k)   # perform the mutation first
            self.renumber()           # then restore theset[i].bp == i
            return temp
        return wrap

This meets your stated requirements, but in many cases it will do a lot more work than necessary: for example, when you append an item, it will needlessly renumber all the existing ones (to the same values they already had). But how could any fully automated approach "know" which items, if any, need their .bp recomputed without O(N) effort? At the very least it has to look at every one of them (since you do not want to code, say, append vs insert etc. separately), and that is already O(N).

So this will be acceptable only if you can afford to pay O(N) for every single change to the list (basically, only if the list always stays small and/or does not change often).

A more fruitful idea might be not to maintain the .bp values all the time, but only "just in time", when they are needed. Make bp a (read-only) property that calls a method which checks whether the container is dirty (where the container's dirty flag is maintained by the automated code I already gave), and only then renumbers (and sets the dirty attribute back to False).

This will work well when the list typically undergoes a burst of changes, then you need to access the items' bp values for a while, then comes another burst of changes, and so on. Such bursty alternation between modifying and reading is not uncommon in real-world containers, but only you can know whether it applies to your particular case!
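A rough sketch of this just-in-time variant follows. It assumes each sample is given a reference to its container when it is created (the `owner` argument); that detail, the `Sample` class, and the names `marks_dirty`, `refresh` and `renumber_count` are illustrative assumptions, not something the question specifies.

```python
MUTATORS = ('__setitem__ __delitem__ __iadd__ __imul__ '
            'append extend insert pop remove reverse sort').split()

def marks_dirty(cls):
    """Class decorator (hypothetical name): set self.dirty on every mutator."""
    def wrap_method(name):
        original = getattr(cls, name)
        def wrapper(self, *args, **kwargs):
            self.dirty = True
            return original(self, *args, **kwargs)
        return wrapper
    for name in MUTATORS:
        setattr(cls, name, wrap_method(name))
    return cls

class Sample:
    def __init__(self, value, owner):
        self.value = value
        self.owner = owner           # assumption: the sample knows its DataSet
        self._bp = None

    @property
    def bp(self):
        # Read-only: renumber lazily, only if the container changed.
        self.owner.refresh()
        return self._bp

@marks_dirty
class DataSet(list):
    dirty = False
    renumber_count = 0               # instrumentation for this demo only

    def refresh(self):
        if self.dirty:
            for index, sample in enumerate(self):
                sample._bp = index
            self.dirty = False
            self.renumber_count += 1

ds = DataSet()
samples = [Sample(v, ds) for v in 'abcd']
for s in samples:
    ds.append(s)                     # burst of changes: no renumbering yet
ds.reverse()
positions = [s.bp for s in samples]  # first read triggers one renumbering
```

Five mutations followed by four reads cost a single O(N) pass here, instead of one pass per mutation.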

To get better performance than that, I think you need to do some hand-coding on top of this general approach, in order to take advantage of frequent special cases. For example, append may be called very often, and the amount of work needed in the special case of append is really small, so it may well be worth your while to write those two or three lines of code (not setting the dirty bit in this case).
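The append special case might look like the sketch below. To keep it short, `bp` is a plain attribute and `renumber` is called explicitly; `Sample`, `marks_dirty` and `GENERIC_MUTATORS` are hypothetical names. Note that append is left out of the automatically wrapped methods so the hand-written version is not overwritten by the decorator.

```python
# Every list mutator except append, which is hand-coded below.
GENERIC_MUTATORS = ('__setitem__ __delitem__ __iadd__ __imul__ '
                    'extend insert pop remove reverse sort').split()

def marks_dirty(cls):
    def wrap_method(name):
        original = getattr(cls, name)
        def wrapper(self, *args, **kwargs):
            self.dirty = True
            return original(self, *args, **kwargs)
        return wrapper
    for name in GENERIC_MUTATORS:
        setattr(cls, name, wrap_method(name))
    return cls

class Sample:
    def __init__(self, value):
        self.value = value
        self.bp = None

@marks_dirty
class DataSet(list):
    dirty = False

    def append(self, sample):
        # Appending never moves existing items, so only the new item's
        # index needs setting; the dirty flag is deliberately left alone.
        sample.bp = len(self)
        list.append(self, sample)

    def renumber(self):
        for index, sample in enumerate(self):
            sample.bp = index
        self.dirty = False

ds = DataSet()
a, b, c = Sample('a'), Sample('b'), Sample('c')
ds.append(a)
ds.append(b)
clean_after_appends = not ds.dirty   # appends did not dirty the container
ds.insert(0, c)                      # a generic mutator does
ds.renumber()
```

If the container is already dirty when append runs, the index written there may be wrong, but it is harmless: the next renumbering overwrites it.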

One caveat: no approach will work (indeed, your requirement becomes self-contradictory) if any item is present twice in the list, which is of course entirely possible unless you take precautions to avoid it. You can easily diagnose it in renumber by keeping a set of the items already seen and raising an exception on any duplicate, if that is not too late for you; it is harder to diagnose on the fly, that is, at the time of the mutation that causes the duplication, if that is what you need. Perhaps you can relax your requirement so that, if an item is present twice, that is fine and bp may point to either of the indices; or make bp a set of the indices at which the item is present (which would also give a smooth way to handle asking for bp on an item that is not in the list). And so on: I recommend that you consider (and document!) all of these corner cases in depth. Correctness before performance!
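As an illustration of the duplicate check in renumber, here is a small sketch written as a free function. It tracks identities with `id()` rather than putting the items themselves in the set, which is an assumption made so that samples need not be hashable; `Sample` and `DuplicateSampleError` are hypothetical names.

```python
class Sample:
    def __init__(self, value):
        self.value = value
        self.bp = None

class DuplicateSampleError(ValueError):
    """Raised when the same object appears twice in the container."""

def renumber(samples):
    seen = set()                     # ids of the objects met so far
    for index, sample in enumerate(samples):
        if id(sample) in seen:
            # Items before this index have already been renumbered.
            raise DuplicateSampleError(
                'item at index %d already appears earlier' % index)
        seen.add(id(sample))
        sample.bp = index

a, b = Sample('a'), Sample('b')
renumber([b, a])                     # fine: b.bp == 0, a.bp == 1

try:
    renumber([a, b, a])              # the same object twice
    duplicate_detected = False
except DuplicateSampleError:
    duplicate_detected = True
```

This only reports the problem at renumbering time; catching it at the moment of the offending mutation would need a check inside each mutator.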


Source: https://habr.com/ru/post/1285968/

