Python program using too much memory

I got these results from Heapy, but it's not clear what they mean.

 Index  Count   %     Size   % Cumulative   %  Kind (class / dict of class)
     0 262539  59 36961284  48   36961284  48  dict (no owner)
     1  65536  15 34340864  45   71302148  93  dict of myobj.Container
     2  65536  15  2097152   3   73399300  96  myobj.Container

myobj.Container is a class with 20 True/False values and 20 numbers (each of which fits in 2 bytes).

I have an array of 256 * 256 of them. I really don't understand why they consume 35 or 70 MB of memory. I would like to bring it below 10 MB, if possible.

Most of the data inside the object is organized into dictionaries for easy access. The dictionaries themselves never change and are not strictly necessary. Do they impose a large overhead?

Would it help to pack all the data into a single number using bitwise operators? I should be able to store all of an object's data in 32 or 64 bytes. I was hoping the compiler would do this kind of thing automatically, as in other languages, but it seems to do the opposite.

The class inherits from the built-in type object for no reason other than to be able to use decorators. Does that cause a lot of overhead?

I'm also curious what dict (no owner) means, and why it accounts for the other half of the memory.

Edit: sys.getsizeof(myobj.Container) really reports 450 bytes! This is madness. I used dictionaries only because I need index-based access to the data. As far as I know, the compiler should get rid of the structures and access the values directly. Is there a better way to do this? (I don't think lists are the answer.)

1 answer

Python does not eliminate the overhead of such structures, I'm sorry to say. Its dynamic nature makes compiler optimizations of that kind difficult. Then again, I don't know of any language that would eliminate the overhead of keeping things in dictionaries.

dict (no owner) probably covers all the dictionaries that you create inside your objects. They are reported as having no owner because they are not the attribute dictionaries of object instances.

What you can do:

Use __slots__. If you add __slots__ = ('the','names','of','fields') as an attribute of the class, Python will use a more compact layout for its instances and get rid of the per-instance dictionary used to store attributes.
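As a rough illustration of the __slots__ suggestion, here is a minimal sketch. The class and field names (DictContainer, SlotContainer, flag, value) are made up for the example and are not taken from the question's code.

```python
class DictContainer(object):
    """Ordinary class: every instance carries its own attribute dict."""
    def __init__(self):
        self.flag = False
        self.value = 0


class SlotContainer(object):
    """Same fields, but declared in __slots__, so no per-instance dict."""
    __slots__ = ("flag", "value")  # hypothetical field names

    def __init__(self):
        self.flag = False
        self.value = 0


plain = DictContainer()
slotted = SlotContainer()

has_dict_plain = hasattr(plain, "__dict__")      # instance has an attribute dict
has_dict_slotted = hasattr(slotted, "__dict__")  # slotted instance does not

# The trade-off: attributes not listed in __slots__ cannot be added later.
try:
    slotted.other = 1
    rejected = False
except AttributeError:
    rejected = True

print(has_dict_plain, has_dict_slotted, rejected)
```

With 65536 instances, dropping the per-instance dictionary should remove most of the "dict of myobj.Container" row from the Heapy report, though the dictionaries you create yourself inside each object are unaffected.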

If your dictionaries can be rewritten as lists, that will improve the situation. Lists are more memory-efficient than dictionaries.
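One way to keep name-based access while storing the data in lists is to share a single name-to-index mapping across all objects. This is only a sketch: the field names are placeholders, and the exact byte counts printed depend on the Python version and platform.

```python
import sys

# Hypothetical field names standing in for the real ones.
FIELD_NAMES = ["field_%d" % i for i in range(20)]

# Built once and shared by every object, so its cost is paid only once.
INDEX = {name: i for i, name in enumerate(FIELD_NAMES)}

as_dict = {name: 0 for name in FIELD_NAMES}  # what each object holds today
as_list = [0] * len(FIELD_NAMES)             # the list-based replacement

# Name-based access still works through the shared mapping.
as_list[INDEX["field_7"]] = 42

dict_size = sys.getsizeof(as_dict)
list_size = sys.getsizeof(as_list)
print(dict_size, list_size)
```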

For the best results, you should rework your system to use numpy arrays. Each attribute of your class becomes an array of size 256 * 256, and each element is then stored very compactly.
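A sketch of what the numpy layout could look like, under some assumptions: the field names are invented, the 20 numbers are assumed to be unsigned and to fit in np.uint16, and the 20 flags are stored as one byte each in boolean arrays.

```python
import numpy as np

SIDE = 256

# Hypothetical field names; the question does not list the real ones.
FLAG_NAMES = ["flag_%d" % i for i in range(20)]
VALUE_NAMES = ["value_%d" % i for i in range(20)]

# One 256 x 256 array per attribute instead of one object per cell.
flags = {name: np.zeros((SIDE, SIDE), dtype=np.bool_) for name in FLAG_NAMES}
values = {name: np.zeros((SIDE, SIDE), dtype=np.uint16) for name in VALUE_NAMES}

# Reading or writing the attribute of one cell is a plain array index.
values["value_3"][10, 20] = 513
flags["flag_0"][10, 20] = True

# Raw data size: 20 * 65536 * 1 byte + 20 * 65536 * 2 bytes.
total_bytes = (sum(a.nbytes for a in flags.values())
               + sum(a.nbytes for a in values.values()))
print(total_bytes)  # 3932160
```

That is roughly 3.75 MiB of array data for the whole grid, which fits the 10 MB target from the question. Packing the 20 flags into the bits of a single integer array would shrink it further, at the cost of bitwise operations on every access.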

Alternatively, you can check out PyPy. It is an alternative Python implementation with a JIT compiler, as well as various time/space optimizations that may help.

sys.getsizeof does not report what you think it does. sys.getsizeof(myobj.Container) reports the size of the class object itself, not the size of the actual Container instances. You want sys.getsizeof(myobj.Container()) or the like. Even that is not very accurate, because it measures only the bare instance and does not account for the dictionary that holds its attributes. It reports only what appears in the third row of your Heapy output.
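The difference between these measurements can be seen with a small stand-in class. Container here is a hypothetical substitute for myobj.Container, and shallow_footprint is a helper written for this example; it still ignores the objects that the attribute dictionary refers to.

```python
import sys


class Container(object):
    """Hypothetical stand-in for myobj.Container."""
    def __init__(self):
        self.flags = {"a": True, "b": False}
        self.value = 7


def shallow_footprint(obj):
    """Size of the instance plus its attribute dict, if it has one.

    Objects referenced from the dict (such as self.flags) are not counted.
    """
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


c = Container()
class_size = sys.getsizeof(Container)  # size of the class object itself
instance_size = sys.getsizeof(c)       # bare instance, without its dict
footprint = shallow_footprint(c)       # instance plus attribute dict

print(class_size, instance_size, footprint)
```

The printed numbers vary between Python versions, but the instance figure corresponds to the myobj.Container row in the Heapy output and the dictionary part to the "dict of myobj.Container" row.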


Source: https://habr.com/ru/post/1381061/

