I am looking for a more compact way to store boolean data. NumPy internally uses 8 bits to store one boolean, but np.packbits allows them to be packed into single bits, which is pretty cool.
The problem is that to pack a boolean array of 32e6 bytes into an array of 4e6 bytes, we must first spend 256e6 bytes converting the boolean array to an int array!
In [1]: db_bool = np.array(np.random.randint(2, size=(int(2e6), 16)), dtype=bool)
In [2]: db_int = np.asarray(db_bool, dtype=int)
In [3]: db_packed = np.packbits(db_int, axis=0)
In [4]: db_bool.nbytes, db_int.nbytes, db_packed.nbytes
Out[4]: (32000000, 256000000, 4000000)
There is a year-old issue about this in the NumPy issue tracker (see
https://github.com/numpy/numpy/issues/5377 )
Does anyone have a solution / best workaround?
The traceback when we try to pack the boolean array directly:
In [28]: db_pb = np.packbits(db_bool)
TypeError Traceback (most recent call last)
<ipython-input-28-3715e167166b> in <module>()
TypeError: Expected an input array of integer data type
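One possible workaround, sketched here rather than offered as a definitive answer: a NumPy boolean array already stores each element as one byte holding 0 or 1, so it can be reinterpreted as uint8 with .view(), which does not copy the data, and np.packbits accepts uint8 input. The array size and the variable names db_u8 and db_restored below are only illustrative. Recent NumPy releases also document boolean input as accepted by np.packbits, so on those versions the view may not be needed at all.

```python
import numpy as np

# Small illustrative array; the shape is an assumption, not the size from the question.
rng = np.random.default_rng(0)
db_bool = rng.integers(0, 2, size=(2000, 16)).astype(bool)

# Reinterpret the boolean buffer as uint8. This is a view, not a copy,
# so no 8-bytes-per-element integer array is ever allocated.
db_u8 = db_bool.view(np.uint8)

# Pack 8 rows into one byte along axis 0, as in the question.
db_packed = np.packbits(db_u8, axis=0)

# Round trip: unpack and convert back to bool to check nothing was lost.
db_restored = np.unpackbits(db_packed, axis=0)[:db_bool.shape[0]].astype(bool)

print(db_bool.nbytes, db_u8.nbytes, db_packed.nbytes)
```

With this approach the only extra allocation is the packed result itself, which is one eighth of the boolean array.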
PS: bitarray , .