For several aspects of the project, storing data in an HDF5 (h5) store would be ideal. However, the files become massive, and frankly, we are running out of space.
This statement...
store.put(storekey, data, table=False, compression='gzip')
makes no difference in file size compared with ...
store.put(storekey, data, table=False)
Is compression used even when going through Pandas?
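For reference, here is a minimal sketch of how the two file sizes could be compared. It assumes that compression is requested through the `complevel` and `complib` arguments of `pd.HDFStore` rather than through a `compression` keyword on `put`, and that PyTables is installed. The helper name `hdf_sizes`, the file names, and the sample data are hypothetical and only serve the illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def hdf_sizes(df, directory):
    """Write df twice (without and with compression); return both sizes in bytes."""
    plain_path = os.path.join(directory, "plain.h5")    # hypothetical file name
    packed_path = os.path.join(directory, "packed.h5")  # hypothetical file name

    # No compression requested.
    with pd.HDFStore(plain_path, mode="w") as store:
        store.put("data", df, format="fixed")

    # Compression requested on the store itself via complevel/complib.
    with pd.HDFStore(packed_path, mode="w", complevel=9, complib="zlib") as store:
        store.put("data", df, format="fixed")

    return os.path.getsize(plain_path), os.path.getsize(packed_path)


if __name__ == "__main__":
    # Hypothetical, highly repetitive numeric data so any effect is easy to see.
    df = pd.DataFrame({"a": np.zeros(200_000), "b": np.arange(200_000) % 10})
    with tempfile.TemporaryDirectory() as tmp:
        plain, packed = hdf_sizes(df, tmp)
        print("uncompressed bytes:", plain)
        print("compressed bytes:  ", packed)
```

The sketch uses numeric columns only; how much a DataFrame with string columns shrinks may depend on the storage format and the pandas version, so the numbers here should not be read as representative of the real data.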
If this is not possible through Pandas, I am not opposed to using h5py; however, I am not sure what to put for the "data type", since the DataFrame contains all kinds of types (strings, floats, ints, etc.).
Any help / understanding would be appreciated!