Removing a key / table in HDF storage using Python

Is there a pyTables method similar to the following:

    with pd.get_store(my_store) as store:
        keys = store.keys()
        rem_key = min(sorted(keys))
        store.remove(rem_key)

Essentially, I want to get the list of HDF5 store keys, find the one that is no longer needed (here min(), if the keys were dates, for example), and delete that key from the store while keeping the rest.

Pandas doesn't seem to have anything for this, and I looked through the PyTables methods to no avail, having read that PyTables underlies the HDF functionality in pandas.

Thanks!

1 answer

Pandas does exactly what you want. The remove method of HDFStore is defined in pandas/io/pytables.py (v0.19.1 at the time of this answer); it removes the node for a key, or only the rows within that node that match a condition.
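As a minimal sketch of the approach from the question, assuming the keys sort lexicographically in date order (for example ISO-formatted dates): the helper names `remove_oldest_key` and `prune_store`, and the `path` argument, are hypothetical and only used for illustration. `pd.HDFStore` is used here as the context manager; it needs the PyTables package installed.

```python
def remove_oldest_key(store):
    """Delete the smallest key from an open pandas.HDFStore and return it.

    Returns None when the store has no keys.
    """
    keys = store.keys()      # e.g. ['/2016-01-01', '/2016-01-02']
    if not keys:
        return None
    oldest = min(keys)       # lexicographic minimum; the oldest date for ISO-style keys
    store.remove(oldest)     # drops the whole node stored under that key
    return oldest


def prune_store(path):
    """Open the HDF5 file at `path` (hypothetical) and drop its oldest key."""
    import pandas as pd      # imported here because it needs PyTables ("tables") at runtime

    with pd.HDFStore(path) as store:
        return remove_oldest_key(store)
```

Calling `prune_store("my_store.h5")` would remove one node and leave the others in place. Removing only some rows instead of the whole node goes through the same method with a `where` condition, which as far as I know only works for data stored in table format.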

HDF5 does not shrink the file after a deletion (see SO answer), so it is recommended to repack and recompress your store from time to time. You can do this from the command line with ptrepack (command taken from SO answer):

 ptrepack --chunkshape=auto --propindexes --complib=blosc test.h5 out.h5 
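If the repacking should happen from Python rather than a shell, one possible sketch is to call the same command through `subprocess`. The function names `build_ptrepack_cmd` and `repack_store` are hypothetical, the flags are exactly the ones shown above, and `ptrepack` is assumed to be on the PATH (it ships with PyTables).

```python
import os
import subprocess


def build_ptrepack_cmd(src, dst):
    """Return the ptrepack command from the answer as an argument list."""
    return [
        "ptrepack",
        "--chunkshape=auto",
        "--propindexes",
        "--complib=blosc",
        src,
        dst,
    ]


def repack_store(src, tmp_dst):
    """Repack `src` into `tmp_dst`, then replace the original file.

    The store must be closed before calling this.
    """
    # check=True raises CalledProcessError if ptrepack exits with an error,
    # so the original file is only replaced after a successful repack.
    subprocess.run(build_ptrepack_cmd(src, tmp_dst), check=True)
    os.replace(tmp_dst, src)
```

Writing to a separate file first and swapping it in afterwards keeps the original store intact if the repack fails.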

Source: https://habr.com/ru/post/1235063/