I am constantly writing simulation output to an HDFStore, which has grown quite large (~15 GB). In addition, I get the following performance warning:
/home/extern/fsalah/.local/lib/python2.7/site-packages/tables/group.py:501: PerformanceWarning: group ``/`` is exceeding the recommended maximum number of children (16384); be ready to see PyTables asking for *lots* of memory and possibly slow I/O.
What I'm experiencing is that creating a new child with a small data set (100 rows, 4 columns) takes about 30 seconds. However, this only happens the first time after opening the HDFStore, and only if the child does not already exist. After the first new child has been added, adding further children is fast (<0.1 s). I can reliably reproduce this behavior by closing and reopening the HDFStore. I am executing the following code snippet:
import numpy as np
import pandas as pd

databaseName = "store.hdf5"
store = pd.HDFStore(databaseName, complib='zlib', complevel=9)
timeslots = np.arange(0, 100)
# x, y and z are produced by the simulation
df = pd.DataFrame({'Timeslot': timeslots,
                   'a': [x[t] for t in timeslots],
                   'b': [y[t] for t in timeslots],
                   'c': np.repeat(z, len(timeslots))})
tableName = "runX"
# data_columns requires the 'table' format; the default 'fixed' format rejects it
store.put(tableName, df, format='table', data_columns=['Timeslot', 'a', 'b', 'c'])
store.close()
store.close()
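Since the warning is specifically about the number of children under the root group `/`, one idea (not from my setup above, just a sketch) would be to nest runs under intermediate groups so that no single group has a huge fan-out. The key layout `runs/batchN/runM` and the helper `run_key` below are hypothetical names chosen for illustration:

```python
import numpy as np
import pandas as pd


def run_key(run_id, runs_per_group=1000):
    """Map a flat run id to a nested HDFStore key so that no group
    ends up with more than `runs_per_group` children."""
    return "runs/batch%d/run%d" % (run_id // runs_per_group, run_id)


with pd.HDFStore("store_nested.hdf5", complib="zlib", complevel=9) as store:
    for run_id in (0, 1, 2500):
        df = pd.DataFrame({"Timeslot": np.arange(100),
                           "value": np.random.rand(100)})
        # e.g. run 2500 lands under /runs/batch2/, not directly under /
        store.put(run_key(run_id), df, format="table")
    print(sorted(store.keys()))
```

Whether this actually avoids the 30-second first write, I don't know; it only addresses what the PerformanceWarning complains about (too many children of one group).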
Now my questions are:
- Is the long duration of the first write after reopening the store related to the large number of children under the root group, as the warning suggests?
- Is there a way to avoid the warning and/or the 30-second delay? Deleting the store and rebuilding it from scratch is not a practical workaround for me, given its size.