Hdf5 for pandas dataframe

Question

I downloaded a dataset that is stored in .h5 files. I need to keep only certain columns and be able to manipulate the data in it.

To do this, I tried loading it into a pandas DataFrame. I tried:

pd.read_hdf(path)

But I get: No dataset in HDF5 file.

I found answers on SO ( read HDF5 file on pandas DataFrame with conditions ), but I do not need conditions, and that answer relies on how the file was written. I am not the creator of the file, so I cannot do anything about that.

I also tried using h5py:

df = h5py.File(path)

But this is not easy to manipulate, and I cannot get the columns from it (only the column names, with df.keys()). Any idea how to do this?

+8

python pandas hdf5

Graham slick Nov 07 '16 at 19:19

3

drj · Answer 1 · 2017-01-11T18:33:35+0000

Pandas uses its own HDF layout and cannot read an arbitrary HDF file. See fooobar.com/questions/1660117/...

MaxU · Answer 2 · 2016-11-07T19:44:20+0000

Open the file as an HDF store, check what it contains, and then read the object you need from the HDF store:

In [4]: fn = r'D:\temp\.data\test.h5'

In [5]: store = pd.HDFStore(fn)

In [6]: print(store)
<class 'pandas.io.pytables.HDFStore'>
File path: D:\temp\.data\test.h5
/test            frame_table  (typ->appendable,nrows->7,ncols->4,indexers->[index],dc->[Col1,Col2,Col3,Col4])

In [7]: df = store.select('test')

In [8]: df
Out[8]:
        Col1      Col2  Col3  Col4
0       what       the     0     0
1        are    curves     1     8
2        men        of     2    16
3         to      your     3    24
4      rocks      lips     4    32
5        and   rewrite     5    40
6  mountains  history.     6    48
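
Since the question asks for only certain columns, a store like this can also be read selectively. This is a minimal sketch, assuming the file was written by pandas in table format (as the frame_table entry above indicates) and that PyTables is installed. The temporary file, the key 'test' and the sample values are made up for illustration:

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the real file.
df = pd.DataFrame({
    "Col1": ["what", "are", "men"],
    "Col2": ["the", "curves", "of"],
    "Col3": [0, 1, 2],
    "Col4": [0, 8, 16],
})

path = os.path.join(tempfile.mkdtemp(), "test.h5")

# format="table" makes the frame queryable; a "fixed" store
# cannot be read column by column.
df.to_hdf(path, key="test", format="table", data_columns=True)

with pd.HDFStore(path, mode="r") as store:
    keys = store.keys()  # list the objects stored in the file
    # Read only two of the four columns.
    subset = store.select("test", columns=["Col1", "Col3"])

print(keys)
print(subset)
```

This only helps when the file really is a pandas table store. For a file written by another tool, pd.HDFStore will probably not find anything to select.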

Ivan Mitevski · Answer 3 · 2019-10-03T02:12:22+0000

One way to get the data into pandas is to read it with h5py, convert it to an np.array, and then build a DataFrame:

df = pd.DataFrame(np.array(h5py.File(path)['variable_1']))
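
If df.keys() returns the column names, each column is probably stored as its own dataset, and the one-liner above can be extended to pick just the columns you need. This is a rough sketch under that assumption. The file layout, the dataset names variable_1 to variable_3 and the sample values are hypothetical:

```python
import os
import tempfile

import h5py
import numpy as np
import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "data.h5")

# Build a small stand-in file: one 1-D dataset per column (assumed layout).
with h5py.File(path, "w") as f:
    f.create_dataset("variable_1", data=np.arange(5))
    f.create_dataset("variable_2", data=np.linspace(0.0, 1.0, 5))
    f.create_dataset("variable_3", data=np.ones(5))

wanted = ["variable_1", "variable_2"]

with h5py.File(path, "r") as f:
    print(list(f.keys()))  # names of the top-level datasets and groups
    # dataset[:] copies the data into a NumPy array before the file closes.
    df = pd.DataFrame({name: f[name][:] for name in wanted})

print(df)
```

Keys that are groups rather than datasets, or datasets with more than one dimension, would need separate handling before they fit into a DataFrame.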
