How can I convert an <HDF5 object reference> into something more readable in Python?

I have a fairly large dataset, and all of it is stored in an HDF5 file. I found the h5py library for Python, and everything works correctly except that some values are displayed as:

[<HDF5 object reference>] 

I do not know how to turn this into something more readable. Is it possible at all? I find the documentation on this a bit complicated. Solutions in languages other than Python are welcome too. I appreciate any help.

Ideally, the result should be a path to a file.

This is part of my code:

    import numpy as np
    import h5py
    import time

    f = h5py.File('myfile1.mat', 'r')
    # print(f.keys())
    test = f['db/path']
    st = test[3]
    print(st)

Printing st gives: [<HDF5 object reference>]

Printing test gives: <HDF5 dataset "path": shape (73583, 1), type "|O8">

Instead of [<HDF5 object reference>], I expect something like /home/directory/file1.jpg, if that is possible.

3 answers

A friend answered my question, and I realized how easy it is, although I had spent more than 4 hours on this little problem. Solution:

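    import numpy as np
    import h5py
    import time

    f = h5py.File('myfile1.mat', 'r')
    test = f['db/path']
    st = test[0][0]   # the object reference stored in the first row
    obj = f[st]       # dereference it to get the dataset it points to
    # The dataset holds character codes; flatten it (it may have shape (N, 1))
    # and join the characters into a string.
    str1 = ''.join(chr(i) for i in obj[:].flatten())
    print(str1)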

Sorry if I did not explain my problem clearly. I did try hard to find this solution myself.
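
To decode every row rather than only the first one, the same idea can be wrapped in a loop. The following is a minimal sketch and not part of the original answer: the helper names codes_to_text and resolve_paths are hypothetical, and it assumes that each row of the dataset holds exactly one reference to a dataset of character codes, as in the file above.

```python
def codes_to_text(codes):
    """Join character codes into a string.

    Accepts a flat or nested sequence, for example an (N, 1) array of
    character codes, and turns every code into one character.
    """
    chars = []
    for item in codes:
        try:
            iter(item)
        except TypeError:
            # A single integer code: convert it to its character.
            chars.append(chr(int(item)))
        else:
            # A nested row (e.g. one row of an (N, 1) array): recurse.
            chars.append(codes_to_text(item))
    return "".join(chars)


def resolve_paths(h5file, dataset):
    """Dereference every reference in `dataset` and decode it to a string."""
    paths = []
    for row in dataset:
        ref = row[0]              # assumption: one reference per row
        target = h5file[ref]      # indexing the file with a reference resolves it
        paths.append(codes_to_text(target[:]))
    return paths
```

With the file from the question, this might be called as resolve_paths(f, f['db/path']). The helpers rely only on indexing and iteration, so they do not import h5py themselves.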


You can define your own __str__() or __repr__() for this class, or create a simple wrapper that formats a string with the information you want to see. Based on a quick look at the documentation, you can do something like this:

    from h5py import File

    class MyHDF5File(File):
        def __repr__(self):
            return '<HDF5File({0})>'.format(self.filename)
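
Overriding __repr__ on File changes how the file object itself prints, not how the references stored inside it print. The "simple wrapper" idea could look like the sketch below, which shows the name of the object that a reference points to. The class name NamedRef is hypothetical; the sketch assumes that the file object can be indexed with a reference and that the result has a .name attribute, as h5py groups and datasets do.

```python
class NamedRef:
    """Hypothetical wrapper that prints the target of an object reference."""

    def __init__(self, h5file, ref):
        self._h5file = h5file
        self._ref = ref

    def __repr__(self):
        # Dereference on demand and show the target's path inside the file.
        target = self._h5file[self._ref]
        return "<reference to {0}>".format(target.name)
```

For example, print([NamedRef(f, r[0]) for r in test[:3]]) would list the targets of the first three references. The name is an internal HDF5 path, not the decoded file path; getting the latter still requires reading the referenced data.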

Solution

Derive a class from the HDF5 class and override its __repr__ method.

Description

When you print an object, Python calls its __str__ method, which by default falls back to __repr__. The default __repr__ returns the class name and the memory address of the instance.

    class Person:
        def __init__(self, name):
            self.name = name

    p = Person("John Doe")
    print(p)
    # prints: <__main__.Person object at 0x00000000022CE940>

In your case, you have a list containing a single HDF5 object reference. The equivalent would be:

    print([p])
    # prints: [<__main__.Person object at 0x000000000236E940>]

You can change how objects are displayed by overriding the __repr__ method of their class.

Note: you can also override __str__; see "The difference between str and repr in Python" for more details.

    class MyReadablePerson(Person):
        def __init__(self, name):
            super(MyReadablePerson, self).__init__(name)

        def __repr__(self):
            return "A person whose name is: {0}".format(self.name)

    p1 = MyReadablePerson("John Doe")
    print(p1)
    # prints: A person whose name is: John Doe
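
To illustrate the note about __str__: print uses __str__ for the object itself, while a list uses __repr__ for each of its items. A small sketch with a made-up Point class (not related to h5py) shows both:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous form, used inside containers and in the interactive shell.
        return "Point(x={0!r}, y={1!r})".format(self.x, self.y)

    def __str__(self):
        # Friendly form, used by print() and str() on the object itself.
        return "({0}, {1})".format(self.x, self.y)


p = Point(1, 2)
as_text = str(p)       # same text that print(p) would show
as_list = str([p])     # same text that print([p]) would show
```

Here as_text is "(1, 2)" and as_list is "[Point(x=1, y=2)]", so a class that defines only __str__ would still show the default representation when printed inside a list.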

Source: https://habr.com/ru/post/982554/

