How to convert this <hdf5 object reference> data type to something more readable in python?
I have a fairly large dataset, and all of it is stored in an HDF5 file. I found the h5py library for Python, and everything works correctly except that some values come back as [<HDF5 object reference>], which I do not know how to turn into something more readable. Is this possible at all? The documentation on this is a bit complicated for me. Solutions in languages other than Python are welcome too. I appreciate any help.
Ideally, the result should be a path to a file.
This is part of my code:
    import numpy as np
    import h5py
    import time

    f = h5py.File('myfile1.mat', 'r')
    #print f.keys()
    test = f['db/path']
    st = test[3]
    print(st)

Output of st:

    [<HDF5 object reference>]

Output of test:

    <HDF5 dataset "path": shape (73583, 1), type "|O8">
Instead of [<HDF5 object reference>], I expect something like /home/directory/file1.jpg, if that is possible.
A friend answered my question, and I realized how easy it is, although I had spent more than 4 hours on this little problem. Solution:
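    import h5py

    f = h5py.File('myfile1.mat', 'r')
    test = f['db/path']
    st = test[0][0]   # the object reference stored in the first row
    obj = f[st]       # indexing the file with a reference returns the object it points to
    # The referenced dataset holds character codes; join them into a string.
    # flatten() covers the case where the codes are stored as a 2-D array.
    str1 = ''.join(chr(int(i)) for i in obj[:].flatten())
    print(str1)

Sorry if I did not describe my problem clearly. I did try to find this solution myself.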
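The same steps can be applied to every row of the dataset rather than only the first one. The following is a minimal sketch, assuming (as in the question) that each row holds exactly one object reference and that each referenced dataset stores character codes. The names `decode_chars` and `read_referenced_strings` are hypothetical helpers, not part of h5py.

```python
def _flatten_codes(values):
    """Yield plain ints from an arbitrarily nested sequence of character codes."""
    for value in values:
        try:
            inner = iter(value)
        except TypeError:
            # Not iterable, so treat it as a single character code.
            yield int(value)
        else:
            for code in _flatten_codes(inner):
                yield code


def decode_chars(codes):
    """Join character codes (flat or nested) into a string."""
    return ''.join(chr(code) for code in _flatten_codes(codes))


def read_referenced_strings(f, dataset_name):
    """Dereference every row of `dataset_name` and decode it to a string.

    `f` is assumed to be an open h5py.File; each row of the dataset is
    assumed to contain a single object reference (shape (N, 1)).
    """
    result = []
    for row in f[dataset_name]:
        ref = row[0]                     # the object reference in this row
        result.append(decode_chars(f[ref][:]))
    return result


# Possible usage with the file from the question (not run here):
# import h5py
# with h5py.File('myfile1.mat', 'r') as f:
#     paths = read_referenced_strings(f, 'db/path')
```

Iterating row by row keeps the sketch simple; for a dataset with tens of thousands of rows it may be slow, since every reference is read separately.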
You can define your own __str__() or __repr__() for this class, or create a simple wrapper that formats a string with the information you want to see. Based on a quick look at the documentation, you can do something like:

    from h5py import File

    class MyHDF5File(File):
        def __repr__(self):
            return '<HDF5File({0})>'.format(self.filename)

Solution
Derive a class from the HDF5 class and override its __repr__ method.
Explanation
When you print an object, the interpreter calls that object's __str__ method, which falls back to __repr__. By default, this returns the class name and the memory address of the instance.
    class Person:
        def __init__(self, name):
            self.name = name

    p = Person("Jhon Doe")
    print(p)
    # Output: <__main__.Person object at 0x00000000022CE940>

In your case, you have a list containing a single instance of the HDF5 object. The equivalent would be:

    print([p])
    # Output: [<__main__.Person object at 0x000000000236E940>]

You can change how objects are displayed by overriding the __repr__ method of that class.
Note: You can also override __str__; see "The difference between str and repr in Python" for more details.
    class MyReadablePerson(Person):
        def __init__(self, name):
            super(MyReadablePerson, self).__init__(name)

        def __repr__(self):
            return "A person whose name is: {0}".format(self.name)

    p1 = MyReadablePerson("Jhon Doe")
    print(p1)
    # Output: A person whose name is: Jhon Doe
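To illustrate the note about __str__ versus __repr__, here is a small sketch with a hypothetical `Point` class (not related to h5py). It defines both methods so you can see which one is used when an object is printed directly and which one is used when it sits inside a list, as the HDF5 reference does in the question.

```python
class Point:
    """Hypothetical example class used only to compare __str__ and __repr__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous form, used by containers and the interactive prompt.
        return "Point(x={0}, y={1})".format(self.x, self.y)

    def __str__(self):
        # Friendly form, used by print() and str() on the object itself.
        return "({0}, {1})".format(self.x, self.y)


p = Point(1, 2)
print(p)      # print() uses __str__ for the object itself
print([p])    # a list formats its items with __repr__
```

Because a list formats its items with __repr__, overriding only __str__ would not change how an object inside a list is displayed.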