Store file metadata in an additional file

I have a bunch of image files (mostly .jpg). I would like to save metadata about these files (e.g. dominant color, color distribution, maximum gradient field, points of interest, ...). These data fields are not fixed and are not available for all images.

Currently I store the metadata for each image in a separate file with the same name but a different extension. The format is plain text:

metadataFieldName1 metadataFieldValue1
metadataFieldName2 metadataFieldValue2

This makes me wonder whether there is a better or easier way to store this metadata. I was thinking about Protocol Buffers, since I need to be able to read and write this information from both C++ and Python. But how would I handle the case where some metadata is not available?
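For reference, here is a minimal sketch of how the plain-text format described above could be read in Python so that missing fields are handled naturally. It assumes that names and values contain no whitespace; the function name `read_metadata`, the `.meta` extension, and the field names are hypothetical and only serve as illustration.

```python
from pathlib import Path
import tempfile


def read_metadata(path):
    """Parse alternating 'name value' tokens into a dict.

    Assumption: neither names nor values contain whitespace.
    """
    tokens = Path(path).read_text(encoding="utf-8").split()
    if len(tokens) % 2:
        raise ValueError(f"{path}: expected name/value pairs, got an odd token count")
    return dict(zip(tokens[0::2], tokens[1::2]))


# Demo with a hypothetical sidecar file that stores only two fields.
with tempfile.TemporaryDirectory() as tmp:
    sidecar = Path(tmp) / "Image00012.meta"
    sidecar.write_text("dominantColor #a0522d\nmaxGradient 0.82\n", encoding="utf-8")
    meta = read_metadata(sidecar)

dominant = meta.get("dominantColor")   # present in the file
poi = meta.get("pointsOfInterest")     # absent, so .get() returns None
```

With a dictionary, a field that was never computed for an image is simply absent, and the caller decides what to do when `.get()` returns `None`.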

+4
4 answers

I pondered this question for a long time and went with Protocol Buffers to store the metadata for my images. For each image, e.g. Image00012.jpg, I store the metadata in Image00012.jpg.pbmd. Once I have my .proto file defined, the Python class and the C++ class are generated automatically. It works very well and requires me to spend little time on parsing (certainly better than writing a custom reader for YAML files).
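The naming convention above (append `.pbmd` to the full image file name) can be sketched with `pathlib` as follows. The helper names `sidecar_path` and `image_path_for` are hypothetical, and the sketch covers only the path mapping, not the Protocol Buffers serialization itself.

```python
from pathlib import Path


def sidecar_path(image_path, suffix=".pbmd"):
    """Return the metadata file path for an image, e.g. a.jpg -> a.jpg.pbmd."""
    image_path = Path(image_path)
    return image_path.with_name(image_path.name + suffix)


def image_path_for(sidecar, suffix=".pbmd"):
    """Inverse mapping: strip the metadata suffix to recover the image path."""
    sidecar = Path(sidecar)
    if not sidecar.name.endswith(suffix):
        raise ValueError(f"{sidecar} does not end with {suffix}")
    return sidecar.with_name(sidecar.name[: -len(suffix)])


meta_file = sidecar_path("photos/Image00012.jpg")
original = image_path_for(meta_file)
```

Appending the suffix to the whole file name, rather than replacing `.jpg`, keeps the mapping unambiguous when two images share a stem but differ in extension.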

RestRisiko raises a good question about how I should handle unavailable metadata. The good thing about Protocol Buffers is that it supports optional and required fields. This solves my problem on this front.

The reason I think XML and INI are not suitable for this purpose is that much of my metadata is complex (color distribution, ...) and needs some structure to store. Protocol Buffers lets me declare that structure in a .proto file. In addition, the metadata file size and parsing speed are clearly better than with my hand-written XML reading and writing.

0

I would suggest that you store such metadata inside the image files themselves.

Most image formats support metadata storage. I think JPEG supports it through Exif.

If you are on Windows, you can use WIC (Windows Imaging Component) to store and retrieve metadata in a unified manner.

+2

Why Protocol Buffers and not XML, INI files, or some other text format? Just choose a format...

What do you mean by "metadata not available"? It is up to your application to respond to such situations... what does that have to do with the storage format?

+1

Take a look at http://www.yaml.org. YAML is less verbose than XML and more readable.

There are YAML libraries for C++, Python, and many other languages.

Example:

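    import yaml

    # Metadata fields for one image; fields that are not available are simply omitted
    data = {"field1": "value1", "field2": "value2"}

    # Serialize in block style (one "key: value" per line) and write it to disk
    serialized_data = yaml.dump(data, default_flow_style=False)
    with open("datafile", "w") as f:
        f.write(serialized_data)
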
0

Source: https://habr.com/ru/post/1346362/

