Determining whether a MIME type is binary or text

Is there a library that can tell whether a given content type is binary or text?

Obviously text/* is always textual, but for types like application/json, image/svg+xml, or even application/x-latex, it is hard to tell without inspecting the actual data.

+3
2 answers

There is a Python wrapper for libmagic: pymagic. It is the easiest way to accomplish what you want. Keep in mind that magic is only as good as its fingerprints: you may get false positives if something "looks" like a different file format, but in most cases pymagic will give you what you need.

One thing to watch out for is the "simple solution" of checking whether any characters fall outside the printable ASCII range. You are likely to come across Unicode text that will look like binary (and, in fact, be binary) even though it is just text content.
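To illustrate the pitfall, here is a minimal sketch using only the Python standard library. The helper names naive_is_text and decodes_as_utf8 are made up for this example, and attempting a UTF-8 decode is just one possible alternative heuristic; it is not what libmagic does internally.

```python
import string

# Byte values of printable ASCII characters plus common whitespace.
PRINTABLE = set(string.printable.encode("ascii"))


def naive_is_text(data: bytes) -> bool:
    """The 'simple solution': text means every byte is printable ASCII."""
    return all(byte in PRINTABLE for byte in data)


def decodes_as_utf8(data: bytes) -> bool:
    """Alternative heuristic (an assumption of this sketch):
    text means valid UTF-8 that contains no NUL bytes."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


sample = "naïve café".encode("utf-8")  # plain text, but not pure ASCII
print(naive_is_text(sample))    # False: misclassified as binary
print(decodes_as_utf8(sample))  # True
```

The second check is still only a heuristic: UTF-16 text, for instance, contains NUL bytes and would be reported as binary here.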

+2

Rather than relying on the MIME type alone, you can run file(1) (which uses libmagic) on the data itself:

> file --mime-encoding /bin/ls
/bin/ls: binary
> file --mime-encoding /etc/passwd
/etc/passwd: us-ascii
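If you need the same check from Python, one option is to shell out to file(1). The sketch below assumes the file command is installed and on PATH; the function names mime_encoding and is_binary_encoding are hypothetical helpers for this example, not part of any library.

```python
import subprocess


def is_binary_encoding(encoding: str) -> bool:
    """Interpret the value printed by `file --mime-encoding`."""
    return encoding.strip().lower() == "binary"


def mime_encoding(path: str) -> str:
    """Ask file(1) for the encoding of `path`.

    --brief suppresses the leading "filename:" prefix, so only the
    encoding name (for example "binary" or "us-ascii") is returned.
    """
    result = subprocess.run(
        ["file", "--brief", "--mime-encoding", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if file(1) fails
    )
    return result.stdout.strip()
```

Calling is_binary_encoding(mime_encoding("/bin/ls")) should then match the command-line result shown above, subject to the same fingerprinting caveats as libmagic itself.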
+1

Source: https://habr.com/ru/post/1768374/

