How can I use python-magic to get the type of a file over the internet?

Usually I load the file into a StringIO object and then run python-magic on it:

 import magic
 m = magic.Magic()
 m.from_buffer(thefile.read(1024))

But this time I can't download the file, because the image may be 20 megabytes. I want to use python-magic to find the file type without downloading the whole file.

If python-magic cannot do this, is the next best approach to look at the MIME type in the headers? And how accurate is that?

I need accuracy.

+4
2 answers

You can call read(1024) without downloading the whole file:

 thefile = urllib2.urlopen(someURL) 

Then just use the existing code. urlopen returns a file-like object, so this works naturally.
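As a rough sketch of the same idea on Python 3 (where urllib2 became urllib.request), the snippet below reads only a prefix of the response and hands it to python-magic. The helper names read_prefix and sniff_mime are made up for illustration, and the 1024-byte default is just the value used in the question, not a recommendation from the library. It assumes the python-magic package is installed; the import is done lazily so that read_prefix can be used without it.

```python
import urllib.request


def read_prefix(url, nbytes=1024):
    """Return at most the first `nbytes` bytes of the resource at `url`.

    Only the requested prefix is read from the response before it is closed.
    """
    with urllib.request.urlopen(url) as response:
        return response.read(nbytes)


def sniff_mime(url, nbytes=1024):
    """Guess the MIME type of a remote file from its first bytes.

    Hypothetical helper: requires the third-party python-magic package.
    """
    import magic  # imported here so read_prefix works without python-magic

    return magic.from_buffer(read_prefix(url, nbytes), mime=True)
```

Calling sniff_mime(someURL) would then return a MIME string for the remote file. If some formats need more than 1024 bytes to be recognised, nbytes can be raised.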

+7

If it is one of the common image formats, such as PNG or JPEG, and you know the server is reliable, you can use the 'Content-Type' header to get what you are looking for.

But this is not as reliable as reading part of the file and passing it to python-magic, because if the server cannot determine the correct format, it may fall back to application/octet-stream. This is more common with video formats; for images, I think the Content-Type is fine.
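The header check described above might look something like the following sketch. It sends a HEAD request, so no body is transferred, and treats a missing header or application/octet-stream as "no answer". The names is_informative and header_mime are hypothetical, and the approach assumes the server answers HEAD requests, which not every server does.

```python
import urllib.request


def is_informative(content_type):
    """Return True if a Content-Type value names a specific format."""
    if content_type is None:
        return False
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type not in ("", "application/octet-stream")


def header_mime(url):
    """Return the server's declared media type, or None if it is unhelpful."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        declared = response.headers.get("Content-Type")
    if not is_informative(declared):
        return None
    return declared.split(";", 1)[0].strip().lower()
```

When header_mime returns None, falling back to reading the first bytes of the file and passing them to python-magic is the safer choice.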

Sorry, I cannot find statistics or studies on the accuracy of Content-Type. The other answer's suggestion of downloading only part of the file is also a good option.

+2

Source: https://habr.com/ru/post/1335370/

