Using file contents to determine the MIME type in Node.js

It seems that all the popular MIME type libraries for node.js just use the file name extension rather than looking at the file to determine the MIME type.

Is there a good way in Node to inspect a file and intelligently identify its MIME type when the extension is missing?

2 answers

Indeed, it is unfortunate that most popular MIME modules simply map the file extension to a type.

After digging deeper, I found a module called mmmagic, which seems to do exactly what you want.

Keep in mind that, in my experience working with MIME, detection is inherently not completely reliable, and there is a small chance of false positives.

Usage example (taken from their site):

    var mmm = require('mmmagic'),
        Magic = mmm.Magic;

    var magic = new Magic(mmm.MAGIC_MIME_TYPE);
    magic.detectFile('node_modules/mmmagic/build/Release/magic.node', function(err, result) {
        if (err) throw err;
        console.log(result);
        // output on Windows with 32-bit node:
        //    application/x-dosexec
    });

Since MIME does not dictate the format of a file's contents at all, you can only use heuristics to guess what is in the file:

  • Some binary formats contain a so-called magic number, but it can be incorrect or ambiguous. See the Wikipedia article on magic numbers for more information.

  • Many text file formats contain grammatical constructs that can be used for a simple pattern-matching test, for instance XML, CSV, or JSON. However, some formats (for example, HTML) have a rather "evolved" syntactic definition, which makes them ambiguous and therefore difficult to pattern-match.
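To make the magic-number heuristic concrete, here is a minimal sketch that uses only Node's built-in `fs` module. The names `SIGNATURES`, `sniffBuffer`, and `sniffFile` are made up for illustration, the signature table is a tiny hand-picked sample, and the sketch assumes the signature sits at offset 0, which is not true of every format.

```javascript
const fs = require('fs');

// Illustrative signature table (hypothetical name); real tools know far more formats.
const SIGNATURES = [
  { mime: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { mime: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] },             // "GIF8"
  { mime: 'application/pdf', bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
  { mime: 'application/zip', bytes: [0x50, 0x4b, 0x03, 0x04] },       // "PK" 0x03 0x04
];

// Returns the MIME type of the first signature that prefixes the buffer, else null.
function sniffBuffer(buf) {
  for (const { mime, bytes } of SIGNATURES) {
    if (buf.length >= bytes.length && bytes.every((b, i) => buf[i] === b)) {
      return mime;
    }
  }
  return null;
}

// Reads only the first 16 bytes of the file and sniffs them.
function sniffFile(path) {
  const fd = fs.openSync(path, 'r');
  try {
    const head = Buffer.alloc(16);
    const bytesRead = fs.readSync(fd, head, 0, head.length, 0);
    return sniffBuffer(head.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}

console.log(sniffBuffer(Buffer.from('%PDF-1.7'))); // application/pdf
console.log(sniffBuffer(Buffer.from('hello')));    // null
```

The ambiguity mentioned above shows up even in this small table: the ZIP signature also starts .docx and .jar files, so a match on it only narrows the guess.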

To better illustrate the ambiguity problem, here is an example: browsers have developed a very high level of tolerance and accept anything that vaguely resembles HTML, so the HTML (or even XHTML) file format is difficult to identify. Not to mention that files that look like HTML can actually be templates written in non-HTML template languages (e.g. jade, handlebars, angular templates, etc.). This is just one of many examples where things get very murky.
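The pattern-matching idea for text formats, and the ambiguity it runs into, can be sketched roughly as follows. `guessTextType` is a hypothetical helper, not part of any library, and its regular expressions are deliberately crude assumptions rather than real grammars for these formats.

```javascript
// Guess a text format from a string using simple pattern tests (hypothetical helper).
function guessTextType(text) {
  const trimmed = text.trim();

  // JSON: cheap first-character check, then let the real parser decide.
  if (/^[\[{]/.test(trimmed)) {
    try {
      JSON.parse(trimmed);
      return 'application/json';
    } catch (e) {
      // Not valid JSON; fall through to the other checks.
    }
  }

  // XML: an XML declaration at the very start of the text.
  if (/^<\?xml\s/.test(trimmed)) return 'application/xml';

  // HTML: tolerant on purpose, much like browsers, and therefore ambiguous.
  if (/<(!doctype\s+html|html|head|body)\b/i.test(trimmed)) return 'text/html';

  return 'text/plain';
}

console.log(guessTextType('{"a": 1}'));                            // application/json
console.log(guessTextType('<?xml version="1.0"?><root/>'));        // application/xml
console.log(guessTextType('<html><body>{{title}}</body></html>')); // text/html
```

The last call is the problem case: the input is really a Handlebars-style template, yet the tolerant HTML test accepts it. An XHTML file that begins with an XML declaration would be reported as `application/xml` for the same reason, since the first matching rule wins.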


Source: https://habr.com/ru/post/971993/

