Why is the "file" command confused by .py files?

I have some Python modules that I wrote. On a whim, I ran file on the directory, and I was really surprised by what I saw. Here is a tally of what it thought the files were:

  1 ASCII Java program text, with very long lines
  1 a /bin/env python script text executable
  1 a python script text executable
  2 ASCII C++ program text
  4 ASCII English text
 18 ASCII Java program text
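A tally like the one above can be reproduced with a short script. This is only a sketch: the helper names `count_file_types` and `run_file` are made up for illustration, and the parsing assumes that file prints one `name: description` line per argument and that the file names themselves contain no colon.

```python
import subprocess
from collections import Counter


def count_file_types(lines):
    """Tally the description part of `file` output lines ("name: description")."""
    counts = Counter()
    for line in lines:
        # Split on the first colon only; assumes the file name contains no colon.
        _name, sep, description = line.partition(":")
        if sep:
            counts[description.strip()] += 1
    return counts


def run_file(paths):
    """Run the `file` command on the given paths and return its output lines."""
    if not paths:
        return []
    result = subprocess.run(
        ["file", *paths], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```

For example, `count_file_types(run_file(glob.glob("*.py"))).most_common()` (after `import glob`) lists the descriptions from most to least frequent.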

This is strange! Any idea what is happening, or why Python modules so often seem to be Java files?

I am using CentOS 5.2.

Edit: The question is really about my curiosity as to why files that are clearly not Java or C++ programs are classified as such. Of course, I do not expect file to be perfect, but I was surprised by the choices it made. I would have expected it to just give up and report a text file rather than draw very wrong conclusions.

+4
3 answers

From the file man page:

file tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic number tests, and language tests. The first test that succeeds causes the file type to be printed.

I assume that some of your files match the tests for other languages, so file identifies them incorrectly.
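To make the "first test that succeeds wins" behavior concrete, here is a toy model of such a cascade. It is not how file is implemented: the three test functions, the tiny `MAGIC` table and the `classify` name are all assumptions chosen for illustration, and the "language" stage here only checks whether the data is ASCII.

```python
# Toy three-stage cascade; an illustration only, not file's real logic.

# A few well-known magic-number prefixes (hypothetical, tiny table).
MAGIC = [
    (b"\x7fELF", "ELF"),
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image data"),
]


def filesystem_test(data):
    # Stand-in for the filesystem stage: only detects empty input.
    return "empty" if len(data) == 0 else None


def magic_test(data):
    # Stand-in for the magic stage: compare known byte prefixes.
    for prefix, label in MAGIC:
        if data.startswith(prefix):
            return label
    return None


def language_test(data):
    # Stand-in for the language stage: anything that decodes as ASCII.
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return None
    return "ASCII text"


def classify(data):
    # Run the stages in order; the first one that returns a label wins.
    for test in (filesystem_test, magic_test, language_test):
        label = test(data)
        if label is not None:
            return label
    return "data"
```

Because the stages stop at the first match, a loose rule in an early stage can hide a better answer from a later one, which is one way a Python module could end up with a label meant for another language.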

In addition, file is aimed mainly at binary files, as the BUGS section of the man page notes:

file uses several algorithms that favor speed over accuracy, thus it can be misled about the contents of text files.

The support for text files (primarily for programming languages) is simplistic, inefficient and requires recompilation to update.

+5

I just ran a test, and in every case where the identification was wrong, the file had no shebang line.

For each file that had:

 #!/usr/bin/env python 

file correctly identified it.

Looking at the magic file, another thing that triggers recognition as a Python file is a triple quote in the first line.

 $ echo '"""' | file -
 /dev/stdin: python script text executable
 $ echo '#!/usr/bin/python' | file -
 /dev/stdin: python script text executable
 $ echo '#!/usr/bin/env python' | file -
 /dev/stdin: a python script text executable
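The two triggers seen above (a python shebang or a triple quote on the first line) can be approximated in a few lines. This is a rough sketch of the observed behavior, not a transcription of the actual magic entries, and `looks_like_python` is a made-up name.

```python
def looks_like_python(first_line):
    """Approximate the two first-line triggers observed in the session above."""
    line = first_line.strip()
    # Trigger 1: a shebang that mentions python, directly or via env.
    if line.startswith("#!") and "python" in line:
        return True
    # Trigger 2: a triple double-quote opening the first line.
    return line.startswith('"""')
```

A module whose first line is a plain `import` statement matches neither trigger, so it falls through to whatever other rules apply.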
+6

I think the answer is that the first non-comment word in the file was import. This is true for all the files that it thought were Java, although it is also true for the ones that were classified as text. The files it decided were C++ started with class. So import seems to be a strong hint that the file is Java, although it is not conclusive.
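The guess above can be written down as a small heuristic. This is a sketch of the hypothesis only: `first_word`, `guess_language` and the `GUESSES` table are invented for illustration, and the real rules are evidently less deterministic, since some files starting with import were still reported as plain text.

```python
# Hypothetical keyword table matching the guess above.
GUESSES = {
    "import": "Java program text",
    "class": "C++ program text",
}


def first_word(text, comment_prefix="#"):
    """Return the first word of the first non-blank, non-comment line."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(comment_prefix):
            continue
        return stripped.split()[0]
    return None


def guess_language(text):
    # Fall back to plain text when the first word is not in the table.
    return GUESSES.get(first_word(text), "text")
```

Under this model, a Python module that opens with `import os` is labeled as Java, and one that opens with `class Foo:` is labeled as C++, which matches the pattern reported in the question.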

+2

Source: https://habr.com/ru/post/1341053/
