Python grep command

Platform: Windows

Grep: http://gnuwin32.sourceforge.net/packages/grep.htm

Python: 2.7.2

The Windows command line is used to execute commands.

I am looking for the pattern "2345$" in a file. The contents of the file are as follows:

 abcd 2345
 2345
 abcd 2345$

grep "2345$" file.txt

grep returns 2 lines (first and second) successfully.

When I try to run the same command from Python, I don't see any output. A snippet of the Python code looks like this:

 temp = open('file.txt', "r+")
 grep_cmd = []
 grep_cmd.extend([grep, '"2345$"', temp.name])
 print grep_cmd
 p = subprocess.Popen(grep_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
 stdoutdata = p.communicate()[0]
 print stdoutdata

If I have

 grep_cmd.extend([grep, '2345$', temp.name])

in my Python script, I get the correct answer.

The question is: why does the grep command with the double quotes, as in

 grep_cmd.extend([grep, '"2345$"', temp.name])

fail when executed from Python? Isn't Python supposed to run the command as is?

Thanks, Gudge.

1 answer

Do not put double quotes around your pattern. On the command line they are only needed to quote shell metacharacters. When calling a program from Python you do not need them.

You also do not need to open the file yourself - grep will do this:

 grep_cmd.extend([grep, '2345$', 'file.txt']) 

To understand why the double quotes are unnecessary and why they cause your command to fail, you need to understand the purpose of double quotes and how they are processed.

The shell uses double quotes to prevent special processing of certain shell metacharacters. Shell metacharacters are characters that the shell treats specially instead of passing them literally to the programs it executes. The most commonly used shell metacharacter is the space. The shell splits the command line at spaces to build the argument vector with which it executes the program. If you want to include a space in an argument, it must be quoted in some way (single or double quotes, backslash, etc.). Another is the dollar sign ($), which is used to indicate variable expansion.
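As a rough illustration of that splitting and quote removal, Python's shlex module can stand in for the shell. This is only a sketch: shlex models POSIX shell quoting rules, not the Windows cmd.exe rules, and it does not perform variable expansion, so the "$" simply stays in place.

```python
import shlex

# Split a command line roughly the way a POSIX shell would.
# The double quotes are consumed by the splitting step and are
# not part of the argument handed to the program.
argv = shlex.split('grep "2345$" file.txt')
print(argv)

# Quoting keeps a space inside a single argument...
with_quotes = shlex.split('grep "abcd 2345" file.txt')
# ...while without quotes the space separates two arguments.
without_quotes = shlex.split('grep abcd 2345 file.txt')
print(with_quotes)
print(without_quotes)
```

In each case the quotes tell the splitter where an argument starts and ends. They are never part of the argument that grep would receive.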

When you run a program without a shell involved, none of these rules about quoting and shell metacharacters apply. In Python, you build the argument vector yourself, so the relevant quoting rules are Python's own (for example, to include a double quote inside a double-quoted string, prefix the double quote with a backslash; the backslash will not be in the final string). The characters in each element of the argument vector, once you have finished constructing it, are the literal characters that will be passed to the program you are executing.
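One way to see this for yourself is to replace grep with a child process that just echoes the argument it received. The sketch below is not the code from the question: the child is a one-line Python program (an assumption made so the example runs without grep installed), and received_argument is a hypothetical helper name.

```python
import subprocess
import sys

# Stand-in for grep: a tiny child program that writes back its first
# argument exactly as it received it.
CHILD = "import sys; sys.stdout.write(sys.argv[1])"

def received_argument(pattern):
    """Run the child with `pattern` as a list element and return what it saw."""
    cmd = [sys.executable, "-c", CHILD, pattern]
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out = p.communicate()[0]
    return out.decode("ascii")

plain = received_argument('2345$')      # what the working script passes
quoted = received_argument('"2345$"')   # what the failing script passes
print(repr(plain))
print(repr(quoted))
```

The second call shows the double quotes arriving in the child as ordinary characters, because no shell was there to strip them.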

Grep does not treat double quotes as special characters, so if grep gets double quotes in its search pattern, it will try to match those double quotes in its input.
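The effect on matching can be sketched with Python's re module standing in for grep. Several things here are assumptions: the three sample lines are my reading of the file in the question, matching_lines is a hypothetical helper, and the quoted pattern is modelled as a fully literal string. However the "$" in the middle of that pattern is interpreted, it still requires double quote characters that the input does not contain.

```python
import re

# Assumed sample input: the file from the question read as three lines.
lines = ['abcd 2345', '2345', 'abcd 2345$']

def matching_lines(pattern, lines):
    """Return the lines in which `pattern` is found, similar to grep."""
    return [line for line in lines if re.search(pattern, line)]

# Pattern without quotes: "$" anchors the match at the end of the line.
unquoted = matching_lines(r'2345$', lines)

# Pattern with literal double quotes: the quotes must appear in the
# input for a line to match, so re.escape is used to treat every
# character of the pattern as ordinary text.
quoted = matching_lines(re.escape('"2345$"'), lines)

print(unquoted)
print(quoted)
```

The unquoted pattern finds the first two lines, as grep does on the command line. The quoted one finds nothing, which is consistent with the empty output seen from the Python script.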

My original reference to shell=True was wrong: first, I didn’t notice that you had originally specified shell=True, and second, I was coming from the perspective of a Unix/Linux implementation, not Windows.

The Python subprocess module page says this about shell=True and Windows:

On Windows: the Popen class uses CreateProcess() to execute the child program, which operates on strings. If args is a sequence, it will be converted to a string in a manner described in Converting an argument sequence to a string on Windows.

That linked section, Converting an argument sequence to a string on Windows, does not make sense to me. For one thing, a string is a sequence just as a list is, yet the Frequently Used Arguments section says this about args:

args is required for all calls and should be a string, or a sequence of program arguments. Providing a sequence of arguments is generally preferred, as it allows the module to take care of any required escaping and quoting of arguments (e.g. to permit spaces in file names).

This contradicts the conversion process described in the Python documentation, and given the behavior you observed, I would say that the documentation is incorrect and that the conversion applies only to an argument string, not to an argument sequence. I cannot verify this myself, since I have neither Windows nor the Python source code at hand.

I suspect if you call subprocess.Popen like:

 p = subprocess.Popen(grep + ' "2345$" file.txt', stdout=..., shell=True)

you may find that the double quotes are removed as part of the documented argument conversion.
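For anyone who wants to inspect the conversion without a Windows machine, the subprocess module has a helper, list2cmdline, that turns an argument sequence into a command string. It is not listed in the module's documented interface, and I am assuming it is the same routine Popen applies to a sequence on Windows, so treat the following as a sketch rather than a statement of what Popen does.

```python
import subprocess

# Argument vector from the failing script: the pattern element itself
# contains double quote characters.
with_quotes = ['grep', '"2345$"', 'file.txt']

# Argument vector from the working script.
without_quotes = ['grep', '2345$', 'file.txt']

# Build the single command string for each vector.
quoted_line = subprocess.list2cmdline(with_quotes)
plain_line = subprocess.list2cmdline(without_quotes)

print(quoted_line)
print(plain_line)
```

In the first string the embedded quotes come out prefixed with backslashes. The conversion rules in the documentation describe a backslash-escaped double quote as a literal double quote, so the quotes would be preserved for the child rather than removed, which would be consistent with grep searching for them.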


Source: https://habr.com/ru/post/1399496/

