Python subprocess communicate() returns None when a list of numbers is expected

Question

When I run the following code

    from subprocess import call, check_output, Popen, PIPE

    gr = Popen(["grep", "'^>'", myfile], stdout=PIPE)
    sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout)
    gr.stdout.close()
    out = sd.communicate()[0]
    print out

Where myfile looks like this:

    >name len=345 sometexthere
    >name2 len=4523 someothertexthere
    ...
    ...

I get

 None

When the expected output is a list of numbers:

    345
    4523
    ...
    ...

The corresponding command that I run in the terminal is

 grep "^>" myfile | sed "s/.*len=//" > outfile

So far I have tried various combinations of escaping and quoting, for example changing how the slashes in the sed expression are escaped or adding extra quotes for grep, but the number of possible combinations is large.

I have also considered simply reading the file in and writing Python equivalents of grep and sed, but the file is very large (though I could read it line by line), the code will only ever run on UNIX systems, and I am still curious where I went wrong.

Could it be that

 sd.communicate()[0]

returns some kind of object (instead of a list of integers) whose type is None?

I know that I can capture the output using check_output in simple cases:

 sam = check_output(["samn", "stats", myfile])

but I am not sure how to make it work in more complex situations where commands are piped together.

What are some productive approaches for getting expected results with a subprocess?

+5

python subprocess

Ekarl Dec 24 '15 at 10:21

4 answers

Do not put single quotes around ^> in the grep argument. No shell is involved here, so every argument is passed to the underlying program literally.
You also need to redirect the stdout of sd to PIPE.
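To see that the quotes really do reach the program, here is a minimal sketch (Python 3; the helper name argv_seen_by_child is made up for illustration). It starts a small Python child process in place of grep and has it report the arguments it received:

```python
import subprocess
import sys


def argv_seen_by_child(args):
    """Start a Python child that prints its arguments, one per line (hypothetical helper)."""
    child_code = "import sys; print('\\n'.join(sys.argv[1:]))"
    out = subprocess.check_output([sys.executable, "-c", child_code] + list(args))
    return out.decode().splitlines()


# No shell strips the quotes, so the child receives them as part of the pattern.
quoted = argv_seen_by_child(["'^>'"])
# Without the extra quotes the child receives the bare pattern.
plain = argv_seen_by_child(["^>"])
print(quoted, plain)
```

In the question's code, grep therefore searches for a pattern that contains the quote characters themselves. The sample lines contain no single quotes, so nothing can match.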

+4

Steven Dec 24 '15 at 10:31

You need to redirect stdout in your second call to Popen; otherwise the output just goes to the parent process's stdout and communicate() returns None in place of the output.

 sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
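A small sketch of the difference (Python 3; a Python child process stands in for sed so the example does not depend on any particular file). communicate() returns a (stdout, stderr) tuple, and each element is None unless the corresponding stream was opened with PIPE:

```python
import subprocess
import sys

# Stand-in child process that prints one line; any command would do.
child = [sys.executable, "-c", "print('345')"]

# stdout is inherited from the parent, so nothing is captured.
not_captured = subprocess.Popen(child).communicate()

# stdout is a pipe, so communicate() returns the child's output as bytes.
captured = subprocess.Popen(child, stdout=subprocess.PIPE).communicate()

print(not_captured)          # (None, None)
print(captured[0].strip())   # b'345'
```

Even when captured, the result is a bytes object in Python 3 (a str in Python 2), not a list of integers; splitting it into lines and converting the values is left to the caller.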

+2

tdelaney Dec 24 '15 at 10:26

Padraic Cunningham's answer is valid. If you want to keep the single quotes exactly as you would write them on the command line, use shlex to split the command string:

    import shlex
    from subprocess import call, check_output, Popen, PIPE

    gr = Popen(shlex.split("grep '^>' my_file"), stdout=PIPE)
    sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
    gr.stdout.close()
    out = sd.communicate()[0]
    print out
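For reference, this is what shlex.split produces for that command string (a quick check, not part of the original answer). It applies shell-like quoting rules, so the single quotes are consumed and grep receives the bare pattern:

```python
import shlex

args = shlex.split("grep '^>' my_file")
print(args)  # ['grep', '^>', 'my_file']
```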

+1

repzero Dec 24 '15 at 10:48

Padraic Cunningham · Accepted Answer · 2015-12-24T22:40:21+0000

As suggested, you need stdout=PIPE in the second process, and you need to remove the single quotes from "'^>'":

    gr = Popen(["grep", "^>", myfile], stdout=PIPE)
    Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
    ......
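Put together, the corrected pipeline might look like the sketch below (Python 3; the function name header_lengths and the decode/splitlines step are additions for illustration, and grep and sed are assumed to be on the PATH):

```python
from subprocess import Popen, PIPE


def header_lengths(path):
    """Run the equivalent of: grep "^>" path | sed "s/.*len=//" and return the output lines."""
    gr = Popen(["grep", "^>", path], stdout=PIPE)
    sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
    gr.stdout.close()  # lets grep receive SIGPIPE if sed exits first
    out = sd.communicate()[0]  # bytes, because sd was opened with stdout=PIPE
    gr.wait()  # reap the grep process as well
    return out.decode().splitlines()
```

Note that the sed expression only removes everything up to and including len=, so any text that follows the number on a header line stays in the output.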

But this can be done simply using pure Python and re:

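    import re

    # Raw string avoids the invalid "\>" string escape; ">" needs no escaping in a regex.
    r = re.compile(r"^>.*len=(.*)$")
    with open("test.txt") as f:
        for line in f:  # read line by line, so large files are fine
            m = r.search(line)
            if m:
                print(m.group(1))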

Which will output:

    345
    4523

If the lines starting with > always have a number, and the number always comes at the end of the line after len=, you do not need a regular expression at all:

    with open("test.txt") as f:
        for line in f:
            if line.startswith(">"):
                # strip() drops the trailing newline so print does not emit blank lines
                print(line.rsplit("len=", 1)[1].strip())
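If a list of integers is what you ultimately want, the same idea can collect the values instead of printing them. This is a sketch under the assumption that a number always follows len= and may itself be followed by further whitespace-separated text, as in the sample lines of the question; the function name read_lengths is hypothetical:

```python
def read_lengths(path):
    """Return the len= values of all header lines (lines starting with '>') as ints."""
    lengths = []
    with open(path) as f:
        for line in f:  # iterates lazily, so large files are not loaded at once
            if line.startswith(">") and "len=" in line:
                # Take the text after the last "len=" and keep its first token.
                value = line.rsplit("len=", 1)[1].split()[0]
                lengths.append(int(value))
    return lengths
```

Calling read_lengths("test.txt") on a file like the one in the question would then give a plain Python list that can be used directly, with no subprocess involved.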
