Although Popen accepts file objects, it actually uses their underlying file descriptors (handles) rather than the file objects' read and write methods to communicate, as @JF Sebastian rightly points out. A better way to do this is to use a pipe (os.pipe()), which does not touch the disk. This lets you connect the output stream of one process directly to the input stream of another. The only remaining problem is serialization: making sure the two source streams are not interleaved.
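    import os
    import subprocess

    r, w = os.pipe()
    fh_bam = open('output.bam', 'wb')  # BAM is binary, so open in binary mode

    params_0 = ["samtools", "view", "-HS", "header.sam"]
    params_1 = ["samtools", "view", "input.bam", "1:1-50000000"]
    params_2 = ["samtools", "view", "-bS", "-"]

    # Start the sink (pipe reader) first so it drains the pipe as the sources write.
    sub_sink = subprocess.Popen(params_2, stdin=r, stdout=fh_bam, bufsize=4096)

    # Run the sources one at a time; communicate() waits for each to exit.
    sub_src1 = subprocess.Popen(params_0, stderr=subprocess.PIPE, stdout=w, bufsize=4096)
    sub_src1.communicate()
    sub_src2 = subprocess.Popen(params_1, stderr=subprocess.PIPE, stdout=w, bufsize=4096)
    sub_src2.communicate()

    # Close the parent's pipe ends so the sink sees EOF, then wait for it to finish.
    os.close(w)
    os.close(r)
    sub_sink.wait()
    fh_bam.close()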
We start the sink (the pipe reader) first, and then call communicate() only on the source processes, to avoid potential blocking, as @Ariel mentioned. This also forces the first source process to finish and flush its output through the pipe before the second source process starts writing to it, which prevents interleaved or clobbered output. You can experiment with bufsize to tune performance.
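If you want to try the same pattern without samtools installed, here is a minimal sketch of it as a reusable function. The names concat_through_pipe, source_cmds, sink_cmd and out_path are made up for illustration, and the small Python one-liners only stand in for the samtools commands. It assumes Python 3, where the sink does not inherit the write end of the pipe by default.

```python
import os
import subprocess
import sys
import tempfile


def concat_through_pipe(source_cmds, sink_cmd, out_path):
    """Run each source in turn, feeding all of their stdout into one sink.

    Hypothetical helper: the sink's stdout is written to out_path and the
    sink's exit code is returned.
    """
    r, w = os.pipe()
    with open(out_path, "wb") as out:
        # Start the sink first so it drains the pipe while the sources write.
        sink = subprocess.Popen(sink_cmd, stdin=r, stdout=out)
    os.close(r)  # the sink holds its own copy of the read end
    try:
        for cmd in source_cmds:
            # check_call waits for each source to exit before the next one
            # starts, so their outputs cannot interleave.
            subprocess.check_call(cmd, stdout=w)
    finally:
        os.close(w)  # last write end closed, so the sink sees EOF
    return sink.wait()


if __name__ == "__main__":
    # Stand-ins for the header and body commands, plus an upper-casing sink.
    sources = [
        [sys.executable, "-c", "print('header')"],
        [sys.executable, "-c", "print('body')"],
    ]
    sink_cmd = [sys.executable, "-c",
                "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
    concat_through_pipe(sources, sink_cmd, out_path)
    with open(out_path) as fh:
        print(fh.read())
```

Closing the write end in the parent is what lets the sink terminate: a pipe reader only sees end-of-file once every copy of the write descriptor has been closed.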
This is almost the same as the shell command.