In bash, is it better to use process substitution or piping?

If the output of a single command is consumed by only one other command, is it better to use | (a pipeline) or <() (process substitution)?

"Better" is, of course, subjective. For my specific use case, execution speed is the main driver, but I am also interested in reliability.

I already know about the benefits of while read ...; do ...; done < <(cmd), and have switched to that form.

I have several instances of var=$(cmd1 | cmd2), which I suspect might be better written as var=$(cmd2 < <(cmd1)).
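For concreteness, a made-up instance of the two forms (the commands and file name here are placeholders, not from my actual scripts):

    # Pipeline inside a command substitution
    var=$(grep -v '^#' config.txt | sort -u)

    # The same thing, with process substitution feeding stdin
    var=$(sort -u < <(grep -v '^#' config.txt))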

I would like to know what specific benefits, if any, the latter form brings over the former.

1 answer

tl;dr: Use pipes unless you have a compelling reason not to.

Piping and redirecting stdin from a process substitution are essentially the same thing: both end up with two processes connected by an anonymous pipe.

There are three practical differences:

1. By default, Bash forks a subshell for each stage of a pipeline.

This is presumably what got you looking into this in the first place:

    #!/bin/bash
    cat "$1" | while IFS= read -r last; do true; done
    echo "Last line of $1 is $last"

This script does not work by default because, unlike ksh and zsh, bash forks a subshell for each stage of the pipeline, so last is set only inside that subshell and is empty by the time echo runs.

If you set shopt -s lastpipe in Bash 4.2+, Bash mimics the ksh and zsh behavior by running the last stage in the current shell, and the script works as expected.
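For example, a minimal sketch of that fix applied to the script above (note that lastpipe only takes effect when job control is off, which is the normal case for a non-interactive script):

    #!/bin/bash
    # Requires bash 4.2+; lastpipe has no effect when job control is on.
    shopt -s lastpipe
    cat "$1" | while IFS= read -r last; do true; done
    echo "Last line of $1 is $last"    # now prints the expected line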

2. Bash does not wait for process substitutions to finish.

POSIX only requires a shell to wait for the last process in a pipeline, but most shells, including bash, will wait for them all.

This is noticeable if you have a slow producer, as in this password generator reading from /dev/random:

    tr -cd 'a-zA-Z0-9' < /dev/random | head -c 10        # Slow?
    head -c 10 < <(tr -cd 'a-zA-Z0-9' < /dev/random)     # Fast?

The first example will not benchmark well. Once head has read its 10 characters and exited, tr does not notice that the pipe is broken until its next call to write().

Since Bash waits for both head and tr, the pipeline version appears slower.

In the procsub version, Bash waits only for head and lets tr finish in the background.
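To make the timing difference easy to reproduce, here is a sketch that swaps the entropy source for a plain sleep standing in as the slow producer (purely illustrative; both commands print the same output, only the wait differs):

    # Pipeline: bash waits for every stage, so about 3 seconds pass
    # before it moves on, even though head finished right away.
    time { { echo done; sleep 3; } | head -n 1; }

    # Process substitution: bash waits only for head and leaves sleep
    # to finish in the background, so this returns almost immediately.
    time { head -n 1 < <(echo done; sleep 3); }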

3. Bash does not currently fork-optimize simple commands in process substitutions.

If you invoke an external command such as sleep 1, the Unix process model requires Bash to fork in order to execute it.

Because forks are expensive, Bash optimizes them away where it can. For example, the command:

    bash -c 'sleep 1'

Naively, this would take two forks: one to start bash and one to start sleep. However, Bash realizes there is nothing left for it to do after sleep completes, so instead of forking it can simply replace itself with sleep (an execve without a fork). This is very similar to tail call optimization.
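You can check this with strace, counting clone calls the same way as in the comparison below; the expected counts are an assumption about a typical Linux build of bash:

    # Typically prints 0: bash replaces itself with sleep, no fork needed.
    strace -f bash -c 'sleep 1' 2>&1 | grep -c clone

    # Typically prints 1: sleep is no longer the last thing bash has to do,
    # so it must fork for sleep and stay around to run the builtin echo.
    strace -f bash -c 'sleep 1; echo done' 2>&1 | grep -c clone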

A subshell like ( sleep 1 ) gets the same optimization, but <( sleep 1 ) does not. The source code gives no particular reason for this, so it may simply never have come up.

    $ strace -f bash -c '/bin/true | /bin/true' 2>&1 | grep -c clone
    2
    $ strace -f bash -c '/bin/true < <(/bin/true)' 2>&1 | grep -c clone
    3

Given the above, you can construct a benchmark that favors whichever side you like, but since the number of forks is generally what matters most, pipes are the better default.
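For instance, a benchmark built around many cheap commands will favor the pipe's lower fork count, while the slow-producer example above favors process substitution. A rough sketch of the former (absolute numbers depend entirely on the machine):

    # 2 forks per iteration
    time for i in {1..1000}; do /bin/true | /bin/true; done

    # 3 forks per iteration: the extra fork per process substitution adds up
    time for i in {1..1000}; do /bin/true < <(/bin/true); done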

And it certainly doesn't hurt that pipes are specified by POSIX, are the canonical way to connect the stdout of one process to the stdin of another, and work equally well on all platforms.

