The concept of "Hold space" and "Pattern space" in sed

Question


I am confused by two concepts in sed: hold space and pattern space. Can someone help explain them?

Here is a snippet of the manual:

 h H    Copy/append pattern space to hold space.
 g G    Copy/append hold space to pattern space.
 n N    Read/append the next line of input into the pattern space.

These six commands really confuse me.

+66

linux sed

ChenQi Oct 11


3 answers

@Ed Morton: I do not agree with you here. I find sed very useful and simple (once you understand the concept of the pattern and hold buffers) for coming up with elegant ways to do multi-line searches.

For example, take a text file that contains host names and some information about each host, with a lot of junk in between that I do not care about.

 Host: foo1
 some junk, doesnt matter
 some junk, doesnt matter
 Info: about foo1 that I really care about!!
 some junk, doesnt matter
 some junk, doesnt matter
 Info: a second line about foo1 that I really care about!!
 some junk, doesnt matter
 some junk, doesnt matter
 Host: foo2
 some junk, doesnt matter
 Info: about foo2 that I really care about!!
 some junk, doesnt matter
 some junk, doesnt matter

For me, an awk script to get the lines with the host name and the corresponding Info lines would take a bit more than what I can do with sed:

 sed -n '/Host:/{h;}; /Info/{x;p;x;p;}' myfile.txt

the output looks like this:

 Host: foo1
 Info: about foo1 that I really care about!!
 Host: foo1
 Info: a second line about foo1 that I really care about!!
 Host: foo2
 Info: about foo2 that I really care about!!

(Note that Host: foo1 appears twice in the output.)

Explanation:

-n suppresses output unless it is explicitly printed
the first match finds the Host: line and copies it into the hold buffer (h)
the second match finds a following Info: line, but first exchanges (x) the current line in the pattern buffer with the hold buffer and prints (p) the Host: line, then exchanges (x) again and prints (p) the Info: line.

Yes, this is a simplistic example, but I suspect it is a common problem that a simple sed one-liner handles quickly. For much more complex tasks, such as ones where you cannot rely on a given, predictable sequence, awk may be better suited.
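To try the one-liner without creating myfile.txt, here is a minimal, self-contained sketch. The sample data is a shortened, hypothetical version of the file above, and the variable names input and result are only for illustration.

```shell
# Hypothetical sample data, shortened from the file shown above.
input='Host: foo1
some junk, doesnt matter
Info: about foo1 that I really care about!!
some junk, doesnt matter
Host: foo2
Info: about foo2 that I really care about!!'

# h saves each Host: line in the hold space.
# On an Info line, x;p swaps the saved Host: line in and prints it,
# then the second x;p swaps back and prints the Info line itself.
result=$(printf '%s\n' "$input" | sed -n '/Host:/h; /Info/{x;p;x;p;}')
printf '%s\n' "$result"
```

Each Info line should come out preceded by the most recent Host: line, and the junk lines should not appear at all.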

+13

Jens Jensen Aug 19 '13 at 19:30


Although @January's answer and example are good, the explanation was not enough for me. I had to search and learn a lot before I understood how sed -n '1!G;h;$p' works. So I would like to elaborate on the command for someone like me.

First of all, let's see what the command does.

 $ echo {a..d} | tr ' ' '\n' # Prints from 'a' to 'd' in each line
 a
 b
 c
 d
 $ echo {a..d} | tr ' ' '\n' | sed -n '1!G;h;$p'
 d
 c
 b
 a

It reverses the input, as the tac command does.

sed reads line by line, so let's see what happens in the pattern space and the hold space on each line. Because the h command copies the contents of the pattern space to the hold space, both spaces hold the same text once h has run.

 Read line    Pattern Space / Hold Space    Command executed
 -----------------------------------------------------------
 a            a$                            h
 b            b\na$                         1!G;h
 c            c\nb\na$                      1!G;h
 d            d\nc\nb\na$                   1!G;h;$p

On the last line, $p prints d\nc\nb\na$, which is displayed as

 d
 c
 b
 a

If you want to see the pattern space for each line, you can add the l command.

 $ echo {a..d} | tr ' ' '\n' | sed -n '1!G;h;l;$p'
 a$
 b\na$
 c\nb\na$
 d\nc\nb\na$
 d
 c
 b
 a

I found it very helpful to watch the video tutorial "Understanding how sed works", as the presenter shows step by step how each space is used. The hold space is covered in the fourth lesson, but I recommend watching all the videos if you are not familiar with sed.

The GNU sed documentation and Bruce Barnett's sed tutorial are also very good references.

+9

Sanghyun Lee Jun 05 '17 at 19:54


January · Accepted Answer · 2012-10-11 07:30

When sed reads a file line by line, the line that has just been read is placed into the pattern buffer (pattern space). The pattern buffer is like a temporary buffer, a scratchpad where the current information is stored. When you tell sed to print, it prints the pattern buffer.

The hold buffer (hold space) is like long-term storage: you can catch something, store it, and reuse it later when sed is processing another line. You do not process the hold space directly; instead, you copy or append it to the pattern space if you want to do something with it. For example, the print command p prints the pattern space only. Likewise, s operates on the pattern space.
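A minimal sketch of that division of labour, assuming a two-line input made up for illustration (the variable name out is also just for this example): s and p touch only the pattern space, and the saved line becomes visible only after g copies it back.

```shell
# Line 1: h copies "first" into the hold space; nothing is printed (-n).
# Line 2: s rewrites the pattern space only, so p prints "changed".
#         g then overwrites the pattern space with the hold space,
#         so the second p prints the untouched "first".
out=$(printf 'first\nsecond\n' | sed -n '1h; 2{s/.*/changed/;p;g;p;}')
printf '%s\n' "$out"
```

The hold space still contains the original first line even though s ran in between, because s never looks at it.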

Here is an example:

 sed -n '1!G;h;$p'

(the -n option suppresses automatic line printing)

There are three commands here: 1!G, h and $p. 1!G has an address, 1 (the first line), but the ! means that the command is executed everywhere except on the first line. $p, on the other hand, is executed only on the last line. So what happens is this:

1. The first line is read and automatically placed into the pattern space.
2. On the first line, the first command is not executed; h copies the first line into the hold space.
3. Now the second line replaces whatever was in the pattern space.
4. On the second line, G is executed first, appending the contents of the hold buffer to the pattern buffer, separated by a newline. The pattern space now contains the second line, a newline, and the first line.
5. The h command then copies the concatenated contents of the pattern buffer into the hold space, which now holds line two followed by line one.
6. Move on to line three and go back to step 3.

Finally, after the last line has been read and the hold space (containing all previous lines in reverse order) has been appended to the pattern space, the pattern space is printed with p. As you might have guessed, the above does exactly what the tac command does: it prints the file in reverse order.
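The difference between appending (G) and copying (g) can be seen by changing a single letter in the command. This is only a sketch with a made-up three-line input; the variable names reversed and copied are chosen for illustration.

```shell
# With G, the hold space is appended to each new line, so the lines
# accumulate in reverse order and $p prints all of them.
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')

# With g, the hold space replaces each new line instead, so every line
# after the first is overwritten by "a" and only "a" is left at the end.
copied=$(printf 'a\nb\nc\n' | sed -n '1!g;h;$p')

printf 'G (append):\n%s\n' "$reversed"
printf 'g (copy):\n%s\n' "$copied"
```

The same lowercase/uppercase pairing applies to h and H, in the other direction: h overwrites the hold space and H appends to it.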
