Awk concatenate strings until there is a substring

Question

I have an awk script taken from this example:

 awk '/START/{if (x) print x; x="";}{x=(!x)?$0:x","$0;}END{print x;}' file

Here is an example file, with one value per line:

 $ cat file
 START
 1
 2
 3
 4
 5
 end
 6
 7
 START
 1
 2
 3
 end
 5
 6
 7

I need to stop concatenating as soon as the accumulated string contains the word end, so the desired result is:

 START,1,2,3,4,5,end
 START,1,2,3,end

+5

bash regex awk

cardinal-gray Dec 13 '17 at 15:03

5 answers

Here is another awk approach:

 $ awk '/START/{ORS=","} /end/ && ORS=RS; ORS!=RS' file
 START,1,2,3,4,5,end
 START,1,2,3,end

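Note that /end/ && ORS=RS; is shorthand for /end/{ORS=RS; print}: the assignment evaluates to a non-empty string, which counts as true, so the matching record is printed by the default action.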
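For readers who find the shorthand hard to parse, here is a minimal sketch of the same idea with every rule written out in full. The sample() helper is a hypothetical stand-in for the question's file, and the result variable is only there to capture the output; neither is part of the original answer.

```shell
# Hypothetical helper: reproduces the sample file from the question,
# one value per line.
sample() { printf '%s\n' START 1 2 3 4 5 end 6 7 START 1 2 3 end 5 6 7; }

# Long form of the one-liner, each rule with an explicit action.
result=$(sample | awk '
  /START/   { ORS = "," }          # inside a block: join records with commas
  /end/     { ORS = RS; print }    # closing line: print it followed by a newline
  ORS != RS { print }              # print only while ORS is still a comma
')
printf '%s\n' "$result"
```

Records after end are skipped because ORS is equal to RS again until the next START changes it.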

+4

karakfa Dec 13 '17 at 16:09

You can use this awk :

 awk '/START/{p=1; x=""} p{x = x (x=="" ? "" : ",") $0} /end/{if (x) print x; p=0}' file
 START,1,2,3,4,5,end
 START,1,2,3,end

+2

anubhava Dec 13 '17 at 15:14

Another way, similar to the answers in How to select strings between two patterns?

 $ awk '/START/{ORS=","; f=1} /end/{ORS=RS; print; f=0} f' ip.txt
 START,1,2,3,4,5,end
 START,1,2,3,end

This does not need a buffer, but it does not check that every START has a corresponding end.
/START/{ORS=","; f=1} sets ORS to , and sets a flag (which controls which lines are printed)
/end/{ORS=RS; print; f=0} resets ORS to a newline when the end condition is met, prints the line, and clears the flag
f prints the input record while the flag is set
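To illustrate the caveat about an unmatched START, the sketch below runs the flag-based command on a made-up input whose first START is never closed, and compares it with a buffered variant that discards an unfinished block when a new START arrives. The broken() helper, the variable names, and the choice to drop incomplete blocks are assumptions for illustration, not part of the answer.

```shell
# Hypothetical input: the first START has no closing "end".
broken() { printf '%s\n' START a b START c end d; }

# Flag-based, unbuffered version: the unterminated block runs into the next one.
unbuffered=$(broken | awk '/START/{ORS=","; f=1} /end/{ORS=RS; print; f=0} f')

# Buffered variant: a new START throws away whatever was collected so far.
buffered=$(broken | awk '
  /START/ { buf = ""; f = 1 }                      # start a fresh buffer
  f       { buf = (buf == "" ? "" : buf ",") $0 }  # append the current record
  /end/   { if (f) print buf; f = 0 }              # emit only completed blocks
')
printf '%s\n%s\n' "$unbuffered" "$buffered"
```

The first command prints START,a,b,START,c,end, while the buffered variant prints only START,c,end. Which behavior is preferable depends on how malformed input should be treated.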

+2

Sundeep Dec 13 '17 at 16:10

Since we seem to have gone down the rabbit hole of ways to do this, here's a pretty reasonable approach with GNU awk for multi-char RS, RT and gensub():

 $ awk -v RS='end' -v OFS=',' 'RT{$0=gensub(/.*(START)/,"\\1",1); $NF=$NF OFS RT; print}' file
 START,1,2,3,4,5,end
 START,1,2,3,end

0

Ed Morton Dec 13 '17 at 16:33

RomanPerekhrest · Accepted Answer · 2017-12-13T15:14:17+0000

A short awk solution (although it checks the /end/ pattern twice):

 awk '/START/,/end/{ printf "%s%s",$0,(/^end/? ORS:",") }' file

Output:

 START,1,2,3,4,5,end
 START,1,2,3,end

/START/,/end/ - range pattern

A range pattern consists of two patterns separated by a comma, in the form 'begpat, endpat'. It is used to match ranges of consecutive input records. The first pattern, begpat, controls where the range begins, and endpat controls where the range ends.

/^end/ ? ORS : "," - chooses the terminator printed after the current record within the range: ORS (a newline) after the closing line, a comma after every other line
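The two pieces can be tried separately. This is a small sketch, assuming the sample data from the question; the sample() helper and the count and joined variables are hypothetical names introduced only for this illustration.

```shell
# Hypothetical helper: reproduces the sample file from the question.
sample() { printf '%s\n' START 1 2 3 4 5 end 6 7 START 1 2 3 end 5 6 7; }

# Step 1: the range pattern alone selects every record from START through end.
# Counting the selected records shows how many fall inside the two ranges.
count=$(sample | awk '/START/,/end/ { n++ } END { print n }')

# Step 2: inside the range, choose the terminator per record:
# ORS (a newline) after the closing line, a comma after every other line.
joined=$(sample | awk '/START/,/end/ { printf "%s%s", $0, (/^end/ ? ORS : ",") }')
printf '%s\n%s\n' "$count" "$joined"
```

With this input the range pattern selects 12 of the 17 records (7 in the first block, 5 in the second); the records between an end and the next START never reach the printf.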
