Awk: concatenate lines until a substring is found

I have an awk script taken from this example:

 awk '/START/{if (x) print x; x="";}{x=(!x)?$0:x","$0;}END{print x;}' file 

Here is an example file, one value per line:

 $ cat file
 START
 1
 2
 3
 4
 5
 end
 6
 7
 START
 1
 2
 3
 end
 5
 6
 7

I need to stop concatenating once the accumulated string contains the word end, so the desired result is:

 START,1,2,3,4,5,end
 START,1,2,3,end
5 answers

A short awk solution (although it tests for the end pattern twice):

 awk '/START/,/end/{ printf "%s%s",$0,(/^end/? ORS:",") }' file 

Output:

 START,1,2,3,4,5,end
 START,1,2,3,end

  • /START/,/end/ - range pattern

A range pattern consists of two patterns separated by a comma, in the form 'begpat, endpat'. It is used to match ranges of consecutive input records. The first pattern, begpat, controls where the range begins, and endpat controls where the range ends.

  • /^end/ ? ORS : "," - chooses the separator printed after the current record within the range: ORS after the end line, a comma otherwise
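The two pieces explained above can be tried separately. The following is only a sketch: it rebuilds the question's sample file with printf instead of reading file, and the variable names input, selected and joined are made up for illustration.

```shell
# Rebuild the sample input from the question (one value per line).
input=$(printf '%s\n' START 1 2 3 4 5 end 6 7 START 1 2 3 end 5 6 7)

# Step 1: the range pattern on its own prints every line from START through end.
selected=$(printf '%s\n' "$input" | awk '/START/,/end/')

# Step 2: same range, but print ORS after the end line and a comma after any other line.
joined=$(printf '%s\n' "$input" |
  awk '/START/,/end/{ printf "%s%s", $0, (/^end/ ? ORS : ",") }')

printf '%s\n' "$selected"
printf '%s\n' "$joined"
```

Step 1 shows that lines 6 and 7 (and the trailing 5 6 7) are already dropped by the range; step 2 only decides how the selected lines are glued together.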

Here is another awk solution:

 $ awk '/START/{ORS=","} /end/ && ORS=RS; ORS!=RS' file
 START,1,2,3,4,5,end
 START,1,2,3,end

Note that /end/ && ORS=RS; is shorthand for /end/{ORS=RS; print}
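To make the three rules easier to follow, here is a sketch of the same script written in the expanded form, one rule per line. It assumes the default RS (a newline) and feeds the sample data through printf rather than reading file; the variable names input and expanded are made up for illustration.

```shell
# Rebuild the sample input from the question (one value per line).
input=$(printf '%s\n' START 1 2 3 4 5 end 6 7 START 1 2 3 end 5 6 7)

expanded=$(printf '%s\n' "$input" | awk '
/START/ { ORS = "," }        # from START on, terminate printed records with a comma
/end/   { ORS = RS; print }  # on end, restore the newline and print the end line itself
ORS != RS                    # default action: print, but only while ORS is the comma
')

printf '%s\n' "$expanded"
```

Because ORS equals RS before the first START and again after each end, the third rule stays silent for lines outside a block.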


You can use this awk:

 $ awk '/START/{p=1; x=""} p{x = x (x=="" ? "" : ",") $0} /end/{if (x) print x; p=0}' file
 START,1,2,3,4,5,end
 START,1,2,3,end

Another way, similar to the answers in How to select lines between two patterns?

 $ awk '/START/{ORS=","; f=1} /end/{ORS=RS; print; f=0} f' ip.txt
 START,1,2,3,4,5,end
 START,1,2,3,end

  • it does not need a buffer, but it does not check that each START has a matching end
  • /START/{ORS=","; f=1} sets ORS to , and sets a flag (which controls which lines are printed)
  • /end/{ORS=RS; print; f=0} sets ORS back to a newline when the end condition is met, prints the line and clears the flag
  • f prints the input record while the flag is set
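The first bullet's caveat shows up when the input ends with a START that is never closed. The sketch below compares the flag version with one possible buffered variant that prints a block only once its end has been seen. The truncated sample input and the names buf, unchecked and checked are assumptions for illustration, not part of the original answer.

```shell
# Hypothetical input: one complete block, then a START with no matching end.
input=$(printf '%s\n' START 1 2 end 6 START 3 4)

# Flag version from the answer: the unfinished block is still printed.
unchecked=$(printf '%s\n' "$input" |
  awk '/START/{ORS=","; f=1} /end/{ORS=RS; print; f=0} f')

# Buffered variant: collect the block and print it only when end arrives.
checked=$(printf '%s\n' "$input" | awk '
/START/    { buf = $0; f = 1; next }  # new block; any unfinished buffer is discarded
f          { buf = buf "," $0 }       # append the line while inside a block
f && /end/ { print buf; f = 0 }       # block is complete: print it and leave the block
')

printf '%s\n' "$unchecked"
printf '%s\n' "$checked"
```

With this input the flag version emits a dangling START,3,4, while the buffered variant prints only START,1,2,end. The price is the buffer that the flag version was written to avoid.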

Since we seem to have gone down the rabbit hole of ways to do this, here's a pretty reasonable approach with GNU awk, using its multi-char RS, RT and gensub():

 $ awk -v RS='end' -v OFS=',' 'RT{$0=gensub(/.*(START)/,"\\1",1); $NF=$NF OFS RT; print}' file
 START,1,2,3,4,5,end
 START,1,2,3,end

Source: https://habr.com/ru/post/1274041/

