How to use an awk for loop index in a regex

I have shortened the problem; my real data is much longer.

I have a file like:

 aa, bb, cc, dd, ee, 4
 ff, gg, hh, ii, jj, 5
 kk, ll, mm, nn, oo, 3
 pp, qq, rr, ss, tt, 2
 uu, vv, ww, xx, yy, 5
 aa, bb, cc, dd, ee, 2

Now I want to use awk to select the rows that share the same number in the last column and redirect them to a new file, with a separate file for each number. E.g. t2.txt, t3.txt, t4.txt and t5.txt will contain the lines whose last number is 2, 3, 4 and 5 respectively.

in t2.txt:

 pp, qq, rr, ss, tt, 2
 aa, bb, cc, dd, ee, 2

in t3.txt:

 kk, ll, mm, nn, oo, 3 

in t4.txt:

 aa, bb, cc, dd, ee, 4 

in t5.txt:

 ff, gg, hh, ii, jj, 5
 uu, vv, ww, xx, yy, 5

I think I need something like this:

 BEGIN {FS=","}
 {
   for (n=2; n<=5; n++)
     if ($6 ~/\$n/)
       {print > "t\$n.txt"}
 }

But I just don’t know how to make it work.

This bash script does what I want, but every time it extracts the lines for one number it has to read through all the lines again. How can I read the file ONLY ONCE and extract the lines for all numbers?

 #!/bin/bash
 for num in {2..5}; do
   gawk --assign FS="," "\$6 ~/${num}/" infile >> t${num}.txt
 done
2 answers

I got it working with the following, but any further explanation would be welcome.

 BEGIN {FS=","}
 {
   for (n=1; n<=5; n++)
     # n is the loop variable; used after ~ it acts as a dynamic regex
     if ($6 ~ n)
       {print > ("new" n ".txt")}
 }
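As for the explanation asked for: in awk, the text between `/.../` is a regex literal, so a loop variable is not substituted inside it, and `$n` means "field number n" rather than "the value of n". To match against a variable, put the variable (or a string built from it) on the right-hand side of `~`. Below is a minimal sketch of that idea; the file name `sample.txt` and the `new<n>.txt` output names are only for illustration, and the anchors are my own addition so that 2 does not also match values such as 12 or 25.

```shell
# Small stand-in for the real data (sample.txt is a hypothetical name).
printf '%s\n' \
  'aa, bb, cc, dd, ee, 4' \
  'pp, qq, rr, ss, tt, 2' \
  'aa, bb, cc, dd, ee, 2' > sample.txt

awk -F',' '
{
    for (n = 2; n <= 5; n++) {
        # Build the regex from the loop variable n.
        # "^ *" and " *$" allow the spaces around the number but nothing else.
        if ($6 ~ ("^ *" n " *$")) {
            print > ("new" n ".txt")
        }
    }
}' sample.txt
```

With this input, new2.txt receives the two lines ending in 2 and new4.txt the line ending in 4. The loop is still more work than needed, since the value of the last field can be used to build the file name directly.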

Try the following command:

 awk '{ print $0 > ("t" $NF ".txt") }' infile 

There is no need to change FS, because by default awk splits fields on whitespace. You can access the last field with $NF, since NF holds the number of fields in the current line.

NB: The string concatenation that builds the file name must be wrapped in parentheses; otherwise awk may misparse the redirection and report a syntax error.
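To check this against the data from the question, here is a small end-to-end run. It recreates the sample input under the name `infile` (the name used in the question) and then makes a single pass over it.

```shell
# Recreate the sample input from the question.
cat > infile <<'EOF'
aa, bb, cc, dd, ee, 4
ff, gg, hh, ii, jj, 5
kk, ll, mm, nn, oo, 3
pp, qq, rr, ss, tt, 2
uu, vv, ww, xx, yy, 5
aa, bb, cc, dd, ee, 2
EOF

# One pass over the file: each line is written to t<last field>.txt.
# With the default field separator, $NF is the bare number (2, 3, 4 or 5).
awk '{ print $0 > ("t" $NF ".txt") }' infile

# Show one of the results.
cat t2.txt
```

After the run, t2.txt holds the two lines ending in 2, t3.txt and t4.txt hold one line each, and t5.txt holds the two lines ending in 5. Note that `>` inside awk truncates each output file only when it is first opened during the run, so later lines for the same number are appended rather than overwriting earlier ones.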


Source: https://habr.com/ru/post/1402490/

