I have shortened the problem here; my real data is much longer.
I have a file like:
aa, bb, cc, dd, ee, 4
ff, gg, hh, ii, jj, 5
kk, ll, mm, nn, oo, 3
pp, qq, rr, ss, tt, 2
uu, vv, ww, xx, yy, 5
aa, bb, cc, dd, ee, 2
Now I want to use awk to select the rows that share the same number in the last column and redirect them to a new file; which file a row goes to depends on that number. E.g. t2.txt, t3.txt, t4.txt and t5.txt will contain the lines whose last number is 2, 3, 4 and 5 respectively.
in t2.txt:
pp, qq, rr, ss, tt, 2
aa, bb, cc, dd, ee, 2
in t3.txt:
kk, ll, mm, nn, oo, 3
in t4.txt:
aa, bb, cc, dd, ee, 4
in t5.txt:
ff, gg, hh, ii, jj, 5
uu, vv, ww, xx, yy, 5
I think I need something like this:
BEGIN {FS=","}
{
    for (n=2; n<=5; n++)
        if ($6 ~/\$n/) {print > "t\$n.txt"}
}
But I just don't know how to make it work.
This bash script does what I want, but the problem is that every time it extracts the lines with a specific number, it has to read all the lines again. How can I read the file ONE TIME ONLY and extract the lines for all numbers?
#!/bin/bash
for num in {2..5}; do
    gawk --assign FS="," "\$6 ~/${num}/" infile >> t${num}.txt
done
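For reference, one possible single-pass approach is sketched below. It assumes the input is named infile (as in the loop above), that the fields are separated by a comma followed by optional spaces, and that the sixth field is a plain integer. The output file name is built from the sixth field itself, so no loop over the numbers is needed. This is only a sketch under those assumptions, not a tested solution for the real data.

```shell
# Recreate the sample input from the question ("infile" as in the bash loop).
cat > infile <<'EOF'
aa, bb, cc, dd, ee, 4
ff, gg, hh, ii, jj, 5
kk, ll, mm, nn, oo, 3
pp, qq, rr, ss, tt, 2
uu, vv, ww, xx, yy, 5
aa, bb, cc, dd, ee, 2
EOF

# Read the file once. The separator ", *" (a comma plus any spaces) makes
# $6 the bare number, e.g. "2" rather than " 2".
# Each line whose 6th field is between 2 and 5 goes to t<number>.txt.
# Inside awk, ">" truncates the file when it is first opened and then
# appends for the rest of the run, so existing t*.txt files are overwritten.
awk -F', *' '$6 >= 2 && $6 <= 5 { print > ("t" $6 ".txt") }' infile
```

The parentheses around the file name expression are there for portability, since some awk implementations parse an unparenthesized concatenation after ">" differently. With many distinct numbers, the limit on simultaneously open files could become an issue in some awk versions.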